Need a faster way to turn API docs into structured data? This workflow finds developer pages, reads them, extracts API operations, and builds custom schemas you can share with your team. It is useful for product, integration, and IT teams that track many vendors.
The run starts from a manual or event trigger, pulls a list from Google Sheets, and uses Apify to search for and scrape web pages. Results are cleaned and deduplicated, then split into readable chunks and stored in Qdrant as embeddings. Google Gemini checks whether the content is real API documentation, identifies products, and extracts the endpoints and methods. The extracted operations are saved back to Google Sheets. A code step then groups the operations into a clear JSON schema and uploads the file to Google Drive. Status nodes and wait steps manage the research, extraction, and generation phases, each with clear success and error paths.
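To make the dedupe, chunking, and grouping steps concrete, here is a minimal sketch in Python. It is not the workflow's actual code: the function names, the field names (`url`, `product`, `resource`, `method`, `endpoint`), and the chunk sizes are all assumptions chosen for illustration, and your sheet columns may differ.

```python
import json
from collections import defaultdict


def dedupe_pages(pages):
    """Keep the first page for each normalized URL (hypothetical 'url' field)."""
    seen = set()
    unique = []
    for page in pages:
        # Simplified normalization: ignore case and a trailing slash.
        key = page["url"].rstrip("/").lower()
        if key not in seen:
            seen.add(key)
            unique.append(page)
    return unique


def chunk_text(text, size=1000, overlap=100):
    """Split text into character windows of `size` that overlap by `overlap`."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    if not text:
        return []
    step = size - overlap
    # Stop early enough that the last window is not just a repeat of the tail.
    stop = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, stop, step)]


def group_operations(rows):
    """Group flat operation rows into product -> resource -> list of operations."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        grouped[row["product"]][row["resource"]].append(
            {"method": row["method"].upper(), "endpoint": row["endpoint"]}
        )
    # Convert nested defaultdicts to plain dicts so the result serializes cleanly.
    return {product: dict(resources) for product, resources in grouped.items()}


if __name__ == "__main__":
    rows = [
        {"product": "Acme", "resource": "users", "method": "get", "endpoint": "/users"},
        {"product": "Acme", "resource": "users", "method": "post", "endpoint": "/users"},
    ]
    print(json.dumps(group_operations(rows), indent=2))
```

In the real workflow the grouping would run on the rows read back from Google Sheets, and the resulting JSON would be the file uploaded to Google Drive. Grouping by product first keeps one schema per vendor, which is a design choice made here for readability rather than something the workflow requires.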
Setup requires accounts for Apify, Google Sheets and Drive, Qdrant, and Google Gemini. Add the credentials in n8n, point the nodes to your sheet and Drive folder, and confirm that your Qdrant collection exists. The workflow replaces manual reading of vendor docs with consistent outputs and a reusable API knowledge base. Great for vendor reviews, partner onboarding, and building internal API catalogs at scale.