Give your users fast answers from technical docs without manual lookup. This build turns long API documents into quick chat answers. It fits teams that support developers, product partners, or internal engineers.
The flow has two parts. First, a manual run pulls a public JSON spec file with an HTTP request, splits the text into small chunks, creates embeddings with OpenAI, and writes them to a Pinecone index. Second, a chat trigger listens for a user message. The AI Agent uses an OpenAI chat model, a vector store tool, and short-term memory. It converts the question into an embedding, searches Pinecone for the best-matching chunks, and writes a clear reply. Two chat models separate planning from response generation. The result is a focused RAG chat that stays grounded in the source text.
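The ingestion half can be sketched in a few lines. This is a minimal illustration, not the workflow's actual node logic: the spec text is assumed to be already fetched, and the chunk size and overlap values are placeholders you would tune. In the real build, each chunk would then be sent to the OpenAI embeddings API and upserted into the Pinecone index.

```python
# Sketch of the "split into small chunks" step, assuming the JSON spec has
# already been fetched and flattened to plain text. Size/overlap are
# illustrative defaults, not values from the workflow.

def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks so context survives the cut points."""
    chunks = []
    step = size - overlap  # advance less than a full chunk to create overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + size]
        if chunk:
            chunks.append(chunk)
        if start + size >= len(text):
            break  # last window already covers the end of the text
    return chunks

# Each resulting chunk would then be embedded (OpenAI) and written to
# Pinecone with an id and the chunk text as metadata -- omitted here
# because it requires live API credentials.
```

The overlap matters: without it, a sentence split across a chunk boundary loses half its meaning in both chunks, which hurts retrieval quality.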
You will need OpenAI and Pinecone accounts, plus a Pinecone index named n8n demo (or any name you choose). Expect lower support load, faster replies, and fewer escalations to engineers. This setup works well for API portals, internal enablement, and onboarding labs where accurate, sourced answers matter.
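The retrieval half works the same way at any scale: embed the question, rank stored chunks by similarity, and hand the top matches to the chat model. The toy sketch below substitutes a bag-of-words vector for the real OpenAI embedding and an in-memory list for Pinecone, so the ranking idea runs without any API keys; the function and variable names are illustrative, not from the workflow.

```python
# Toy retrieval step: rank chunks by cosine similarity to the question.
# A word-count vector stands in for a real embedding; Pinecone would do
# this ranking server-side over the stored vectors.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Stand-in embedding: a sparse bag-of-words vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_chunks(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

Whatever `top_chunks` returns is what the agent pastes into the model's context, which is why answers stay grounded in the source text rather than in the model's general knowledge.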