Turn a public article site into a simple research chat. New content is scraped, stored in a vector database, and made ready for question answering with clear citations. Great for teams that need fast insights from long posts without reading everything.
The flow starts with a manual run that collects the article list, pulls the first few pages, strips HTML, and splits the text into large chunks. Embeddings are created with OpenAI and loaded into a Milvus collection for fast search. A chat trigger then listens for a question, builds a fresh embedding of the query, fetches the top matches from Milvus, and assembles a short context pack. A language model answers only from that context and returns source links as citations. The chunk limit and top-results settings control speed and cost.
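The retrieve-and-cite loop above can be sketched in a few lines of plain Python. This is a minimal in-memory stand-in, not the workflow itself: the toy two-dimensional vectors replace real OpenAI embeddings, the sorted list replaces a Milvus top-k search, and all names (`chunk_text`, `top_matches`, `build_context`, the example URLs) are hypothetical.

```python
import math

def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Split cleaned article text into chunks of at most max_chars.
    Stand-in for the workflow's chunking step; the chunk limit is
    what trades context quality against speed and cost."""
    chunks, current = [], ""
    for word in text.split():
        if current and len(current) + 1 + len(word) > max_chars:
            chunks.append(current)
            current = word
        else:
            current = f"{current} {word}" if current else word
    if current:
        chunks.append(current)
    return chunks

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_matches(query_vec: list[float], store: list[dict], k: int = 3) -> list[dict]:
    """Stand-in for the Milvus top-k similarity search."""
    ranked = sorted(store, key=lambda e: cosine(query_vec, e["vec"]), reverse=True)
    return ranked[:k]

def build_context(matches: list[dict]) -> str:
    """Assemble the short context pack the model answers from,
    keeping each chunk's source link for citations."""
    return "\n\n".join(f'{m["text"]}\n(Source: {m["url"]})' for m in matches)

# Toy vectors in place of real OpenAI embeddings.
store = [
    {"vec": [1.0, 0.0], "text": "Milvus stores the embeddings.", "url": "https://example.com/a"},
    {"vec": [0.0, 1.0], "text": "Chunking keeps context small.", "url": "https://example.com/b"},
]
matches = top_matches([0.9, 0.1], store, k=1)
context = build_context(matches)
```

In the real workflow the query embedding comes from the same OpenAI model used at load time, which is what makes the similarity scores meaningful.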
Setup requires a running Milvus server and an OpenAI API key. You can keep the default collection name or pick your own. Update the CSS selector if your source site's layout is different. Expect big time savings on research tasks, faster onboarding for new team members, and clear sources you can trust. Run the load step once, then ask questions anytime through the chat endpoint.
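The extraction step that the CSS selector controls can be sketched with Python's standard library alone. The class name `article-body` and the inline HTML are hypothetical placeholders for your source site's actual markup; the real workflow fetches live pages rather than parsing a string.

```python
from html.parser import HTMLParser

class ArticleTextExtractor(HTMLParser):
    """Collects text inside elements carrying a target class,
    mimicking the workflow's remove-HTML step. Adapt the class
    name the same way you would adapt the CSS selector."""
    def __init__(self, target_class: str):
        super().__init__()
        self.target_class = target_class
        self.capture_tag = None   # tag name that opened the matching element
        self.depth = 0            # nesting depth of that tag while capturing
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if self.capture_tag:
            if tag == self.capture_tag:
                self.depth += 1   # same tag nested inside the match
        else:
            classes = (dict(attrs).get("class") or "").split()
            if self.target_class in classes:
                self.capture_tag = tag
                self.depth = 1

    def handle_endtag(self, tag):
        if self.capture_tag and tag == self.capture_tag:
            self.depth -= 1
            if self.depth == 0:
                self.capture_tag = None  # left the matching element

    def handle_data(self, data):
        if self.capture_tag and data.strip():
            self.parts.append(data.strip())

# Hypothetical page fragment; swap "article-body" for your site's selector.
page = '<div class="article-body"><p>Milvus powers the search.</p></div><footer>ignore</footer>'
parser = ArticleTextExtractor("article-body")
parser.feed(page)
text = " ".join(parser.parts)
```

Everything outside the matching element (navigation, footers, ads) is dropped before chunking, which keeps the embeddings focused on article content.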