Turn plain chat requests into smart, private answers. A user types a message, and the system picks the best local model to respond. It is ideal for teams that want fast, accurate replies without sending data to the cloud.
Here is how it works. A chat trigger listens for each new message. A routing agent reviews the prompt and decides which local model fits the job: a text model for writing, a coder model for programming help, or a vision model for image input. That choice is passed to a dynamic model node, which loads the selected model, and a second agent then answers with it. Two memory nodes keep context across messages so the conversation stays on track. All processing happens through your local Ollama service.
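The routing step in the workflow is an LLM agent, but the decision it makes can be sketched with a simple rule-based classifier. This is a minimal illustration, not the workflow's actual logic; the model names (`llama3.2`, `qwen2.5-coder`, `llava`) are example choices, and the keyword list is a hypothetical heuristic.

```python
def route(prompt: str, has_image: bool = False) -> str:
    """Pick a local Ollama model name for an incoming chat message.

    A stand-in for the routing agent: image input goes to a vision
    model, code-flavored prompts to a coder model, everything else
    to a general text model.
    """
    if has_image:
        return "llava"  # vision model for image input
    code_markers = ("def ", "function", "```", "error:", "traceback", "compile")
    if any(marker in prompt.lower() for marker in code_markers):
        return "qwen2.5-coder"  # coder model for programming help
    return "llama3.2"  # general text model for writing


print(route("Write a welcome email"))                   # llama3.2
print(route("Why does this function raise an error?"))  # qwen2.5-coder
print(route("What is in this photo?", has_image=True))  # llava
```

In the real workflow, the routing agent's output (a model name string) feeds the dynamic model node, which loads that model before the answering agent runs.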
Setup is simple if you already run Ollama. Pull the models you plan to use and point the Ollama credentials in n8n at your local Ollama endpoint. Expect faster answers, fewer misrouted prompts, and stronger control over your data. Great for internal help desks, code review chats, and image analysis in security-conscious teams.
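The setup might look like the following, assuming a default local Ollama install. The model names are illustrative; pull whichever models your routing agent is configured to choose between.

```shell
# Pull one model per route (example choices, not requirements)
ollama pull llama3.2        # general text model
ollama pull qwen2.5-coder   # coder model
ollama pull llava           # vision model

# Ollama serves on localhost:11434 by default; point the n8n
# Ollama credentials at this base URL. This call lists installed
# models and confirms the service is reachable:
curl http://localhost:11434/api/tags
```

If n8n runs in Docker while Ollama runs on the host, use the host's address (for example `http://host.docker.internal:11434`) instead of `localhost` in the credentials.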