n8n

How to Automate Ollama Model Selection?

Turn plain chat requests into smart, private answers. A user types a message, and the system picks the best local model to respond. It is ideal for teams that want fast, accurate replies without sending data to the cloud.

Here is how it works. A chat trigger listens for each new message. A routing agent reviews the prompt and decides which local model fits the job, such as a text model for writing, a coder model for programming help, or a vision model for image input. The choice is passed to a dynamic model node that loads that exact model. A second agent then answers using that model. Two memory nodes keep context across messages so the conversation stays on track. All processing happens through your local Ollama service.
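To make the routing idea concrete, here is a minimal sketch of what "pick the best local model" means. In the workflow this decision is made by an LLM routing agent, not fixed rules, and the model names (`phi4`, `qwen2.5-coder`, `llava`) are just example choices:

```python
# Minimal sketch of prompt routing. The real workflow uses an LLM agent;
# this keyword heuristic only illustrates the idea. Model names are examples.
def route_prompt(prompt: str, has_image: bool = False) -> str:
    text = prompt.lower()
    if has_image:
        return "llava"            # vision model for image input
    code_hints = ("code", "function", "bug", "python", "javascript", "regex")
    if any(hint in text for hint in code_hints):
        return "qwen2.5-coder"    # coder model for programming help
    return "phi4"                 # general text model for everything else

print(route_prompt("Fix this Python function for me"))  # qwen2.5-coder
```

The router's output (a model name string) is exactly what the dynamic model node consumes in the next step.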

Setup is simple if you already run Ollama. Pull the models you plan to use and point the credentials in n8n to your Ollama endpoint. Expect faster answers, fewer wrong model choices, and stronger data control. Great for internal help desks, code review chats, and image analysis in secure teams.

What are the key features?

  • Chat trigger captures each user message and starts the flow.
  • Routing agent analyzes the prompt and picks the best local model.
  • Dynamic model loader uses the router choice to select the exact Ollama model.
  • Answer agent generates the reply using the chosen local model.
  • Two chat memories keep context for both routing and answering using a session key.
  • Local Ollama API handles all inference to keep data on your machines.
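The session-key mechanism behind the two memory nodes can be sketched as a store keyed by session identifier. n8n's memory nodes manage this internally; the code below only illustrates the shape, and the session IDs are made up:

```python
# Sketch of session-keyed chat memory: each session gets its own history,
# so context from one user never leaks into another's conversation.
from collections import defaultdict

memory: dict[str, list[dict]] = defaultdict(list)

def remember(session_id: str, role: str, content: str) -> None:
    memory[session_id].append({"role": role, "content": content})

def history(session_id: str) -> list[dict]:
    return memory[session_id]

remember("user-42", "user", "What is n8n?")
remember("user-42", "assistant", "A workflow automation tool.")
remember("user-99", "user", "Review my code.")
print(len(history("user-42")))  # 2 -- sessions stay separate
```

Because both the routing agent and the answer agent read from the same session key, the router can also use earlier turns when deciding which model fits a follow-up message.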

What are the benefits?

  • Reduce model selection time from minutes to seconds
  • Cut wrong model choices by up to 80% through prompt routing
  • Handle 3 times more chat sessions with the same team
  • Keep all data local to protect sensitive information
  • Carry context across messages to reduce follow-ups by 30%

How do you set it up?

  1. Import the template into n8n: Create a new workflow in n8n > Click the three-dots menu > Select 'Import from File' > Choose the downloaded JSON file.
  2. You'll need Ollama installed; it runs locally and needs no paid account. See the Tools Required section below for links to download it.
  3. Install and run Ollama on the same host that n8n can reach. Confirm the API is available at http://127.0.0.1:11434 or your chosen host and port.
  4. Open a terminal and pull your models. For example: ollama pull phi4. Pull any other models you plan to route, such as text, coder, or vision models.
  5. In n8n, double-click the Ollama nodes. In the 'Credential to connect with' dropdown, click 'Create new credential' and follow the on-screen instructions. Set the base URL to your Ollama API endpoint.
  6. Open the LLM Router node and review the system message. Add or remove model names so they match the models you have pulled in Ollama.
  7. Check the Dynamic LLM node expression. It should read the model name from the router output so the next agent loads the right model.
  8. Verify the two Memory nodes use a session key from the chat trigger. This keeps context per user or session.
  9. Activate the workflow. In the n8n chat interface, send a general question, a coding task, and an image analysis prompt to confirm the router picks different models.
  10. If a model is not found, pull it with the Ollama CLI and try again. If n8n cannot reach Ollama, check the host and port or expose the service to a reachable URL.
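Under the hood, each answer is an HTTP call to your local Ollama service. As a sanity check independent of n8n, you can build the request the workflow effectively sends to Ollama's POST /api/chat endpoint; the model name here (`phi4`) is just an example and would come from the router's choice:

```python
# Sketch of the payload for Ollama's POST /api/chat endpoint.
# "stream": False asks for one complete response instead of streamed chunks.
def build_chat_request(model: str, messages: list[dict]) -> dict:
    return {"model": model, "messages": messages, "stream": False}

payload = build_chat_request(
    "phi4",
    [{"role": "user", "content": "Explain what a webhook is."}],
)
print(payload["model"])
```

Posting this payload to http://127.0.0.1:11434/api/chat with any HTTP client is a quick way to confirm the endpoint and model are reachable before testing the full workflow.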

Tools Required

n8n

$24 / mo, or $20 / mo billed annually, to use n8n in the cloud. However, the local or self-hosted n8n Community Edition is free.

Ollama


Free tier: $0 (self-hosted local API)

Join Futurise to access 1,200+ automation templates

Get instant access to ready-made automation workflows for n8n, Make.com, AI agents, and more. Download, customise, and deploy in minutes.