n8n

How to Benchmark Google Sheets LLM Response Quality?

Compare answers from several language models in one place. Send one chat message and see how each model writes, how fast it replies, and how easy it is to read. Useful for content, support, and product teams that need clear, simple text.

A new chat message starts the run. The flow pulls the list of loaded models from your local LM server, then runs the same prompt through each model. A system prompt can guide tone and reading level. Start and end times are recorded to measure latency. A code step scores readability, counts words and sentences, and calculates averages. All results can be saved to Google Sheets for side by side review.

Setup needs LM Studio running with your chosen models and a Google Sheet if you want logs. Update the base URL to your local server and tune temperature, top p, and presence penalty to match your test plan. Most teams can cut manual comparisons from an hour to a few minutes. Expect faster evaluations and fewer copy paste steps, so you can pick the best model for FAQs, training notes, release summaries, or help center text.

What are the key features?

  • Chat trigger starts a new run when a chat message is received
  • HTTP request retrieves active model IDs from the local LM server
  • Split out runs the same prompt for each model separately
  • Set node adds a system prompt to control tone and reading level
  • OpenAI compatible chat node points to your local base URL and runs each model
  • Chain node organizes the prompt flow and collects the model output
  • DateTime nodes record start and end times and compute total time spent
  • Code node calculates readability score, word and sentence counts, and averages
  • Google Sheets node logs prompt, timing, model, response, and all metrics
  • Optional path lets you skip the Google Sheets step and review results manually

What are the benefits?

  • Reduce manual comparison from 60 minutes to under 5 minutes per prompt
  • Streamline multi model testing by about 80% with a single run
  • Improve data accuracy by removing copy and paste errors
  • Track response time per model to choose faster options
  • Handle all loaded models in one pass without extra setup
  • Connect your local LM server and Google Sheets in one evaluation

How do you set it up?

  1. Import the template into n8n: Create a new workflow in n8n > Click the three dots menu > Select 'Import from File' > Choose the downloaded JSON file.
  2. You'll need accounts with Google Sheets, OpenAI and LM Studio. See the Tools Required section above for links to create accounts with these services.
  3. Install and open LM Studio, load the models you want to test, and start the local server so the models are available at your base URL.
  4. In the Get Models node, update the URL to your local server address, for example http://YOUR_LOCAL_IP:1234/v1/models. Run the node once to confirm you receive a list of model IDs.
  5. Open the Run Model with Dynamic Inputs node and confirm the base URL matches your local server. Adjust temperature, top p, and presence penalty to fit your test plan.
  6. Double click the Run Model with Dynamic Inputs node, then in the credential dropdown click Create new credential and follow the on screen instructions to integrate OpenAI.
  7. Create a Google Sheet with headers: Prompt, Time Sent, Time Received, Total Time Spent, Model, Response, Readability Score, Average Word Length, Word Count, Sentence Count, Average Sentence Length.
  8. Double click the Save Results to Google Sheets node, then in the credential dropdown click Create new credential and follow the on screen instructions to connect your Google account. Map each field to the matching column.
  9. Open the n8n chat view for this workflow and send a short test prompt. Confirm that each loaded model returns a response and that timing and metrics appear in the execution data.
  10. Check your Google Sheet for a new row. Verify that all columns are filled, including readability score, counts, and total time spent.
  11. If you see odd or repeated behavior, clear the previous chat session, confirm the server IP and port are correct, and increase the timeout in the model and HTTP nodes if responses are slow.
  12. Optional: If you prefer manual review, disable or remove the Google Sheets node. You can still compare outputs and metrics in the n8n execution log.

Tools Required

$24 / mo or $20 / mo billed annually to use n8n in the cloud. However, the local or self-hosted n8n Community Edition is free.

Google Sheets

Sign up

Free: $0 (Google Sheets API usage has no additional cost; quota limits apply)

LM Studio

Sign up

Free tier: $0 / mo (local OpenAI-compatible API; free for personal and work use)

OpenAI

Sign up

Pay-as-you-go: GPT-5 at $1.25 per 1M input tokens and $10 per 1M output tokens

Similar Templates

Join Futurise to access 1,200+ automation templates

Get instant access to ready-made automation workflows for n8n, Make.com, AI agents, and more. Download, customise, and deploy in minutes.