What tools does this template use?

This template uses n8n and integrates with googlesheets,openai,lmstudio.

How do I set up this template?

Import the template into n8n, configure the integration credentials, and activate the workflow. Detailed step-by-step instructions are available on the template page.

n8n

How to Benchmark Google Sheets LLM Response Quality?

Compare answers from several language models in one place. Send one chat message and see how each model writes, how fast it replies, and how easy it is to read. Useful for content, support, and product teams that need clear, simple text.

A new chat message starts the run. The flow pulls the list of loaded models from your local LM server, then runs the same prompt through each model. A system prompt can guide tone and reading level. Start and end times are recorded to measure latency. A code step scores readability, counts words and sentences, and calculates averages. All results can be saved to Google Sheets for side by side review.

Setup needs LM Studio running with your chosen models and a Google Sheet if you want logs. Update the base URL to your local server and tune temperature, top p, and presence penalty to match your test plan. Most teams can cut manual comparisons from an hour to a few minutes. Expect faster evaluations and fewer copy paste steps, so you can pick the best model for FAQs, training notes, release summaries, or help center text.

What are the key features?

Chat trigger starts a new run when a chat message is received
HTTP request retrieves active model IDs from the local LM server
Split out runs the same prompt for each model separately
Set node adds a system prompt to control tone and reading level
OpenAI compatible chat node points to your local base URL and runs each model
Chain node organizes the prompt flow and collects the model output
DateTime nodes record start and end times and compute total time spent
Code node calculates readability score, word and sentence counts, and averages
Google Sheets node logs prompt, timing, model, response, and all metrics
Optional path lets you skip the Google Sheets step and review results manually

What are the benefits?

Reduce manual comparison from 60 minutes to under 5 minutes per prompt
Streamline multi model testing by about 80% with a single run
Improve data accuracy by removing copy and paste errors
Track response time per model to choose faster options
Handle all loaded models in one pass without extra setup
Connect your local LM server and Google Sheets in one evaluation

How do you set it up?

Import the template into n8n: Create a new workflow in n8n > Click the three dots menu > Select 'Import from File' > Choose the downloaded JSON file.
You'll need accounts with Google Sheets, OpenAI and LM Studio. See the Tools Required section above for links to create accounts with these services.
Install and open LM Studio, load the models you want to test, and start the local server so the models are available at your base URL.
In the Get Models node, update the URL to your local server address, for example http://YOUR_LOCAL_IP:1234/v1/models. Run the node once to confirm you receive a list of model IDs.
Open the Run Model with Dynamic Inputs node and confirm the base URL matches your local server. Adjust temperature, top p, and presence penalty to fit your test plan.
Double click the Run Model with Dynamic Inputs node, then in the credential dropdown click Create new credential and follow the on screen instructions to integrate OpenAI.
Create a Google Sheet with headers: Prompt, Time Sent, Time Received, Total Time Spent, Model, Response, Readability Score, Average Word Length, Word Count, Sentence Count, Average Sentence Length.
Double click the Save Results to Google Sheets node, then in the credential dropdown click Create new credential and follow the on screen instructions to connect your Google account. Map each field to the matching column.
Open the n8n chat view for this workflow and send a short test prompt. Confirm that each loaded model returns a response and that timing and metrics appear in the execution data.
Check your Google Sheet for a new row. Verify that all columns are filled, including readability score, counts, and total time spent.
If you see odd or repeated behavior, clear the previous chat session, confirm the server IP and port are correct, and increase the timeout in the model and HTTP nodes if responses are slow.
Optional: If you prefer manual review, disable or remove the Google Sheets node. You can still compare outputs and metrics in the n8n execution log.

Tools Required

n8n

$24 / mo or $20 / mo billed annually to use n8n in the cloud. However, the local or self-hosted n8n Community Edition is free.

Google Sheets

Free: $0 (Google Sheets API usage has no additional cost; quota limits apply)

LM Studio

Free tier: $0 / mo (local OpenAI-compatible API; free for personal and work use)

OpenAI

Pay-as-you-go: GPT-5 at $1.25 per 1M input tokens and $10 per 1M output tokens

How to Benchmark Google Sheets LLM Response Quality?

How to Benchmark Google Sheets LLM Response Quality?

What are the key features?

What are the benefits?

How do you set it up?

Tools Required

n8n

Google Sheets

LM Studio

OpenAI

Similar Templates

Automate Telegram DeepSeek Support with Google Docs Memory

Automate Google Sheets Product Price Alerts

Automate Google Drive OCR Document Processing

Join Futurise to access 1,200+ automation templates