n8n

How to Automate Google Gemini Object Detection?

This n8n workflow uses a text prompt to find objects in an image with Google Gemini and draws boxes around them. Teams use it to review photos faster, check product shots, and spot issues in visual content. It suits catalog teams, operations leads, and anyone who needs quick visual checks.

You start the workflow manually. It downloads an image, reads its width and height, and sends the file with a clear prompt to Google Gemini. Gemini returns bounding boxes as JSON, with coordinates normalized to a fixed scale (1000 in this template) rather than given in pixels. The workflow then scales those values to real pixels and draws colored boxes on the original picture. Because the prompt controls what is found, you can switch from rabbits to people, cars, or store displays without changing any code. This cuts manual clicking and reduces mistakes from eyeballing image details.
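The scaling step described above can be sketched in a few lines of Python. This is a minimal illustration, not the template's own Code node: the function name `scale_box` is hypothetical, and the `[ymin, xmin, ymax, xmax]` ordering is an assumption based on Gemini's documented `box_2d` convention, so check it against the index mapping in your copy of the workflow.

```python
from typing import Dict, List


def scale_box(box_2d: List[float], width: int, height: int,
              scale: float = 1000.0) -> Dict[str, int]:
    """Convert one normalized box to pixel coordinates.

    Assumes box_2d is ordered [ymin, xmin, ymax, xmax] (an assumption,
    see the note above) and that values run from 0 to `scale`.
    """
    ymin, xmin, ymax, xmax = box_2d
    return {
        # x values scale with the image width, y values with its height.
        "xmin": round(xmin * width / scale),
        "ymin": round(ymin * height / scale),
        "xmax": round(xmax * width / scale),
        "ymax": round(ymax * height / scale),
    }


# An 800x600 image with a box normalized to the 0-1000 range.
pixel_box = scale_box([100, 250, 500, 750], width=800, height=600)
```

If the model returns values between 0 and 1 instead, passing `scale=1` yields the same pixel box for the equivalent input.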

You will need a Google Gemini API key and an image link that is publicly reachable. Expect review time to drop from minutes to seconds, with consistent results across many photos. Common uses include ecommerce image quality checks, safety checks on parking or events, and fast tagging for media libraries. Edit the prompt to match your goal and reuse the same steps for many image types.

What are the key features?

  • Manual start for safe testing and quick demos
  • HTTP download of a source image from a public URL
  • Image info step reads width and height for accurate scaling
  • Google Gemini call returns bounding boxes based on your prompt
  • Set node extracts the returned box array from the response
  • Code node rescales coordinates from normalized values to pixels
  • Edit Image draws colored boxes on the original picture for easy review
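To make the box-extraction feature concrete, here is a hedged Python sketch of what the Set node does conceptually. It assumes the response follows the Gemini REST shape, where the generated text sits at `candidates[0].content.parts[0].text` and holds a JSON array. The function name `extract_boxes` and the sample response are illustrative only.

```python
import json
from typing import Any, Dict, List

FENCE = "`" * 3  # markdown code fence that models sometimes wrap JSON in


def extract_boxes(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pull the list of detected boxes out of a Gemini-style response."""
    text = response["candidates"][0]["content"]["parts"][0]["text"].strip()
    # Tolerate a fenced reply, with or without a "json" language tag.
    if text.startswith(FENCE):
        text = text.strip("`").strip()
        if text.startswith("json"):
            text = text[len("json"):]
    boxes = json.loads(text)
    if not isinstance(boxes, list):
        raise ValueError("expected a JSON array of boxes")
    return boxes


# Illustrative response, not real model output.
sample = {
    "candidates": [{
        "content": {"parts": [{
            "text": '[{"box_2d": [100, 250, 500, 750], "label": "rabbit"}]'
        }]}
    }]
}
boxes = extract_boxes(sample)
```

Raising on a non-array reply surfaces prompt problems early, instead of letting an empty drawing hide them.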

What are the benefits?

  • Reduce manual image review from 5 minutes to under 30 seconds per photo
  • Automate up to 80 percent of tagging work with a single prompt
  • Avoid misaligned boxes by converting normalized values to exact pixels
  • Unify image download, AI detection, and visual markup in one flow
  • Scale to many images by swapping the trigger for a schedule or list
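Misaligned boxes usually come from a wrong scale or swapped coordinate order, and a small check on the pixel values can catch both before drawing. The sketch below is an optional addition and not part of the template. The helper name `box_problems` and the 2-pixel threshold are assumptions chosen for illustration.

```python
from typing import Dict, List


def box_problems(box: Dict[str, float], width: int, height: int) -> List[str]:
    """Return human-readable problems found in one pixel-space box."""
    problems: List[str] = []
    if box["xmin"] >= box["xmax"]:
        problems.append("xmin is not left of xmax")
    if box["ymin"] >= box["ymax"]:
        problems.append("ymin is not above ymax")
    if box["xmin"] < 0 or box["xmax"] > width:
        problems.append("x range falls outside the image width")
    if box["ymin"] < 0 or box["ymax"] > height:
        problems.append("y range falls outside the image height")
    # A box that collapses to almost nothing often means values in the
    # 0-1 range were divided by 1000 (threshold is an arbitrary choice).
    tiny_x = 0 <= box["xmax"] - box["xmin"] < 2
    tiny_y = 0 <= box["ymax"] - box["ymin"] < 2
    if tiny_x and tiny_y:
        problems.append("box is under 2 px wide and tall; check the scale")
    return problems


issues = box_problems({"xmin": 100, "ymin": 50, "xmax": 300, "ymax": 700}, 800, 600)
```

An empty list means the box passed every check. A y range that exceeds the image height, as in the example, is a typical sign that x and y were swapped.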

How do you set it up?

  1. Import the template into n8n: Create a new workflow in n8n > Click the three dots menu > Select 'Import from File' > Choose the downloaded JSON file.
  2. You'll need a Google account with access to the Gemini API. See the Tools Required section below for details.
  3. In the n8n credentials manager, create a new credential for Google Gemini. Choose the Google Gemini(PaLM) Api credential type. Add your API key from Google's API keys page and save the credential with a clear name.
  4. Open the Gemini HTTP Request node and select your new credential in the Credential to connect with field. Confirm the method is POST and the endpoint matches the model path shown in the template.
  5. Open the Get Test Image node and replace the sample image URL with your own image link. Make sure the link is public and returns an image file.
  6. Run the Get Image Info node once to confirm width and height are present in the output. If values are missing, check that the image downloaded correctly.
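  7. Adjust the prompt inside the Gemini node to describe what you want to detect. Ask for a JSON response in which each detection has a box_2d field holding the four coordinates (xmin, ymin, xmax, ymax) so the parser can read the output. Gemini's documentation orders box_2d as [ymin, xmin, ymax, xmax], so make sure the order you request matches the indexes used in the Code node.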
  8. Click Execute to test. In the Gemini node output, confirm that the JSON text with coordinates is present. If not, refine the prompt for clearer instructions.
  9. Open the Code node and confirm the scale is set to 1000. If your model returns values from 0 to 1, change the scale to 1 to keep boxes accurate.
  10. Open the Draw Bounding Boxes node output and view the edited image. If boxes look off, verify the image width and height values and check that each operation references the correct coordinates.
  11. If you need to process many images, add a Schedule trigger or feed a list of image URLs and connect it to the download step.
  12. Troubleshoot common issues: API errors often mean the key is invalid or quota is reached, image download failures mean the URL is not public, and empty drawings often mean the coordinate indexes or scale are incorrect.
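For readers who want to see what the Gemini HTTP Request node in steps 4 and 7 sends, here is a hedged Python sketch that builds a comparable request body without calling the API. The field names follow the public Gemini REST format as best understood here (`contents`, `inline_data`, `generationConfig`, `responseSchema`). Treat them as assumptions and compare them with the body in your copy of the template. `build_request_body` and `BOX_SCHEMA` are hypothetical names.

```python
import base64
import json
from typing import Any, Dict

# Response schema asking for an array of {box_2d, label} objects.
BOX_SCHEMA: Dict[str, Any] = {
    "type": "ARRAY",
    "items": {
        "type": "OBJECT",
        "properties": {
            "box_2d": {"type": "ARRAY", "items": {"type": "NUMBER"}},
            "label": {"type": "STRING"},
        },
    },
}


def build_request_body(image_bytes: bytes, prompt: str,
                       mime_type: str = "image/jpeg") -> Dict[str, Any]:
    """Build a JSON-serializable body with the prompt and inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "contents": [{
            "parts": [
                {"text": prompt},
                {"inline_data": {"mime_type": mime_type, "data": encoded}},
            ]
        }],
        "generationConfig": {
            "responseMimeType": "application/json",
            "responseSchema": BOX_SCHEMA,
        },
    }


# Placeholder bytes stand in for a real downloaded image.
body = build_request_body(b"abc", "Return bounding boxes of all rabbits.")
payload = json.dumps(body)
```

Requesting structured JSON through a schema, rather than through prompt wording alone, tends to make the parsing step less fragile. The prompt then only needs to say what to detect.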

Tools Required

n8n Cloud costs $24 / mo, or $20 / mo when billed annually. The self-hosted n8n Community Edition is free.

Google Gemini

Free tier: $0 via the Gemini API. For example, Gemini 2.5 Flash-Lite allows 1,000 free requests per day (15 RPM, 250k TPM). Paid usage starts at $0.10 per 1M input tokens and $0.40 per 1M output tokens.
