Turn images and PDFs into clear tags, captions, and summaries using Google Gemini. This workflow suits marketing teams that manage many assets and need fast, consistent results across different media types. Pick the method that fits your task, from quick single-image checks to full-control API calls.
The workflow starts on a manual run and branches into five paths. The first path sends a single image straight to an AI agent with binary passthrough, which is the fastest setup. The second path processes multiple images with custom prompts and loops through each item. The third path follows the standard n8n item model, converts files to base64, and calls Gemini directly. The fourth path fetches a PDF, converts it to base64, and asks Gemini for a summary. The fifth path does the same for a single image via a custom API call. On any path you can filter inputs, split data, and control the prompt per item.
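The base64 paths all come down to the same step: encode the file and place it next to the prompt in the request body. The following is a minimal sketch in Python of what such a body might look like, assuming the Gemini REST `generateContent` endpoint with an `inline_data` part. The model name in the URL, the function name `build_gemini_request`, and the placeholder bytes are illustrative assumptions and are not taken from the workflow itself.

```python
import base64
import json

# Assumption: the model name and API version below are illustrative.
# Use whichever Gemini model your API key has access to.
GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-2.0-flash:generateContent"
)


def build_gemini_request(file_bytes: bytes, mime_type: str, prompt: str) -> dict:
    """Build a generateContent body with one text part and one inline file part.

    Hypothetical helper for illustration; the workflow does this with n8n nodes.
    """
    # Gemini expects the raw file as a base64 string, not as binary data.
    encoded = base64.b64encode(file_bytes).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {"inline_data": {"mime_type": mime_type, "data": encoded}},
                ]
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a downloaded PDF.
    fake_pdf = b"%PDF-1.4 placeholder bytes"
    body = build_gemini_request(
        fake_pdf, "application/pdf", "Summarize this document in three sentences."
    )
    print(json.dumps(body, indent=2))
```

The same body shape works for an image by switching `mime_type` to, for example, `image/jpeg`. Sending it would be a POST to `GEMINI_URL` with the API key supplied as a credential, which the sketch deliberately leaves out.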
You need a Google Gemini API key, stored as a credential in n8n. Add your image URLs and prompts in the Set nodes, or point them to your PDFs. Run a branch, check the output text, and adjust your instructions until the tags match what you need. Expect faster media review, more consistent labels, and less manual copywriting work. The workflow is useful for product catalogs, social content planning, brand audits, and document summaries.
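When checking the output text from a direct API call, the generated text sits inside a nested structure. The sketch below shows one way to pull it out in Python, assuming the usual `candidates` / `content` / `parts` layout of a `generateContent` response. The function name `extract_text` and the sample response are hypothetical and hand-written for illustration; they are not real output from this workflow.

```python
def extract_text(response: dict) -> str:
    """Collect every text part from a generateContent-style response.

    Hypothetical helper; returns an empty string if no text is present.
    """
    texts = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            if "text" in part:
                texts.append(part["text"])
    return "\n".join(texts).strip()


if __name__ == "__main__":
    # Hand-written sample in the assumed response shape, not real model output.
    sample_response = {
        "candidates": [
            {
                "content": {
                    "role": "model",
                    "parts": [{"text": "Tags: red sneaker, studio shot, white background"}],
                }
            }
        ]
    }
    print(extract_text(sample_response))
```

Returning an empty string for a response without text makes it easy to filter out failed items before reviewing tags.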