
How Do You Automate Bright Data Content Capture?

Collect website content on demand and send it to your team or system. An AI agent chooses the right scraper, pulls the page in markdown or HTML, and keeps a file copy for records. This is great for marketing research, SEO checks, and content operations.

You start the run with a manual trigger. The flow sets a URL and output format, keeps short-term memory, and uses Google Gemini to interpret the request text. The agent then calls Bright Data tools through MCP to scrape either markdown or HTML. Results are posted to a webhook endpoint in a field named response, and the same content is written to disk as a file. A second path shows a direct tool call without the agent, which helps with testing and learning.
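The two delivery paths can be sketched in a few lines of Python — a minimal illustration, not the workflow itself. The payload field name response comes from the template; the deliver function and the scrape.md / scrape.html file names are assumptions made for this example:

```python
import json
import pathlib

def deliver(content: str, fmt: str, out_dir: str = ".") -> dict:
    """Mimic the flow's two outputs: build the webhook payload (a JSON
    body with a single 'response' field, as the HTTP Request node sends)
    and keep a file copy of the same content for the audit trail."""
    payload = {"response": content}
    ext = "md" if fmt == "markdown" else "html"
    # Hypothetical file name; the Write File node uses whatever path you set.
    path = pathlib.Path(out_dir) / f"scrape.{ext}"
    path.write_text(content, encoding="utf-8")
    return payload

payload = deliver("# Example page", "markdown", out_dir="/tmp")
print(json.dumps(payload))  # {"response": "# Example page"}
```

Switching fmt between markdown and html mirrors the format choice made in the Set node.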

Setup needs a self-hosted n8n instance with the community MCP Client node, a Google Gemini API key, and a unique webhook URL. Expect faster collection, fewer copy-paste errors, and a clear audit trail of each pull. Use it to capture competitor pages, refresh content briefs, or archive landing pages for review. You can switch formats based on your next step, like sending markdown to writers or sending HTML to an internal parser.

What are the key features?

  • Manual trigger lets you run the capture on demand for testing or ad hoc pulls.
  • Set nodes define the target URL, the webhook URL, and the output format.
  • AI Agent powered by Google Gemini with short-term memory to interpret requests.
  • MCP client lists available Bright Data tools and executes markdown or HTML scraping.
  • HTTP Request nodes post the scraped text to your webhook endpoint in a response field.
  • Create Binary and Write File nodes save the content to disk for records.
  • Agent text prompt selects the right tool based on your chosen format.
  • Alternate direct tool path shows simple scraping without the agent for quick checks.

What are the benefits?

  • Reduce manual copy and paste from 20 minutes to 2 minutes per page
  • Automate up to 90% of scraping steps with one click
  • Improve accuracy by removing 95% of copy and paste errors
  • Keep a file record of every scrape for audit and reuse
  • Connect AI, scraper, and webhook in one flow to cut handoffs

How do you set it up?

  1. Import the template into n8n: Create a new workflow in n8n > Click the three dots menu > Select 'Import from File' > Choose the downloaded JSON file.
  2. You'll need accounts with Google Gemini, Bright Data MCP, and Webhook.site. See the Tools Required section below for links to create accounts with these services.
  3. Use self-hosted n8n and enable community nodes, because the MCP Client node is a community node. Make sure the Bright Data MCP service is running on a host that n8n can reach.
  4. In the n8n credentials manager, create a new Google Gemini credential. Get your API key from your Google AI account, paste it in, save, then select it in the Google Gemini Chat Model node.
  5. For MCP, double-click any MCP Client node, choose Create new credential for MCP Client STDIO, then follow the on-screen steps to point to your local MCP server. Name the credential clearly so you can reuse it across nodes.
  6. Open Webhook.site and copy your unique URL. Replace the URL in the 'Webhook for web scraper' node and set the same value in the 'Set the URL with the Webhook URL and data format' node.
  7. In the 'Set the URLs' node, change the url field to the page you want to capture. In the 'Set the URL with the Webhook URL and data format' node, set the format to markdown or html.
  8. Confirm the AI Agent has the Google Gemini model, the memory node, and both Bright Data tool nodes connected as tools so it can choose the best one.
  9. In the 'Write the scraped content to disk' node, set a folder path that your n8n host can write to. If running in Docker, map a volume and use that folder path.
  10. Click Test workflow to run it. Check Webhook.site for a new POST with a response field and verify the file is created in the target folder.
  11. If no content arrives, confirm that the 'MCP Client List all tools' node returns tools and that the MCP server is running. If the HTTP request fails, check the webhook URL. If the file step fails, fix folder permissions or change the path.
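Before running the whole flow, the webhook leg can be smoke-tested on its own. This sketch builds the same kind of POST the HTTP Request node issues — the URL placeholder is an assumption you replace with your own Webhook.site address, and the request is only constructed here, not sent:

```python
import json
import urllib.request

# Assumption: replace with your unique Webhook.site URL.
WEBHOOK_URL = "https://webhook.site/your-unique-id"

def build_post(content: str) -> urllib.request.Request:
    """Build the POST the flow issues: a JSON body with one 'response' field."""
    body = json.dumps({"response": content}).encode("utf-8")
    return urllib.request.Request(
        WEBHOOK_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_post("smoke test")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

If the sent request shows up on your Webhook.site page with a response field, the endpoint side is configured correctly and any remaining failure sits in the MCP or file steps.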

Tools Required

n8n

$24 / mo, or $20 / mo billed annually, to use n8n in the cloud. However, the self-hosted n8n Community Edition is free.

Bright Data MCP

Sign up

Free tier: $0 / mo, 5,000 requests / mo

Google Gemini

Sign up

Free tier: $0 via Gemini API; e.g., Gemini 2.5 Flash-Lite free limits 1,000 requests/day (15 RPM, 250k TPM). Paid from $0.10/1M input tokens and $0.40/1M output tokens.

Webhook.site

Sign up

Free tier: $0, public API available; free URLs expire after 7 days and accept up to 100 requests

Credits:
Source
