How do you automate bank statement data extraction from Google Drive?

Turn bank statements into clean text you can use. The flow pulls a PDF, turns each page into clear markdown, and then extracts only the deposit lines. It suits finance teams that need fast reconciliation without manual typing.

You start it from a manual button. A Google Drive step downloads the statement. An HTTP Request sends the PDF to a PDF service that returns images in a zip. The workflow unzips the file, lists and sorts the images by file name, and resizes them for faster AI processing. A Google Gemini vision model reads each page and writes markdown. An aggregate step joins all pages into one file. An information extractor then returns deposit rows as structured data.
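The unzip, list, and sort steps can be sketched in plain Python. This is a minimal illustration, not the workflow's own Code node: the function names are hypothetical, and it assumes the page images are PNG or JPEG files whose names carry a page number. It sorts numerically so that a page 10 does not land before page 2, which can happen with a plain text sort.

```python
import io
import re
import zipfile


def natural_key(name: str):
    """Split a name into text and integer chunks so 'page_10' sorts after 'page_2'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def list_page_images(zip_bytes: bytes, extensions=(".png", ".jpg", ".jpeg")):
    """Return (file_name, image_bytes) pairs from a zip archive, ordered by page."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Keep only image entries; the extension list is an assumption.
        names = [n for n in archive.namelist() if n.lower().endswith(extensions)]
        names.sort(key=natural_key)
        return [(n, archive.read(n)) for n in names]
```

If your PDF service pads page numbers with zeros (for example `page_001`), a plain sort by file name gives the same order.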

Setup is simple. You need access to a Drive file, an API key for the Gemini model, and a PDF-to-image conversion endpoint. Expect data entry time to drop from hours to minutes, especially for scanned statements where normal text extraction fails. Use it for month-end close, audit support, and to push deposit data into your ledger or ERP.

What are the key features?

  • Manual start for controlled runs and safe testing
  • Google Drive download by file ID to fetch the statement PDF
  • HTTP Request converts the PDF to page images using a PDF service
  • Zip extraction to get all page images in one step
  • Code node builds a clean list of image items for processing
  • Sort node orders pages by file name for correct sequence
  • Image resize reduces file size to speed up AI processing
  • Google Gemini vision model transcribes each page to markdown
  • Aggregate node combines all page markdown into one document
  • Information extractor pulls only deposit table rows from the transcript
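The last two features can be pictured with a short Python sketch. It is an illustration under stated assumptions, not the template's actual nodes: `combine_pages`, `DepositRow`, and `to_deposit_rows` are hypothetical names, and the `date`, `description`, and `amount` fields are an assumed shape for a deposit row. The extraction itself is done by the AI model in the workflow. The sketch only shows joining the pages and tidying the rows that come back.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, List


def combine_pages(pages: List[str]) -> str:
    """Join per-page markdown into one document, skipping empty pages."""
    return "\n\n".join(page.strip() for page in pages if page.strip())


@dataclass(frozen=True)
class DepositRow:
    date: str          # kept as text, since statement date formats vary
    description: str
    amount: Decimal


def to_deposit_rows(raw_rows: List[Dict[str, str]]) -> List[DepositRow]:
    """Check extractor output and normalise amounts such as '$1,250.00'."""
    rows = []
    for raw in raw_rows:
        cleaned = str(raw["amount"]).replace(",", "").replace("$", "")
        rows.append(DepositRow(str(raw["date"]), str(raw["description"]), Decimal(cleaned)))
    return rows
```

Using `Decimal` rather than `float` keeps amounts exact, which matters once the rows reach a ledger.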

What are the benefits?

  • Reduce manual review from 60 minutes to 5 minutes per statement
  • Automate up to 90 percent of deposit data entry
  • Improve accuracy on scanned statements by 30 percent compared to manual typing
  • Handle multi-page and scanned PDFs in one run
  • Connect Google Drive and an AI model in a single flow
  • Scale to hundreds of pages without changing the process

How do you set it up?

  1. Import the template into n8n: Create a new workflow in n8n > Click the three dots menu > Select 'Import from File' > Choose the downloaded JSON file.
  2. You'll need accounts with Google Drive, Google Gemini, and Stirling PDF. See the Tools Required section below for links to create accounts with these services.
  3. In the n8n editor, double-click the Get Bank Statement node. In the Credential to connect with dropdown, click Create new credential and follow the on-screen steps to connect your Google Drive account.
  4. In Get Bank Statement, set Operation to Download and replace the file ID with the ID of your bank statement file in Google Drive.
  5. Open the Split PDF into Images node. If you use a hosted PDF service, change the URL to its PDF-to-image conversion endpoint. Keep the method as POST and the content type as multipart/form-data.
  6. Run only the Split PDF into Images node. Confirm the response contains a binary zip file.
  7. Run Extract Zip File and check that image files are present in the output.
  8. Open Images To List and Sort Pages. Confirm that items are listed and sorted by file name so pages are in the right order.
  9. Open Resize Images For AI and set a percent that fits your needs. Start with 70 percent to balance speed and quality.
  10. Double-click the Google Gemini Chat Model and Google Gemini Chat Model1 nodes. In each, choose Create new credential, add your API key from Google AI Studio, and save. Keep the model set to gemini-1.5-pro-latest.
  11. Run the workflow from the Test button. Check Transcribe to Markdown for clear text with tables, then check Combine All Pages for a single markdown document.
  12. Open Extract All Deposit Table Rows and view the output. You should see a list of deposit rows ready for export or further steps.
  13. If pages are out of order, adjust the Sort Pages node. If timeouts occur, lower the image resize percent or process fewer pages at a time. If the model fails, confirm your API key, quotas, and network access.
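To test the conversion call outside n8n, the request from steps 5 and 6 can be sketched in Python with the standard library only. Treat the details as assumptions to check against your own PDF service: the upload field name `fileInput`, extra form fields such as `imageFormat`, and the example URL are based on a typical Stirling PDF setup and may differ on your instance. The helper names are hypothetical. The `resized` function mirrors step 9, assuming the resize percent scales both width and height.

```python
import io
import urllib.request
import uuid
import zipfile


def build_convert_request(url, pdf_bytes, file_name="statement.pdf", fields=None):
    """Build (but do not send) a multipart/form-data POST carrying the PDF."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields first; the names are assumptions to check against your service.
    for key, value in (fields or {}).items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{key}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    # The file part; "fileInput" is the assumed upload field name.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="fileInput"; filename="{file_name}"\r\n'
        f'Content-Type: application/pdf\r\n\r\n'.encode()
        + pdf_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        url,
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def is_zip(payload: bytes) -> bool:
    """Step 6 check: does the response body look like a zip archive?"""
    return zipfile.is_zipfile(io.BytesIO(payload))


def resized(width: int, height: int, percent: int = 70):
    """Step 9 arithmetic: scale both dimensions by a percent, never below 1 pixel."""
    return max(1, round(width * percent / 100)), max(1, round(height * percent / 100))
```

To send the request, pass it to `urllib.request.urlopen` with a timeout, read the body, and run `is_zip` on it before unzipping. A body that fails the check is usually an error message from the service, so print it to see what went wrong.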

Tools Required

n8n

Cloud: $24 / mo, or $20 / mo billed annually. The self-hosted n8n Community Edition is free.

Google Drive

Drive API: $0 (no additional cost; quota-limited)

Google Gemini

Free tier: $0 via the Gemini API; for example, Gemini 2.5 Flash-Lite allows 1,000 requests/day on the free tier (15 RPM, 250k TPM). Paid usage starts at $0.10 per 1M input tokens and $0.40 per 1M output tokens.

Stirling PDF

Free: $0 / mo (self-hosted), includes CLI + API access; up to 5 users
