AI Invoice Parser for n8n: Gmail PDF to Google Sheets

n8n template that watches Gmail for PDF invoices, extracts vendor, totals and line items with AI, flags anomalies in Slack and logs clean data to Sheets.

April 25, 2026 · n8n template

invoice-parser-n8n.json

Download JSON

Manually retyping invoices into a spreadsheet is the kind of work that should have died years ago. This n8n workflow watches your inbox for PDF invoices, extracts the structured data with an LLM, and only interrupts a human when something looks wrong.

What this workflow does

  • Gmail trigger polls for unread emails with PDF attachments matching invoice keywords, and downloads the attachment
  • Extract from File node converts the PDF to raw text
  • OpenAI node performs structured extraction: vendor, invoice_number, invoice_date, total, currency, and full line_items
  • The same prompt runs built-in anomaly checks: missing or zero total, line items that don’t sum to the total, future-dated invoices, or documents that aren’t invoices at all
  • IF node branches on the anomaly flag
  • Anomaly: Slack message to #finance-review with the reason and the source email
  • Clean: row appended to a Google Sheets “Invoices” tab, including the line items as JSON
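In this template the anomaly rules run inside the LLM prompt. For a deterministic cross-check, the same four rules can be sketched in Python. This is an illustration, not the template's implementation: the top-level field names follow the extraction schema above, while `find_anomalies`, the `is_invoice` flag, the `amount` key on each line item, and the one-cent tolerance are assumptions made for the example.

```python
from datetime import date
from decimal import Decimal, InvalidOperation
from typing import Optional


def find_anomalies(invoice: dict, today: Optional[date] = None) -> list:
    """Return a list of anomaly reasons; an empty list means the invoice is clean."""
    today = today or date.today()
    reasons = []

    # Assumed flag: the extraction step reports whether this is an invoice at all.
    if not invoice.get("is_invoice", True):
        return ["document is not an invoice"]

    # Rule: missing or zero total.
    try:
        total = Decimal(str(invoice.get("total")))
    except InvalidOperation:
        total = None
    if total is None or total == 0:
        reasons.append("missing or zero total")

    # Rule: line items must sum to the total (assumed tolerance: one cent).
    items = invoice.get("line_items") or []
    if total is not None and items:
        items_sum = sum(Decimal(str(item.get("amount", 0))) for item in items)
        if abs(items_sum - total) > Decimal("0.01"):
            reasons.append(f"line items sum to {items_sum}, total is {total}")

    # Rule: future-dated invoice. Unparseable dates are skipped in this sketch.
    try:
        invoice_date = date.fromisoformat(invoice.get("invoice_date") or "")
    except ValueError:
        invoice_date = None
    if invoice_date and invoice_date > today:
        reasons.append("invoice is dated in the future")

    return reasons


sample = {
    "vendor": "Acme",
    "total": 118.5,
    "invoice_date": "2027-01-01",
    "line_items": [{"amount": 100}, {"amount": 18.5}],
}
print(find_anomalies(sample, today=date(2026, 4, 25)))
# → ['invoice is dated in the future']
```

An empty list corresponds to the "Clean" branch of the IF node; anything else would go to Slack along with the reasons.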

The extraction prompt enforces ISO dates, numeric totals without currency symbols, and 3-letter currency codes, so the spreadsheet stays consistent without cleanup formulas.
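To make those format rules concrete, a post-extraction check could look like the sketch below. The template relies on the prompt rather than on code like this; `check_format` is a hypothetical helper, and the rules are limited to the three the paragraph above names.

```python
import re
from datetime import date

# Three uppercase letters, e.g. EUR or USD. This checks shape only,
# not whether the code is a real currency.
CURRENCY_RE = re.compile(r"[A-Z]{3}")


def check_format(invoice: dict) -> list:
    """Return a list of format problems; an empty list means all three rules hold."""
    problems = []

    # invoice_date must parse as an ISO calendar date such as 2026-04-01.
    try:
        date.fromisoformat(str(invoice.get("invoice_date")))
    except ValueError:
        problems.append("invoice_date is not an ISO date")

    # total must be a bare number, not a string like "€118.50".
    total = invoice.get("total")
    if isinstance(total, bool) or not isinstance(total, (int, float)):
        problems.append("total is not a bare number")

    # currency must be a 3-letter code.
    currency = invoice.get("currency")
    if not (isinstance(currency, str) and CURRENCY_RE.fullmatch(currency)):
        problems.append("currency is not a 3-letter code")

    return problems
```

A row that fails any of these checks is a reasonable candidate for the same Slack review path as the anomalies.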

Prerequisites

  • n8n instance (the Extract from File PDF operation is built in, no extra packages)
  • Gmail OAuth2 credentials for the mailbox that receives invoices
  • OpenAI API key (gpt-4o-mini is enough for text-based PDFs)
  • Slack bot token with access to your finance channel
  • Google Sheets credentials and a spreadsheet with an “Invoices” tab

One honest limitation: the PDF text extraction only works on digital (text-based) PDFs. Scanned, image-only invoices need an OCR step first, so swap the Extract node for an OCR API call if that’s your situation.
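One way to spot the scanned case is to look at how much text the extraction step returned and route near-empty results to OCR. The sketch below shows that heuristic; the function name and the 40-character threshold are assumptions to tune against your own invoices, not values taken from the template.

```python
def looks_scanned(extracted_text: str, min_chars: int = 40) -> bool:
    """Heuristic: an image-only PDF yields little or no extractable text."""
    # Count only non-whitespace characters so blank pages and stray
    # newlines do not pass for real content.
    visible = sum(1 for ch in extracted_text if not ch.isspace())
    return visible < min_chars
```

A digital invoice normally produces hundreds of characters, so a low threshold should rarely misfire, but verify that on a few real documents before trusting it.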

How to import

  1. Download the JSON from the box above.
  2. In n8n: Workflows → Import from File, pick the file.
  3. Replace the four credential placeholders (YOUR_GMAIL_CREDENTIAL, YOUR_OPENAI_CREDENTIAL, YOUR_SLACK_CREDENTIAL, YOUR_GOOGLE_SHEETS_CREDENTIAL).
  4. Set your spreadsheet ID, then activate. Forward yourself a real invoice to test the full path.
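Before activating, it can help to confirm that no placeholder from step 3 was missed. The sketch below scans workflow JSON text (for example, a fresh export of the edited workflow) for the four placeholder names. It assumes the placeholders appear as literal strings in the JSON; `remaining_placeholders` is a hypothetical helper, not part of n8n.

```python
import json

PLACEHOLDERS = (
    "YOUR_GMAIL_CREDENTIAL",
    "YOUR_OPENAI_CREDENTIAL",
    "YOUR_SLACK_CREDENTIAL",
    "YOUR_GOOGLE_SHEETS_CREDENTIAL",
)


def remaining_placeholders(workflow_json: str) -> list:
    """Return the placeholder names still present in the workflow JSON text."""
    # Parse first so a truncated or corrupted file fails loudly (ValueError).
    json.loads(workflow_json)
    return [name for name in PLACEHOLDERS if name in workflow_json]
```

Pass it the contents of the exported file; an empty list means all four credentials have been replaced.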

What to customize

  • The Gmail search query — the default matches has:attachment filename:pdf plus invoice keywords in three languages; tighten it to a specific sender or label if vendors are predictable
  • Anomaly rules in the system prompt — add your own, e.g. “flag any invoice over 5,000 EUR” or “flag unknown vendors”
  • Sheet columns: the mapping is explicit in the Google Sheets node, so adding a cost-center or approval column is straightforward
  • Slack channel and message format
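As an illustration of tightening the Gmail search query, the sketch below assembles a query from standard Gmail search operators (`has:attachment`, `filename:`, `from:`, `label:`, and `OR`). The helper name, the example keywords, and the example sender are assumptions; the template's actual keyword list is in its Gmail node.

```python
def build_gmail_query(keywords, sender=None, label=None):
    """Build a Gmail search string for PDF invoices."""
    parts = ["has:attachment", "filename:pdf"]
    # Match any one of the keywords.
    parts.append("(" + " OR ".join(keywords) + ")")
    # Optional narrowing for predictable vendors or pre-labelled mail.
    if sender:
        parts.append(f"from:{sender}")
    if label:
        parts.append(f"label:{label}")
    return " ".join(parts)


# Hypothetical keywords in three languages and a hypothetical sender.
print(build_gmail_query(["invoice", "rechnung", "factura"], sender="billing@example.com"))
# → has:attachment filename:pdf (invoice OR rechnung OR factura) from:billing@example.com
```

Paste the resulting string into the Gmail search box first to see which messages it would match before putting it in the trigger.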

For the broader pattern, including approval steps and accounting-tool handoff, read the invoice processing automation tutorial.

Cost per run

A typical one-page invoice produces 1,000 to 2,500 tokens of PDF text plus a few hundred tokens of output. With gpt-4o-mini, as of mid-2026, that’s well under a cent per invoice, so processing 500 invoices a month should cost a dollar or two at most. Long multi-page invoices cost proportionally more, so consider truncating terms-and-conditions pages if your vendors are verbose.
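To rerun that estimate with your own volumes, the arithmetic can be sketched as below. The function name is hypothetical, and the per-million-token prices in the example call are placeholder assumptions, not quoted rates; substitute the figures from the current OpenAI price list. The estimate also ignores the system prompt's own tokens.

```python
def monthly_cost_usd(invoices, input_tokens, output_tokens,
                     usd_per_m_input, usd_per_m_output):
    """Estimate monthly LLM spend from per-invoice token counts."""
    # Prices are quoted per million tokens, so scale back down to one invoice.
    per_invoice = (
        input_tokens * usd_per_m_input + output_tokens * usd_per_m_output
    ) / 1_000_000
    return invoices * per_invoice


# 500 invoices at the upper end of the token range above.
# 0.15 and 0.60 are assumed placeholder prices in USD per million tokens.
estimate = monthly_cost_usd(500, 2_500, 300, usd_per_m_input=0.15, usd_per_m_output=0.60)
print(f"${estimate:.2f} per month")
```

Because cost scales linearly with input tokens, doubling the page count roughly doubles the input part of the bill.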

Try it yourself


n8n handles the PDF extraction natively and doesn't bill per operation, so a high-volume invoice inbox costs you OpenAI tokens and nothing else.
