AI Invoice Parser for n8n: Gmail PDF to Google Sheets
invoice-parser-n8n.json
Manually retyping invoices into a spreadsheet is the kind of work that should have died years ago. This n8n workflow watches your inbox for PDF invoices, extracts the structured data with an LLM, and only interrupts a human when something looks wrong.
What this workflow does
- Gmail trigger polls for unread emails that match invoice keywords and carry a PDF attachment, then downloads the attachment
- Extract from File node converts the PDF to raw text
- OpenAI node performs structured extraction: vendor, invoice_number, invoice_date, total, currency, and full line_items
- The same prompt runs built-in anomaly checks: missing or zero total, line items that don’t sum to the total, future-dated invoices, or documents that aren’t invoices at all
- IF node branches on the anomaly flag
  - Anomaly: Slack message to #finance-review with the reason and the source email
  - Clean: row appended to a Google Sheets “Invoices” tab, including the line items as JSON
The extraction prompt enforces ISO dates, numeric totals without currency symbols, and 3-letter currency codes, so the spreadsheet stays consistent without cleanup formulas.
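The workflow runs these checks inside the LLM prompt. As a rough illustration of what the rules amount to, here is a minimal Python sketch of the same checks applied to an already-extracted record. The `is_invoice` flag, the `amount` key on each line item, and the 0.01 rounding tolerance are assumptions made for this example; they are not taken from the template.

```python
from datetime import date
from decimal import Decimal, InvalidOperation

TOLERANCE = Decimal("0.01")  # assumed rounding allowance, not from the template


def to_decimal(value):
    """Parse a number into Decimal; return None if it is missing or not finite."""
    try:
        number = Decimal(str(value))
    except InvalidOperation:
        return None
    return number if number.is_finite() else None


def find_anomalies(invoice: dict, today: date) -> list[str]:
    """Return human-readable reasons; an empty list means the record is clean."""
    # "is_invoice" is a hypothetical field for the "not an invoice at all" case.
    if not invoice.get("is_invoice", True):
        return ["document is not an invoice"]

    reasons = []

    total = to_decimal(invoice.get("total"))
    if total is None or total == 0:
        reasons.append("missing or zero total")
    else:
        # Only compare line items against a usable total.
        items = invoice.get("line_items") or []
        amounts = [to_decimal(item.get("amount")) for item in items]
        if None in amounts:
            reasons.append("line item without a numeric amount")
        elif amounts:
            items_sum = sum(amounts, Decimal("0"))
            if abs(items_sum - total) > TOLERANCE:
                reasons.append(f"line items sum to {items_sum}, total is {total}")

    try:
        invoice_date = date.fromisoformat(str(invoice.get("invoice_date")))
    except ValueError:
        reasons.append("invoice_date is not an ISO date")
    else:
        if invoice_date > today:
            reasons.append("invoice is dated in the future")

    currency = str(invoice.get("currency") or "")
    if not (len(currency) == 3 and currency.isalpha() and currency.isupper()):
        reasons.append("currency is not a 3-letter code")

    return reasons
```

An empty result corresponds to the "Clean" branch; anything else would be the reason text sent to Slack. A deterministic check like this could also serve as a second opinion on the model's own anomaly flag.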
Prerequisites
- n8n instance (the Extract from File PDF operation is built in, no extra packages)
- Gmail OAuth2 credentials for the mailbox that receives invoices
- OpenAI API key (gpt-4o-mini is enough for text-based PDFs)
- Slack bot token with access to your finance channel
- Google Sheets credentials and a spreadsheet with an “Invoices” tab
One honest limitation: the PDF text extraction works on digital PDFs. Scanned image-only invoices need an OCR step first — swap the Extract node for an OCR API call if that’s your situation.
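One way to decide when the OCR path is needed is to check whether the extracted text is effectively empty. The sketch below is a heuristic only: the `needs_ocr` name and the 40-character threshold are assumptions for illustration, and the right cutoff depends on your invoices.

```python
def needs_ocr(extracted_text: str, min_chars: int = 40) -> bool:
    """Return True when the PDF text layer is empty or too thin to parse.

    min_chars is an assumed threshold; scanned PDFs usually yield no text
    at all, or only a few stray characters.
    """
    # Drop all whitespace (spaces, newlines, form feeds) before counting.
    visible = "".join(extracted_text.split())
    return len(visible) < min_chars
```

A check like this could sit between the extraction step and the OpenAI call, so image-only PDFs are routed to OCR instead of producing an empty extraction.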
How to import
- Download the JSON from the box above.
- In n8n: Workflows → Import from File, pick the file.
- Replace the four credential placeholders (YOUR_GMAIL_CREDENTIAL, YOUR_OPENAI_CREDENTIAL, YOUR_SLACK_CREDENTIAL, YOUR_GOOGLE_SHEETS_CREDENTIAL).
- Set your spreadsheet ID, then activate. Forward yourself a real invoice to test the full path.
What to customize
- The Gmail search query: the default matches has:attachment filename:pdf plus invoice keywords in three languages; tighten it to a specific sender or label if vendors are predictable
- Anomaly rules in the system prompt: add your own, e.g. “flag any invoice over 5,000 EUR” or “flag unknown vendors”
- Sheet columns: the mapping is explicit in the Google Sheets node, so adding a cost-center or approval column is easy
- Slack channel and message format
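To make the search-query customization concrete, here is a small Python sketch that assembles a Gmail search string from standard Gmail operators (`is:unread`, `has:attachment`, `filename:`, `from:`, `label:`, and braces for OR). The keyword list, the sender address, and the label name are hypothetical, and the template's actual default query may be worded differently.

```python
def build_gmail_query(keywords, sender=None, label=None):
    """Assemble a Gmail search string for unread emails with PDF attachments."""
    parts = ["is:unread", "has:attachment", "filename:pdf"]
    if keywords:
        # Quote multi-word keywords; braces make Gmail match any one of the terms.
        quoted = [f'"{k}"' if " " in k else k for k in keywords]
        parts.append("{" + " ".join(quoted) + "}")
    if sender:
        parts.append(f"from:{sender}")
    if label:
        parts.append(f"label:{label}")
    return " ".join(parts)


# Hypothetical keywords in three languages, narrowed to one sender.
print(build_gmail_query(["invoice", "rechnung", "factura"], sender="billing@example.com"))
# prints: is:unread has:attachment filename:pdf {invoice rechnung factura} from:billing@example.com
```

The resulting string can be pasted into the Gmail search box first to confirm it matches the right emails before it goes into the trigger.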
For the broader pattern, including approval steps and accounting-tool handoff, read the invoice processing automation tutorial.
Cost per run
A typical one-page invoice produces 1,000-2,500 tokens of PDF text plus a few hundred tokens of output. With gpt-4o-mini, as of mid-2026, that’s well under a cent per invoice — processing 500 invoices a month should cost on the order of a dollar or two. Long multi-page invoices cost proportionally more, so consider truncating terms-and-conditions pages if your vendors are verbose.
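The arithmetic behind an estimate like this is simple enough to rerun with your own numbers. In the sketch below, the rates are the per-million-token prices published when gpt-4o-mini launched (0.15 USD input, 0.60 USD output) and are used only as placeholders; check current pricing before relying on the result. The 3,000 input tokens assume 2,500 tokens of PDF text plus roughly 500 tokens of prompt, which is also an assumption.

```python
def monthly_cost_usd(invoices, input_tokens, output_tokens,
                     input_rate_per_million, output_rate_per_million):
    """Estimate monthly LLM spend from per-invoice token counts and token rates."""
    per_invoice = (
        input_tokens * input_rate_per_million
        + output_tokens * output_rate_per_million
    ) / 1_000_000
    return per_invoice * invoices


# Placeholder rates; replace with the currently published prices.
estimate = monthly_cost_usd(
    invoices=500,
    input_tokens=3_000,      # assumed: PDF text plus prompt
    output_tokens=300,       # "a few hundred tokens of output"
    input_rate_per_million=0.15,
    output_rate_per_million=0.60,
)
print(f"Estimated monthly cost: ${estimate:.2f}")
```

Because the cost scales linearly with input tokens, the same function shows how much a verbose multi-page invoice adds: raise `input_tokens` and compare.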
Try it yourself
n8n handles the PDF extraction natively and doesn't bill per operation, so a high-volume invoice inbox costs you OpenAI tokens and nothing else.