A small team’s support queue has a brutal shape: roughly 60–70% of tickets are questions your FAQ already answers (password resets, shipping times, plan limits), and the remaining 30–40% is where the actual customer relationships are won or lost. Humans burn their day on the first pile and answer the second pile tired.

This tutorial builds a triage bot in n8n that answers the repetitive pile from your knowledge base, classifies every ticket’s urgency, and escalates anything complex, angry, or money-related to a human in Slack with full context already attached. The hard part isn’t the plumbing — it’s the guardrails, and we’ll spend real time there: a confidence threshold the bot must clear before it’s allowed to reply, a hard rule that it never invents refund promises, and an audit trail for every automated response.

This is an advanced build. If you haven’t done an LLM-routing workflow before, run through the lead capture tutorial first — this uses the same skeleton with much higher stakes, because here the LLM’s output goes to a customer.

Architecture

Webhook (helpdesk/email trigger)
  → OpenAI #1: classify (category, urgency, sentiment)
  → IF: auto-answerable category AND not angry AND no money involved?
      true  → Build KB context (static FAQ or vector store lookup)
              → OpenAI #2: draft answer + confidence, FAQ-grounded
              → IF: confidence ≥ 0.8 AND passed guardrail checks?
                  true  → Send reply + log + tag "ai-answered"
                  false → Escalate
      false → Escalate
Escalate = Slack #support-escalations with classification,
           draft answer (if any), and full ticket context

Two LLM calls, deliberately separated: a classifier that decides whether the bot may answer, and an answerer that drafts the reply. Collapsing them into one prompt is tempting and wrong — the classifier must be able to veto independently, and you’ll tune the two prompts on different schedules.
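To make the independent veto concrete, here is a minimal sketch of the routing decision as a plain function. The name `route` and the argument shapes are illustrative assumptions; in the workflow this logic is spread across IF nodes rather than living in one place.

```javascript
// Sketch: either stage can veto on its own; only a clean pass through both replies.
// `classification` is the classifier's parsed JSON, `draft` the answerer's.
function route(classification, draft, threshold = 0.8) {
  // Classifier veto: anything other than a literal true escalates.
  if (!classification || classification.auto_answer_eligible !== true) return 'escalate';
  // Answerer veto: a missing draft or a refusal escalates.
  if (!draft || draft.refused !== false) return 'escalate';
  // Deterministic guardrail veto.
  if (draft.guardrail_tripped) return 'escalate';
  const confidence = Number(draft.confidence);
  return Number.isFinite(confidence) && confidence >= threshold ? 'reply' : 'escalate';
}
```

Every check defaults to 'escalate', so a missing field or malformed value can never produce a reply by accident.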

You’ll need: n8n, an OpenAI API key, your helpdesk (we’ll use Freshdesk’s webhook; Zendesk, Help Scout, and a plain Gmail inbox all work the same way), Slack, and your FAQ content.

The wired-up workflow, including all prompts and guardrail nodes, is available as a free n8n template (AI Support Triage Bot, support-triage-bot-n8n.json).

Step 1: Ticket intake

Add a Webhook node: method POST, path support-intake-x91k, Respond set to Immediately. Helpdesk webhooks typically time out in 5–10 seconds, which is less than two LLM calls can take.
In Freshdesk: Admin → Automations → Ticket Creation rule → “Trigger webhook,” POST to your production URL with ticket ID, subject, description, requester email. For a Gmail-based queue, swap in a Gmail Trigger filtered to your support address.
Add a Set node normalizing whatever arrives into four fields: ticket_id, customer_email, subject, body. Every downstream node references only these, so changing helpdesks later is a one-node edit.
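If you would rather normalize in a Code node than a Set node, the mapping might look roughly like this. The incoming field names (`id`, `requester_email`, `from`, `description`, `text`) are assumptions; they depend entirely on what you put in your helpdesk's webhook body.

```javascript
// Sketch: map a helpdesk payload onto the four normalized fields.
// Source field names are hypothetical; match them to your own webhook body.
function normalizeTicket(payload) {
  const p = payload || {};
  return {
    ticket_id: String(p.ticket_id ?? p.id ?? ''),
    customer_email: String(p.requester_email ?? p.from ?? '').trim().toLowerCase(),
    subject: String(p.subject ?? '').trim(),
    body: String(p.description ?? p.text ?? '').trim(),
  };
}
```

Coercing everything to trimmed strings here means downstream prompts and lookups never see a numeric ID or stray whitespace.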

Step 2: The classifier

OpenAI node, temperature 0, JSON mode on. A small fast model is fine here — classification is the easy half. System message:

You are a support ticket classifier for [YOUR COMPANY], which sells
[YOUR PRODUCT]. Return ONLY JSON:

{
  "category": "faq | account_access | billing | bug_report | feature_request | refund_or_cancellation | legal_or_complaint | other",
  "urgency": "low | medium | high | critical",
  "sentiment": "calm | frustrated | angry",
  "summary": "one sentence",
  "auto_answer_eligible": boolean,
  "eligibility_reason": "one short sentence"
}

auto_answer_eligible is true ONLY if ALL of these hold:
- category is "faq" or "account_access"
- sentiment is "calm" or "frustrated" (never "angry")
- the ticket does not mention refunds, chargebacks, cancellation,
  legal action, data deletion, or a specific amount of money
- the ticket asks a question rather than reporting an outage or bug
- urgency is "low" or "medium"

When in doubt on any condition, set auto_answer_eligible to false. A
human reading an easy ticket costs little; a bot mishandling a hard
one costs a customer. Reserve "critical" urgency for an outage, a
security concern, or a threat to cancel.

User message:

Subject: {{ $json.subject }}
From: {{ $json.customer_email }}
Body:
{{ $json.body }}

Follow with the standard parse-guard Code node (strip code fences, try/catch JSON.parse, and on failure emit auto_answer_eligible: false — a parse failure must fail toward escalation, never toward auto-reply). Then an IF node on {{ $json.auto_answer_eligible }}.
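A minimal sketch of that parse guard, written as a plain function so it can be tested outside n8n. The function name and the extra `parse_failed` flag are assumptions added for illustration; in a Code node you would call it on the model's text output and return the result as the item's `json`.

```javascript
// Sketch of a parse guard: every failure path yields auto_answer_eligible: false.
const FENCE = '`'.repeat(3); // the Markdown code-fence marker

function parseClassifierOutput(raw) {
  const failClosed = (why) => ({
    auto_answer_eligible: false,
    eligibility_reason: why,
    parse_failed: true, // hypothetical flag, handy when auditing the log
  });
  if (typeof raw !== 'string') return failClosed('Classifier returned no text');

  // Strip a leading fence (with optional language tag) and a trailing fence.
  const cleaned = raw.trim()
    .replace(new RegExp('^' + FENCE + '[a-zA-Z]*\\s*'), '')
    .replace(new RegExp('\\s*' + FENCE + '$'), '');

  let parsed;
  try {
    parsed = JSON.parse(cleaned);
  } catch (err) {
    return failClosed('Classifier output was not valid JSON');
  }
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    return failClosed('Classifier output was not a JSON object');
  }
  // Only a literal boolean true counts; "true", 1, or a missing field do not.
  return {
    ...parsed,
    auto_answer_eligible: parsed.auto_answer_eligible === true,
    parse_failed: false,
  };
}
```

The strict `=== true` check matters: a model that returns the string "true" is off-spec, and off-spec output should escalate.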

Step 3: Ground the answer in your knowledge base

Two options depending on FAQ size:

Under ~6,000 words of FAQ: skip retrieval entirely. Paste the whole FAQ into the answerer’s system prompt. Context windows are huge, the per-call cost difference is cents (as of mid-2026, check current pricing), and you eliminate an entire class of retrieval failures. Most small businesses should start here and feel no shame.

Larger KB: use n8n’s vector store nodes. Build a one-time indexing workflow — Notion/HTTP → Default Data Loader → Recursive Character Text Splitter (chunk 800, overlap 100) → Embeddings OpenAI → Supabase/Pinecone Vector Store: Insert. In the main flow, add a Vector Store: Get Many (top 4 results) querying on {{ $json.summary + ' ' + $json.body }}, then concatenate the chunks into a kb_context field with a Set node.
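If the Set node's concatenation gets unwieldy, a Code node can do the same job. This is a rough sketch that assumes each retrieved chunk has already been mapped to a `{ heading, text }` object; the real output shape of your vector store node may differ, and the 12,000-character budget is an arbitrary placeholder.

```javascript
// Sketch: join retrieved chunks into one kb_context string.
// Assumes chunks shaped like { heading, text }; adapt to your vector store's output.
function buildKbContext(chunks, maxChars = 12000) {
  const seen = new Set();
  const parts = [];
  let used = 0;
  for (const c of chunks) {
    const text = String(c.text || '').trim();
    if (!text || seen.has(text)) continue; // skip empties and exact duplicates
    const block = `## ${c.heading || 'Untitled section'}\n${text}`;
    if (used + block.length > maxChars) break; // stay inside a rough size budget
    seen.add(text);
    parts.push(block);
    used += block.length;
  }
  return parts.join('\n\n');
}
```

Keeping a heading line on each chunk gives the answerer something concrete to cite in kb_sections_used.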

Step 4: The answerer — where the guardrails live

OpenAI node, temperature 0.2, JSON mode, your best available mid-tier model. This prompt is the heart of the build; copy it whole:

You are a support agent for [YOUR COMPANY]. Draft a reply to the
customer using ONLY the knowledge base below. Return ONLY JSON:

{
  "answer": "the reply, plain text, friendly and concise",
  "confidence": number 0-1,
  "kb_sections_used": ["headings of the KB sections you relied on"],
  "refused": boolean,
  "refusal_reason": "string or null"
}

HARD RULES. If a reply would break any of these, set refused=true
instead of answering:
1. If the knowledge base does not directly answer the question, do
   NOT answer from general knowledge. Refuse.
2. NEVER promise, imply, or estimate a refund, credit, discount, or
   compensation of any kind. Even if the KB mentions the refund
   policy, do not apply it to this customer's case — refuse and let
   a human decide.
3. NEVER commit to dates, SLAs, or roadmap items not stated verbatim
   in the KB.
4. NEVER ask the customer for passwords, card numbers, or 2FA codes.
5. If the customer claims something that contradicts the KB, do not
   argue; refuse.

Confidence rubric:
- 0.9+: the KB answers this exact question explicitly
- 0.7-0.89: the KB covers it but you had to combine sections
- below 0.7: any inference or gap-filling was needed

Style: 2-5 sentences, no corporate filler, sign off as
"[COMPANY] Support". Do not mention that you are an AI unless
company policy requires disclosure (if so, one short line at the end).

KNOWLEDGE BASE:
{{ $json.kb_context }}

User message: the same subject/body block from Step 2.

The deterministic guardrail layer

Prompt rules stop most failures; code stops the rest. Never let “the prompt says don’t” be your only defense on money-related language. Add a Code node:

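// Guardrail gate. Input: the parsed answerer output from the previous node.
const r = $input.first().json;
const answer = String(r.answer || '');

// Money language the bot must never send, whatever the prompt allowed.
// The /i flag makes each match case-insensitive.
const forbidden = [
  /refund/i, /credit (your|the) account/i, /discount/i, /compensat/i,
  /we (will|can) waive/i, /free month/i, /reimburse/i, /chargeback/i,
  /\$\d/, /€\d/, /£\d/
];
const tripwire = forbidden.some(p => p.test(answer));

// A missing or non-numeric confidence counts as 0, so it escalates.
const confidence = Number.isFinite(Number(r.confidence)) ? Number(r.confidence) : 0;
const approved = !r.refused && !tripwire && confidence >= 0.8;

// Carry the ticket fields and the classification forward so the Slack
// message can still read category, sentiment, and summary.
// 'Classify Parse' is an assumed name for the Step 2 parse-guard node;
// rename it to match your workflow.
return [{ json: { ...$('Normalize').first().json,
  ...$('Classify Parse').first().json, ...r,
  guardrail_tripped: tripwire, approved,
  escalation_reason: r.refused ? (r.refusal_reason || 'Answerer refused')
    : tripwire ? 'Answer contained forbidden money language'
    : confidence < 0.8 ? `Confidence ${confidence} below 0.8`
    : null } }];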

The regex list looks paranoid. It is. In production, the classifier occasionally lets a billing-adjacent ticket through (“how do I update my card?” is legitimately FAQ territory), and the answerer occasionally gets helpful about it. The tripwire is what stands between “occasionally” and “a customer screenshot of your bot promising a refund.”

IF node on {{ $json.approved }}.

Step 5: Reply, log, escalate

Approved branch:

Freshdesk node (or Gmail Reply): add a reply to {{ $json.ticket_id }} with {{ $json.answer }}. Keep the ticket open, not resolved — let the customer’s silence close it. Auto-resolving is how you teach customers the bot is a wall.
Tag the ticket ai-answered and append a private note with confidence and kb_sections_used.
Google Sheets append: timestamp, ticket ID, category, confidence, answer. This audit log is non-negotiable — it’s your tuning dataset and your accountability record.

Escalation branch (and the false paths from Steps 2 and 4):

Slack node → #support-escalations:

🎫 {{ $json.urgency === 'critical' ? '🚨 CRITICAL — ' : '' }}{{ $json.category }} | {{ $json.sentiment }}
{{ $json.summary }}
Why escalated: {{ $json.escalation_reason ?? $json.eligibility_reason }}
{{ $json.answer ? 'Draft answer (NOT sent, confidence ' + $json.confidence + '): ' + $json.answer : 'No draft.' }}
Ticket: https://yourco.freshdesk.com/a/tickets/{{ $json.ticket_id }}

Including the unsent draft is the sleeper feature: agents report that editing a 0.7-confidence draft is faster than writing from scratch, so even “failed” automation pays.

The same pattern in Zapier

A trimmed version works in Zapier: helpdesk trigger → OpenAI classify → Filter on eligibility → OpenAI answer (FAQ pasted into the prompt) → Code by Zapier for the tripwire regex and confidence gate → Paths for reply vs. Slack escalation. It’s workable for low volume, with two honest caveats: Paths nesting limits make the multi-layer gating clumsy, and at 5–6 tasks per ticket the cost per ticket adds up fast on real queues (as of mid-2026, check current pricing). Use Zapier when ticket volume is low and your FAQ is small; move to n8n when either grows.

Make sits between: routers model the branching well and per-operation pricing is friendlier than Zapier’s, but the guardrail code lands in awkward formula chains. For a support bot specifically, n8n’s Code nodes win. Full platform comparison: Make vs Zapier vs n8n.

Common errors and fixes

Helpdesk webhook retries create duplicate replies. Your workflow took longer than the webhook timeout, the helpdesk retried, and two executions answered one ticket. Respond Immediately (Step 1), and add an idempotency check: look up the ticket ID in your Sheets log at the top and exit if it’s already there.
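The idempotency check itself is small. In this sketch `loggedRows` stands in for the rows read back from the audit sheet, and the `ticket_id` column name is an assumption.

```javascript
// Sketch: has this ticket already been handled?
// `loggedRows` is whatever the Sheets lookup returned; column name is assumed.
function alreadyHandled(ticketId, loggedRows) {
  const id = String(ticketId).trim();
  return loggedRows.some((row) => String(row.ticket_id ?? '').trim() === id);
}
```

One caveat worth keeping in mind: if the log row is only written after the reply is sent, a retry that arrives mid-execution will not find it. Writing a "received" row at the very start of the workflow narrows that window.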

Classifier marks angry tickets “calm”. Sarcasm. Add two or three real examples from your queue to the classifier prompt as few-shot cases — sentiment accuracy jumps noticeably with even minimal examples from your actual customers.

Answerer cites KB sections that don’t exist. Classic hallucinated grounding. Validate kb_sections_used in the guardrail Code node against the actual headings you injected; unknown headings should flip approved to false.
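A sketch of that validation, assuming you keep a list of the headings you actually injected into the prompt (the `injectedHeadings` argument is hypothetical). Treating an empty citation list as a failure is a design choice here, not something the prompt requires.

```javascript
// Sketch: confirm every cited section matches a heading that was really in the prompt.
const normalizeHeading = (h) => String(h).trim().toLowerCase().replace(/\s+/g, ' ');

function citedSectionsAreReal(kbSectionsUsed, injectedHeadings) {
  // No citations at all is treated as ungrounded, so it escalates.
  if (!Array.isArray(kbSectionsUsed) || kbSectionsUsed.length === 0) return false;
  const known = new Set(injectedHeadings.map(normalizeHeading));
  return kbSectionsUsed.every((h) => known.has(normalizeHeading(h)));
}
```

In the guardrail node, a false result would be one more condition that flips approved to false, with its own escalation_reason.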

Vector store returns irrelevant chunks. Usually chunking, not embeddings: chunks split mid-topic. Re-index with splits on headings rather than fixed character counts, and raise top-k from 4 to 6.
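Heading-based splitting can be sketched in a few lines if your FAQ is stored as Markdown-style text. That storage format is an assumption; a Notion or HTML source would need its own heading detection.

```javascript
// Sketch: split a Markdown-style FAQ into one chunk per heading.
// Assumes headings are lines starting with one to six '#' characters.
function splitOnHeadings(markdown) {
  const chunks = [];
  let current = null;
  for (const line of String(markdown).split(/\r?\n/)) {
    const m = /^#{1,6}\s+(.*)$/.exec(line);
    if (m) {
      if (current) chunks.push(current);
      current = { heading: m[1].trim(), lines: [] };
    } else if (current) {
      current.lines.push(line);
    } else if (line.trim()) {
      // Text before the first heading gets its own untitled chunk.
      current = { heading: '', lines: [line] };
    }
  }
  if (current) chunks.push(current);
  return chunks
    .map((c) => ({ heading: c.heading, text: c.lines.join('\n').trim() }))
    .filter((c) => c.text.length > 0); // drop headings with no body
}
```

Very long sections may still need a secondary character-based split, but each piece then stays within a single topic.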

OpenAI 429 during a ticket storm. An outage on your end creates a burst of inbound tickets exactly when rate limits bite. Enable queue mode (self-hosted) or Loop Over Items batching, and turn on Retry on Fail with a 10s backoff. Accept that during incidents most tickets will escalate, which is correct behavior anyway.

Malformed JSON from either LLM call. Both parse guards must default toward escalation. Audit this explicitly: search your log for parse failures and confirm each one escalated rather than replied.

Slack escalations rate-limited in an incident. chat.postMessage allows roughly one message per second per channel. Batch escalations into a digest when more than 10 arrive in a minute.
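One way the digest switch might look, assuming something upstream has already collected the escalations from the last minute into an array. The function name and message formats are illustrative.

```javascript
// Sketch: post individually at or under the limit, otherwise collapse into one digest.
// `escalations` is assumed to hold the last minute's escalated tickets.
function planSlackMessages(escalations, limit = 10) {
  if (escalations.length <= limit) {
    return escalations.map((e) => `🎫 ${e.category} | ${e.sentiment}\n${e.summary}`);
  }
  const lines = escalations.map(
    (e) => `• #${e.ticket_id} ${e.category} (${e.urgency}): ${e.summary}`
  );
  return [`🎫 ${escalations.length} escalations in the last minute\n${lines.join('\n')}`];
}
```

You may want critical tickets to bypass the digest and always post individually; that is left out here to keep the sketch short.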

Launch protocol

Don’t flip this on cold. Week one: run in shadow mode — disconnect the reply node, log what the bot would have sent, and have an agent grade 50 of them. Week two: enable auto-reply for the faq category only, threshold at 0.85. Then widen category by category, lowering the threshold only when the audit log earns it. The bot’s job is never to handle every ticket — it’s to make sure no human ever answers the same password-reset question twice while an angry enterprise customer waits in the queue.

Build an AI Customer Support Bot With n8n: Triage and Escalate