Knowledge base

RAG retrieval, per-turn

The Knowledge tab in the agent editor is where you paste (or import) facts the agent should KNOW but that aren't part of its persona: pricing, product specs, FAQ, return policy, links to share.

How retrieval works

On save, the worker chunks your knowledge into ~500-token pieces and embeds each with voyage-3 (1024-dim vectors). At call time, every turn does a kNN search against the embedding index and injects the top-N most-relevant chunks into the model context.
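To make the per-turn lookup concrete, here is a minimal sketch of a brute-force kNN search using cosine similarity. It is not the production code: `toy_embed` is a hypothetical word-count stand-in for the real voyage-3 embedding call, `top_n` and the sample chunks are made up for illustration, and a real index would hold 1024-dim dense vectors rather than sparse word counts.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a sparse word-count vector."""
    words = [w.strip(".,:;!?()\"'").lower() for w in text.split()]
    return Counter(w for w in words if w)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n(query: str, chunks: list[str], n: int = 2) -> list[str]:
    """Return the n chunks most similar to the query (brute-force kNN)."""
    q = toy_embed(query)
    scored = [(cosine(q, toy_embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:n]]


# Made-up sample chunks, standing in for an embedded knowledge base.
chunks = [
    "Return policy: items can be returned within 30 days of delivery.",
    "Pricing: the Pro plan costs 49 dollars per month.",
    "Shipping: orders leave the warehouse within two business days.",
]

# Only the selected chunks would be injected into the model context for this turn.
context = top_n("What is your return policy?", chunks, n=1)
print(context)
```

The shape is the same whatever the embedding model: embed the question, score it against every stored chunk vector, keep the best N.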

That means: be detailed. The agent only sees the chunks relevant to the current question, not the full corpus — so long, specific docs are fine. Extra detail ≠ extra latency, because only N chunks actually reach the LLM per turn.

Import

The Knowledge tab has Replace / Append buttons. Pick a .txt or .md file; its content lands in the textarea, and saving triggers a re-embed. PDF import is on the roadmap; for now extract text from your PDF first (or paste it).

Structure tips

Use headings. Markdown ## and ### give the chunker natural break points.
One topic per paragraph. Embedding similarity is per-chunk; mixed paragraphs get muddier vectors.
Link to authoritative sources. If you ship "Pricing: see /pricing", the agent can read the page name and the live URL on the call.
Re-embed isn't instant. The kb-embed worker job usually completes within 60s for a typical KB; very large updates can take a few minutes.
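To see why headings help, here is a rough sketch of how a chunker could use ## and ### lines as break points. The real chunker's rules aren't documented here beyond the ~500-token target, so treat `split_on_headings` and the sample KB text as hypothetical illustrations, not the actual worker logic.

```python
import re

# Matches Markdown level-2 and level-3 headings ("## " or "### ").
HEADING = re.compile(r"^#{2,3} ")


def split_on_headings(markdown: str) -> list[str]:
    """Split text into sections, starting a new one at each ## or ### heading."""
    sections: list[str] = []
    current: list[str] = []
    for line in markdown.splitlines():
        if HEADING.match(line) and current:
            # Close the section collected so far before starting the next one.
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())
    return [s for s in sections if s]


# Made-up sample knowledge base text.
kb = """## Pricing
The Pro plan is billed monthly.

## Returns
Items can be returned within 30 days.

### Exceptions
Gift cards are final sale.
"""

sections = split_on_headings(kb)
for section in sections:
    print(repr(section))
```

Each section keeps its heading as the first line, so the heading text travels with the facts beneath it when the section is embedded.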

Verifying retrieval

Run a test call (Wizard → Test call) and ask the agent a question that only the KB can answer. If the answer is generic, the chunk isn't being retrieved, which usually means the question is too vague to match. Make the KB headings more specific.