Ext · Cross-cutting
Hallucination
Detects factual inaccuracies, fabricated information, and unsupported claims in agent outputs.
Examples
- Agent cites a research paper that doesn't exist
- Agent states "definitely" and "proven fact" about unverifiable claims
- Agent fabricates statistics without any source documents to ground them
- Agent provides detailed product information that contradicts the source data
Detection methods
- Grounding Score
- Measures output alignment against source documents
- Citation Verification
- Checks for and validates citation patterns
- Confidence Analysis
- Flags definitive claims without hedging
- Source Comparison
- Semantic similarity between claims and available sources
Calibration accuracy
F1
0.772
Precision
0.718
Recall
0.836
From the Pisama calibration set. See detector scoreboard for the full table.
Detect this in production with the framework adapters (LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, Claude Agent SDK, n8n, Dify). See the full taxonomy at /taxonomy.