Ext · Cross-cutting

Hallucination

Detects factual inaccuracies, fabricated information, and unsupported claims in agent outputs.

Examples

  • Agent cites a research paper that doesn't exist
  • Agent states "definitely" and "proven fact" about unverifiable claims
  • Agent fabricates statistics without any source documents to ground them
  • Agent provides detailed product information that contradicts the source data

Detection methods

Grounding Score
Measures output alignment against source documents
Citation Verification
Checks for and validates citation patterns
Confidence Analysis
Flags definitive claims without hedging
Source Comparison
Semantic similarity between claims and available sources

Calibration accuracy

F1
0.772
Precision
0.718
Recall
0.836

From the Pisama calibration set. See detector scoreboard for the full table.

Detect this in production with the framework adapters (LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, Claude Agent SDK, n8n, Dify). See the full taxonomy at /taxonomy.