Glossary

Agent failure modes, defined

A taxonomy of every multi-agent failure mode Pisama detects, grouped by category. Each entry names the mode, explains the mechanism, gives examples, and lists detection methods with F1 scores from the calibration set.
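For readers unfamiliar with the metric, the F1 scores quoted per detection method are the harmonic mean of precision and recall. The sketch below shows the standard computation; the function name and the counts are hypothetical and are not taken from Pisama's calibration set.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Standard F1: harmonic mean of precision and recall.

    Equivalent to 2*TP / (2*TP + FP + FN). Returns 0.0 when the
    denominator is zero (no positives predicted or present).
    """
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0
    return 2 * true_positives / denominator


# Hypothetical counts for one detector on a labeled set (illustrative only).
score = f1_score(true_positives=40, false_positives=10, false_negatives=10)
print(f"F1 = {score:.2f}")
```

A detector that flags many false positives or misses many true failures scores lower, which is why F1 is a common single number for comparing detection methods.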

FC1

Planning

Task specification, decomposition, resource allocation, workflow design.

FC2

Execution

Derailment, withholding, coordination, and communication breakdown.

FC3

Verification

Output validation, quality gates, completion misjudgment, retrieval quality.

EXT

Cross-cutting

Behavioral patterns that span planning, execution, and verification: loops, persona drift, hallucination, injection, state corruption.
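The four categories above can be pictured as a small lookup structure. This is only an illustrative sketch: the enum, the dictionary, and the helper function are hypothetical names, not part of any Pisama API, and the topic strings are copied from the category descriptions on this page.

```python
from enum import Enum


class FailureCategory(Enum):
    FC1 = "Planning"
    FC2 = "Execution"
    FC3 = "Verification"
    EXT = "Cross-cutting"


# Topics listed under each category on this page (hypothetical structure).
CATEGORY_TOPICS = {
    FailureCategory.FC1: ["task specification", "decomposition",
                          "resource allocation", "workflow design"],
    FailureCategory.FC2: ["derailment", "withholding",
                          "coordination", "communication breakdown"],
    FailureCategory.FC3: ["output validation", "quality gates",
                          "completion misjudgment", "retrieval quality"],
    FailureCategory.EXT: ["loops", "persona drift", "hallucination",
                          "injection", "state corruption"],
}


def categories_covering(topic: str) -> list:
    """Return the categories whose topic list contains `topic` (case-insensitive)."""
    wanted = topic.strip().lower()
    return [cat for cat, topics in CATEGORY_TOPICS.items() if wanted in topics]


print(categories_covering("Hallucination"))
```

Looking up a topic returns the category code and name it falls under, mirroring how the glossary groups entries.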

See per-detector benchmark numbers at /benchmarks/detectors and framework-specific detector packs at /frameworks.