Glossary

Agent failure modes, defined

Name: Pisama
Author: Pisama

A taxonomy of every agent failure mode Pisama detects across single-agent, multi-agent, and sub-agent runs, grouped by category. Each entry names the mode, explains the mechanism, gives examples, and lists detection methods with F1 scores from the calibration set.
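For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below uses the standard definition; the function name and the counts are hypothetical and are not drawn from Pisama's calibration set.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 6 failures correctly flagged, 2 false alarms, 4 missed.
score = f1_score(tp=6, fp=2, fn=4)
print(round(score, 2))
```

A detector that flags everything gets high recall but low precision, and F1 penalizes that imbalance.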

FC1

Planning

Task specification, decomposition, resource allocation, workflow design.

Specification Mismatch
F1 0.70
Detects when task output doesn't match the user's original specification. Catches scope drift, missing requirements, language mismatches, and conflicting specifications.

Poor Task Decomposition
F1 0.73
Detects when task breakdown creates subtasks that are impossible, circular, vague, too granular, or too broad. Critical for complex multi-step agent workflows.

Resource Misallocation
Detects when multiple agents compete for shared resources, leading to contention, starvation, or deadlock. Common in parallel multi-agent architectures.

Inadequate Tool Provision
Detects when agents lack the tools needed to complete assigned tasks. Catches hallucinated tool names, missing capabilities, and suboptimal workarounds.

Flawed Workflow Design
F1 0.80
Detects structural problems in agent workflow graphs including unreachable nodes, dead ends, missing error handling, bottlenecks, and missing termination conditions.
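Two of the structural problems named under Flawed Workflow Design, unreachable nodes and dead ends, can be illustrated with a plain graph traversal. This is a minimal sketch and not Pisama's detector: it assumes the workflow is given as an adjacency dict, and the function name `check_workflow`, the node names, and the `terminals` parameter are all hypothetical.

```python
from collections import deque


def check_workflow(graph: dict[str, list[str]], start: str, terminals: set[str]) -> dict:
    """Report unreachable nodes and non-terminal dead ends in a workflow graph.

    graph maps each node to the nodes it can hand off to (assumed format).
    """
    # Collect every node, including ones that only appear as a successor.
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)

    # Breadth-first search from the entry node to find everything reachable.
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    unreachable = sorted(nodes - seen)
    # A dead end has no outgoing edges and is not a declared terminal node.
    dead_ends = sorted(n for n in nodes if not graph.get(n) and n not in terminals)
    return {"unreachable": unreachable, "dead_ends": dead_ends}


# Hypothetical workflow: "search" never hands off, and nothing routes to "archive".
workflow = {
    "plan": ["search", "draft"],
    "search": [],
    "draft": ["review"],
    "review": ["done"],
    "archive": ["done"],
}
report = check_workflow(workflow, start="plan", terminals={"done"})
print(report)
```

The same traversal could be extended to check that at least one terminal node is reachable from the start, which would cover the missing-termination case.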

FC2

Execution

Derailment, withholding, coordination, and communication breakdown.

FC3

Verification

Output validation, quality gates, completion misjudgment, retrieval quality.

EXT

Cross-cutting

Behavioral patterns that span planning, execution, and verification: loops, persona drift, hallucination, injection, state corruption.

Per-detector benchmark numbers are at /benchmarks/detectors. Framework-specific detector packs are at /frameworks.