Name: Pisama
Author: Pisama

Question 1

What is task derailment in AI agent systems?

Accepted Answer

Task derailment occurs when an agent goes off-topic or deviates from its assigned task. It is one of the most common failure modes (20% prevalence in MAST-Data).

Question 2

How does Pisama detect task derailment?

Accepted Answer

Semantic Similarity: Compares the embedding distance between the task description and the output.
Topic Drift Detection: Tracks topic focus using keyword clustering.
Task Substitution: Identifies when the agent addresses a related but different task.
Coverage Verification: Checks whether the core task requirements are addressed.
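As a rough illustration of the first and last methods above, here is a minimal sketch in Python. It is not Pisama's implementation. It swaps real embeddings for a bag-of-words cosine similarity so that it runs with the standard library alone, and the function names, the thresholds (0.2 and 0.5), and the word-matching rule for requirements are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity of bag-of-words counts (a stand-in for embeddings)."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def requirement_coverage(requirements, output):
    """Fraction of requirements whose words all appear in the output."""
    if not requirements:
        return 1.0
    out_tokens = set(tokenize(output))
    covered = sum(1 for r in requirements if set(tokenize(r)) <= out_tokens)
    return covered / len(requirements)


def looks_derailed(task, output, requirements,
                   sim_threshold=0.2, coverage_threshold=0.5):
    """Flag the output if it is dissimilar to the task or misses requirements.

    Both thresholds are hypothetical values chosen for this example.
    """
    similarity = cosine_similarity(task, output)
    coverage = requirement_coverage(requirements, output)
    return similarity < sim_threshold or coverage < coverage_threshold


task = "Summarize the quarterly sales report"
requirements = ["sales", "report"]
on_topic = ("The quarterly sales report shows revenue grew; "
            "this summary covers sales by region.")
off_topic = "Here is a recipe for banana bread with walnuts."

print(looks_derailed(task, on_topic, requirements))
print(looks_derailed(task, off_topic, requirements))
```

In a real detector the word counts would be replaced by vectors from an embedding model, and the thresholds would be tuned on labeled data rather than picked by hand.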

Question 3

How accurate is the task derailment detector?

Accepted Answer

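The detector scores F1 0.820, with precision 0.702 and recall 0.985, on the Pisama calibration set. In other words, it catches nearly all derailments (98.5%), but roughly 30% of the cases it flags are false alarms.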
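The reported F1 can be checked against the precision and recall, since F1 is their harmonic mean. The short Python check below uses only the numbers quoted above; the function name `f1_score` is just a label chosen for this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted for the Pisama calibration set.
f1 = f1_score(0.702, 0.985)
print(f"{f1:.3f}")  # prints 0.820
```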

Task Derailment

Examples

Detection methods

Calibration accuracy