OpenAI Agents SDK failure detection

The OpenAI Agents SDK introduced handoffs as a first-class primitive in 2025. Pisama detects the failures that handoff-based architectures introduce: handoff drift (agent A hands off but agent B never receives the context), run-loop corruption (state is mutated mid-handoff), and tool failure cascades that propagate across handoff boundaries.

The adapter instruments the `Runner` and emits OTel spans for every agent invocation, tool call, and handoff transition. It works with both the Responses API and Assistants API runners.

Detectors specific to OpenAI Agents SDK

  • Handoff drift
    Receiving agent missing context-critical fields from sender
  • Tool failure cascade
    F1 0.900: repeated tool errors across handoffs
  • State corruption
    F1 0.809: type/shape changes across run-loop turns
  • Loop detection
    F1 0.830: handoff cycles between a pair of agents
  • Persona drift
    F1 0.794: agent operates outside declared instructions
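
To make two of the detector descriptions above concrete, here is a minimal sketch of a type/shape comparison across run-loop turns and of a handoff-cycle check. It is not Pisama's implementation: the function names, the `threshold` parameter, and the representation of state as a plain dict and of handoffs as (sender, receiver) pairs are all assumptions made for illustration.

```python
from collections import Counter
from typing import Any


def shape_of(value: Any) -> Any:
    """Reduce a value to its type/shape skeleton, ignoring concrete contents."""
    if isinstance(value, dict):
        return {key: shape_of(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        # Keep the container type plus the sorted set of element type names.
        return [type(value).__name__, sorted({type(item).__name__ for item in value})]
    return type(value).__name__


def shape_changes(before: dict, after: dict) -> list[str]:
    """List top-level keys that vanished or changed shape between two turns."""
    changed = []
    for key in sorted(before):
        if key not in after:
            changed.append(f"{key}: removed")
        elif shape_of(before[key]) != shape_of(after[key]):
            changed.append(f"{key}: shape changed")
    return changed


def find_handoff_cycles(
    handoffs: list[tuple[str, str]], threshold: int = 3
) -> set[frozenset[str]]:
    """Flag agent pairs that hand off in both directions `threshold`+ times in total."""
    counts = Counter(handoffs)
    flagged = set()
    for (src, dst), forward in counts.items():
        backward = counts.get((dst, src), 0)
        if src != dst and backward > 0 and forward + backward >= threshold:
            flagged.add(frozenset((src, dst)))
    return flagged


# Hypothetical run-loop state before and after a turn.
turn_1 = {"order_id": "A-17", "items": [1, 2], "total": 30.0}
turn_2 = {"order_id": 17, "items": [1, 2]}
print(shape_changes(turn_1, turn_2))

# Hypothetical handoff log as (sender, receiver) pairs.
log = [("triage", "billing"), ("billing", "triage"), ("triage", "billing")]
print(find_handoff_cycles(log))
```

Comparing shapes rather than values keeps the check quiet when state contents change legitimately, and counting handoffs in both directions avoids flagging a single round trip as a loop.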

Install

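pip install pisama pisama-auto

import asyncio

from pisama.auto import instrument_openai_agents
from agents import Agent, Runner  # the openai-agents package installs the `agents` module

# Instrument once, before any agent runs.
instrument_openai_agents()

async def main():
    agent = Agent(name="...", instructions="...", handoffs=[...])
    result = await Runner.run(agent, input="...")

asyncio.run(main())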

FAQ

Does this work alongside the OpenAI tracing UI?
Yes. Pisama emits its own OTel spans and runs detectors locally; it does not replace OpenAI tracing. You can ingest the same spans into both backends.

How do you detect handoff drift?
When agent A invokes a handoff, Pisama snapshots the conversation state and the handoff payload. After agent B starts, the detector verifies that key entities (numbers, dates, IDs, items tagged CRITICAL) from A appear in B's context. Missing critical entities are flagged.
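
The check described in this answer can be sketched as a set difference over extracted entities. This is a minimal sketch, not Pisama's detector: the regular expression, the function names, and the use of plain strings for the payload and context are assumptions. Items tagged CRITICAL are left out because the tag format is not specified here.

```python
import re

# Hypothetical entity pattern: ISO dates, IDs such as ORD-1042, and plain numbers.
# The alternatives are ordered so a date or ID is matched whole rather than
# being split into its numeric parts.
ENTITY_RE = re.compile(
    r"\b(?:\d{4}-\d{2}-\d{2}|[A-Z]{2,}-\d+|\d+(?:\.\d+)?)\b"
)


def extract_entities(text: str) -> set[str]:
    """Collect the dates, IDs, and numbers that appear in a piece of text."""
    return set(ENTITY_RE.findall(text))


def missing_entities(sender_payload: str, receiver_context: str) -> set[str]:
    """Return entities present in the sender's payload but absent from the receiver's context."""
    return extract_entities(sender_payload) - extract_entities(receiver_context)


# Hypothetical handoff: agent A's payload and the context agent B starts with.
payload = "Refund order ORD-1042 for 49.99 placed on 2025-03-14."
context = "Customer wants a refund for order ORD-1042."

dropped = missing_entities(payload, context)
if dropped:
    print("handoff drift: missing", sorted(dropped))
```

Exact string matching is the simplest possible comparison; it would miss reformatted values such as a date written in another style, so a production detector would likely normalize entities first.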

See the full detector taxonomy at /taxonomy, benchmark numbers at /benchmarks, or compare against other observability stacks at /vs.