OpenAI Agents SDK failure detection

Name: Pisama
Author: Pisama

The OpenAI Agents SDK introduced handoffs as a first-class primitive in 2025. Pisama detects the failures that handoff-based architectures introduce: handoff drift (agent A hands off, but agent B never receives the context), run-loop corruption (state mutated mid-handoff), and tool failure cascades across handoff boundaries.

The adapter instruments the `Runner` and emits OTel spans for every agent invocation, tool call, and handoff transition. It works with both the Responses API and Assistants API runners.

Detectors specific to OpenAI Agents SDK

Handoff drift: receiving agent is missing context-critical fields from the sender
Tool failure cascade (F1 0.900): repeated tool errors across handoffs
State corruption (F1 0.809): type/shape changes across run-loop turns
Loop detection (F1 0.830): handoff cycles between an agent pair
Persona drift (F1 0.794): agent operates outside its declared instructions
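
To make two of the detectors above more concrete, here is a minimal sketch of how loop detection and state corruption checks could work in principle. It is not Pisama's implementation: the function names (`handoff_cycle`, `shape`, `shape_changes`), the `min_repeats` threshold, and the round-trip heuristic are all assumptions made for illustration.

```python
from __future__ import annotations

from collections import Counter
from typing import Any


def handoff_cycle(handoffs: list[tuple[str, str]], min_repeats: int = 2):
    """Return the agent pair that keeps bouncing control back and forth, if any.

    `handoffs` is an ordered list of (from_agent, to_agent) transitions.
    Hypothetical heuristic: a pair is cycling once both directions
    (A -> B and B -> A) have each occurred at least `min_repeats` times.
    """
    counts = Counter(handoffs)
    for (src, dst), n in counts.items():
        round_trips = min(n, counts.get((dst, src), 0))
        if src != dst and round_trips >= min_repeats:
            return tuple(sorted((src, dst)))
    return None


def shape(value: Any) -> Any:
    """Describe a value by its type names only, ignoring the data itself."""
    if isinstance(value, dict):
        return {k: shape(v) for k, v in value.items()}
    if isinstance(value, list):
        # Use the first element as a representative of the list's contents.
        return [shape(v) for v in value[:1]]
    return type(value).__name__


def shape_changes(before: dict, after: dict) -> list[str]:
    """List top-level keys whose type or shape differs between two turns."""
    a, b = shape(before), shape(after)
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


if __name__ == "__main__":
    hops = [("triage", "billing"), ("billing", "triage")] * 2
    print(handoff_cycle(hops))  # ('billing', 'triage')

    turn_1 = {"order_id": 42, "items": ["a"], "total": 9.5}
    turn_2 = {"order_id": "42", "items": ["a"], "total": 9.5}
    print(shape_changes(turn_1, turn_2))  # ['order_id']
```

Comparing shapes rather than values keeps the check cheap and avoids flagging ordinary data updates; only a change such as `order_id` turning from an integer into a string is reported.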

Install

pip install pisama pisama-auto

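import asyncio
from agents import Agent, Runner  # the openai-agents package installs the `agents` module
from pisama.auto import instrument_openai_agents

instrument_openai_agents()  # instrument before any agent runs

async def main():
    agent = Agent(name="...", instructions="...", handoffs=[...])
    result = await Runner.run(agent, input="...")

asyncio.run(main())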

FAQ

Does this work alongside the OpenAI tracing UI?

Yes. Pisama emits its own OTel spans and runs detectors locally; it does not replace OpenAI tracing. You can ingest the same spans into both backends.

How do you detect handoff drift?

When agent A invokes a handoff, Pisama snapshots the conversation state and the handoff payload. After agent B starts, the detector verifies that key entities (numbers, dates, IDs, items tagged CRITICAL) from A appear in B's context. Missing critical entities are flagged.
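
The handoff drift check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Pisama's actual detector: the regular expressions in `ENTITY_PATTERNS`, the `[CRITICAL]` tag syntax, and the function names are all hypothetical, and the presence test is a plain substring match.

```python
from __future__ import annotations

import re

# Hypothetical patterns. The categories (numbers, dates, IDs, CRITICAL-tagged
# items) come from the description above; how they are matched is an assumption.
# Order matters: more specific patterns run first so that a date or ID is not
# also counted as a set of bare numbers.
ENTITY_PATTERNS = {
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "id": re.compile(r"\b[A-Z]{2,}-\d+\b"),
    "critical": re.compile(r"\[CRITICAL\]\s*([^\n.;]+)"),
    "number": re.compile(r"\b\d+(?:\.\d+)?\b"),
}


def extract_entities(text: str) -> set[str]:
    """Collect key entities from the sender's snapshot."""
    entities: set[str] = set()
    remaining = text
    for pattern in ENTITY_PATTERNS.values():
        for match in pattern.finditer(remaining):
            # Use the capture group when the pattern has one, else the whole match.
            entities.add(match.group(match.lastindex or 0).strip())
        # Blank out what was matched so later, broader patterns skip it.
        remaining = pattern.sub(" ", remaining)
    return entities


def missing_entities(sender_snapshot: str, receiver_context: str) -> set[str]:
    """Return sender entities that do not appear in the receiver's context."""
    return {e for e in extract_entities(sender_snapshot) if e not in receiver_context}


if __name__ == "__main__":
    agent_a = (
        "Customer wants refund of 49.99 for order ORD-1234 placed 2025-03-14. "
        "[CRITICAL] do not offer store credit"
    )
    agent_b = "Refund request for order ORD-1234, amount 49.99."
    print(sorted(missing_entities(agent_a, agent_b)))
    # ['2025-03-14', 'do not offer store credit']
```

A substring match is deliberately simple and will miss paraphrased entities (for example, a date written in a different format); a production detector would likely need normalization before comparing.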

See the full detector taxonomy at /taxonomy, benchmark numbers at /benchmarks, or compare against other observability stacks at /vs.