Your agents have no accountability layer. We are it.
Auditability, scope containment, cascading-failure isolation, and regulator-grade retention — built once across LangGraph, OpenClaw, n8n, Dify, and Managed Agents. 53 production detectors, signed remediation, evidence packaged for every call legal will ever ask about.
What you can’t currently SLA
You can SLA uptime, latency, error rate. Correctness drift is the metric every customer cares about and no platform team can currently promise. These are the six gaps that show up in every production multi-agent post-mortem.
- Auditability
Every detection and every healing action is signed and replayable. When legal asks what the agent did and why, the trail exists in one query — not stitched together from three observability tools.
- Scope creep in prod
Agents drift toward more agency than they were designed for. Pisama baselines permissions and tool grants against the deploy snapshot, then trips when the running agent acquires capability it was not shipped with.
- Customer-facing sycophancy
The sycophancy detector (F1 0.902) flags agreement-shaped responses that contradict the underlying evidence. Brand damage from an agent telling a customer what they want to hear, caught at the turn it happens.
- Cascading failures
Graph-aware circuit breakers cap blast radius. One agent producing a bad output is contained at its downstream boundary instead of poisoning every consumer in the orchestration.
- Regulatory exposure
Every model call and tool invocation is retained for the full regulator window, not sampled and not aggregated. The specification-compliance detector (F1 0.966) flags drift from the written behaviour spec the moment it lands.
- No SLA on correctness
You can SLA uptime everywhere else. Correctness drift had no metric and no recourse. Pisama is that layer: a measurable, monitorable, reportable correctness signal customers can hold the platform to.
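The scope-creep check described above can be pictured as a set comparison between the grants recorded at deploy time and the grants observed on the running agent. The sketch below is illustrative only: `ScopeSnapshot`, `detect_scope_drift`, and the tool and permission names are hypothetical and are not Pisama's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopeSnapshot:
    """Hypothetical record of what an agent is allowed to do."""
    tools: frozenset
    permissions: frozenset


def detect_scope_drift(deployed: ScopeSnapshot, running: ScopeSnapshot) -> dict:
    """Return capabilities present at runtime but absent from the deploy snapshot."""
    return {
        "tools": set(running.tools - deployed.tools),
        "permissions": set(running.permissions - deployed.permissions),
    }


# Illustrative data: the running agent has picked up one tool and one permission.
deployed = ScopeSnapshot(frozenset({"search", "read_ticket"}), frozenset({"crm:read"}))
running = ScopeSnapshot(
    frozenset({"search", "read_ticket", "issue_refund"}),
    frozenset({"crm:read", "crm:write"}),
)

drift = detect_scope_drift(deployed, running)
tripped = any(drift.values())  # trip when anything new was acquired
```

Comparing against an immutable snapshot keeps the check one-directional: losing a capability is not flagged, only acquiring one that was never shipped.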
Framework-native detector suites
Six detectors per framework, each tuned to the failure modes that framework actually produces. Calibrated against golden traces from real production orchestrations, not synthetic benchmarks.
LangGraph
6 detectors · operates on StateGraph snapshots
- state_corruption: Required fields disappear or mutate between nodes
- edge_misroute: Conditional edges fire on stale state and misroute the run
- recursion: Graph recurses past safe depth before checkpointing
- checkpoint_corruption: Resumed checkpoint diverges from the persisted state hash
- parallel_sync: Concurrent branches write to the same key without a reducer
- tool_failure: Tool node returns a payload the next node cannot consume
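As a rough illustration of the checkpoint_corruption idea, the sketch below hashes a canonical serialization of the state when it is persisted and compares it on resume. It uses only the Python standard library and does not call LangGraph; `state_hash` and `checkpoint_diverged` are hypothetical helper names, and the state is assumed to be JSON-serializable.

```python
import hashlib
import json


def state_hash(state: dict) -> str:
    """Hash a JSON-serializable state dict using a canonical key order."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def checkpoint_diverged(persisted_hash: str, resumed_state: dict) -> bool:
    """True when the resumed state no longer matches the hash stored at persist time."""
    return state_hash(resumed_state) != persisted_hash


persisted = {"messages": ["hi"], "step": 3}
stored_hash = state_hash(persisted)

same_state = {"step": 3, "messages": ["hi"]}  # key order differs, content is equal
mutated = {"messages": ["hi"], "step": 4}     # one field changed after persist
```

Sorting keys before hashing means two dicts with the same content but different insertion order produce the same digest, so only real content changes register as divergence.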
OpenClaw
6 detectors · reads append-only session logs
- session_loop: Same prompt + tool sequence repeats with no state advance
- channel_mismatch: Subagent posts to a channel its parent is not listening on
- spawn_chain: Subagent spawn depth grows past sane orchestration bounds
- sandbox_escape: Tool call references a path or capability outside the sandbox
- elevated_risk: Privileged operation requested without matching consent record
- tool_abuse: High-cost tool invoked repeatedly inside one tick
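A session_loop check over an append-only log might look something like the following. This is a minimal sketch under stated assumptions: the `LogEntry` shape, the `state_version` counter, and the `min_repeats` threshold are all hypothetical, and the function counts identical entries anywhere in the log rather than only consecutive ones.

```python
from collections import Counter
from typing import List, NamedTuple, Tuple


class LogEntry(NamedTuple):
    prompt: str
    tools: Tuple[str, ...]
    state_version: int  # hypothetical counter that increases when state advances


def find_session_loops(log: List[LogEntry], min_repeats: int = 3) -> List[LogEntry]:
    """Return entries whose (prompt, tools, state_version) triple occurs min_repeats times or more."""
    counts = Counter(log)
    return [entry for entry, n in counts.items() if n >= min_repeats]


# Three identical turns at state version 1, then one turn after the state advanced.
log = [LogEntry("fetch order", ("lookup",), 1)] * 3 + [LogEntry("fetch order", ("lookup",), 2)]
loops = find_session_loops(log)
```

Including the state version in the key is what separates a loop from legitimate repetition: the same prompt and tools at a new state version count as progress.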
n8n
6 detectors · workflow graph analysis
- cycle_detection: Workflow contains a cycle that can never terminate cleanly
- complexity: Branching factor and depth exceed maintainable thresholds
- timeout: Node exceeds its bounded execution window
- schema: Downstream node receives a payload shape it cannot bind
- resource: Memory or queue pressure crosses the soft limit
- error: Untrapped error propagates past the catch boundary
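The structural half of cycle_detection can be sketched with a standard depth-first search over the workflow graph. The adjacency-dict representation and node names below are assumptions for illustration; the sketch finds a cycle if one exists but does not attempt the harder question of whether that cycle can terminate.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of node names, or None if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    colour = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        path.append(node)
        for nxt in graph.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path, so the path closes a cycle.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for start in graph:
        if colour[start] == WHITE:
            found = visit(start)
            if found:
                return found
    return None


workflow = {"webhook": ["transform"], "transform": ["http"], "http": ["transform"]}
cycle = find_cycle(workflow)
```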
Dify
6 detectors · workflow + RAG instrumentation
- iteration_escape: Iteration block exits before processing every input row
- classifier_drift: Intent classifier confidence collapses across recent runs
- rag_poisoning: Retrieved chunks contradict the source-of-truth dataset
- model_fallback: Primary model fails silently and the fallback degrades quality
- tool_schema_mismatch: Tool input schema diverges from the model output shape
- variable_leak: Workflow variable leaks across iteration scope boundaries
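To make tool_schema_mismatch concrete, here is a small sketch that compares a model-produced payload with a flat field-to-type schema. The schema format, the `schema_mismatches` helper, and the example fields are assumptions; a real tool schema would more likely be JSON Schema and need a proper validator.

```python
from typing import Dict, List


def schema_mismatches(schema: Dict[str, type], payload: dict) -> List[str]:
    """List the ways a payload diverges from a flat field -> type schema."""
    problems = []
    for field, expected in schema.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            actual = type(payload[field]).__name__
            problems.append(f"wrong type for {field}: expected {expected.__name__}, got {actual}")
    for field in payload:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems


tool_schema = {"city": str, "days": int}
model_output = {"city": "Oslo", "days": "3", "units": "metric"}  # days arrives as a string
problems = schema_mismatches(tool_schema, model_output)
```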
Managed Agents
6 detectors · OpenAI, Bedrock, Vertex
- session_corruption: Session memory writes a value its next read cannot parse
- session_stall: Agent stops emitting steps while still inside the assistant turn
- cost_overrun: Session cost crosses the configured ceiling mid-run
- environment_escape: Tool call references a resource outside the managed scope
- mcp_failure: MCP server connection or schema breaks during the session
- tool_permission: Tool invoked without the matching capability grant
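The cost_overrun condition is essentially a running total checked against a ceiling at every step, so the overrun is caught mid-run instead of at the end. The sketch below assumes per-step costs are available as a list; `first_overrun_step` and the figures are hypothetical.

```python
from itertools import accumulate
from typing import List, Optional


def first_overrun_step(step_costs: List[float], ceiling: float) -> Optional[int]:
    """Return the index of the first step at which cumulative cost exceeds the ceiling."""
    for index, total in enumerate(accumulate(step_costs)):
        if total > ceiling:
            return index
    return None


# Cumulative cost is 0.5, 0.75, 1.25, 2.25; the ceiling of 1.0 is crossed at step 2.
overrun_at = first_overrun_step([0.5, 0.25, 0.5, 1.0], ceiling=1.0)
```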
28+ multi-agent failure modes
On top of the framework-native suites, Pisama runs a layer of detectors that read multi-agent behaviour directly. Coordination and delegation across handoffs. Consensus collapse in voting ensembles. Persona drift between turns. Spawn loops in supervisor trees. Convergence plateaus across iterations.
- coordination: Two agents work against each other
- delegation: Task handoff loses critical context
- consensus_collapse: Voting agents converge on a wrong answer
- decomposition: Subtask split misses or duplicates work
- persona_drift: Agent breaks its role across turns
- workflow: Step order violates the workflow contract
- convergence: Metric plateau or thrashing across iterations
- entanglement: Agents share state they should not see
- agent_graph: Cycle or unreachable node in the agent graph
- subagent_spawn_loop: Subagent recursively spawns subagents
Plus 27 turn-aware variants of the single-agent detectors, so hallucination, grounding, and context checks register per step instead of per completion.
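The convergence detector listed above distinguishes a plateau from thrashing. One simple way to draw that line, shown here as a sketch with an assumed window size and tolerance, is to look at the spread of the most recent values and at whether successive changes keep flipping sign. `classify_convergence` and its thresholds are hypothetical.

```python
from typing import List


def classify_convergence(metric: List[float], window: int = 4, tol: float = 1e-3) -> str:
    """Label the last `window` values as 'plateau', 'thrashing', or 'progressing'."""
    if len(metric) < window:
        return "progressing"  # not enough history to judge
    recent = metric[-window:]
    if max(recent) - min(recent) < tol:
        return "plateau"
    deltas = [b - a for a, b in zip(recent, recent[1:])]
    # Count adjacent changes that point in opposite directions.
    flips = sum(1 for d1, d2 in zip(deltas, deltas[1:]) if d1 * d2 < 0)
    if len(deltas) > 1 and flips == len(deltas) - 1:
        return "thrashing"
    return "progressing"


flat = classify_convergence([0.5, 0.7, 0.8, 0.8, 0.8, 0.8])
oscillating = classify_convergence([0.5, 0.9, 0.4, 0.9, 0.4])
improving = classify_convergence([0.1, 0.2, 0.3, 0.4])
```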
Healing engine, risk-gated
Detection without action is just another dashboard. Every fix Pisama proposes carries a risk tier. SAFE fixes auto-apply. MEDIUM fixes auto-apply with inline verification. DANGEROUS fixes escalate to a human with the evidence packed in.
- SAFE
Config-only changes. Retry limits, circuit breakers, execution timeouts, loop breakers, checkpoint recovery, state validation. Auto-applied without human review because they cannot change agent behaviour beyond the bounded knob they touch.
- MEDIUM
Guardrails that change behaviour inside a known envelope. Exponential backoff, context pruning, summarization, window management, schema enforcement, deadlock prevention, task decomposition. Auto-applied with inline verification before the fix lands in the running orchestration.
- DANGEROUS
Anything that touches prompts, system messages, role boundaries, input filtering, or permission grants. Pisama packages the evidence, ranks candidate fixes, and escalates to a human. Approval logged, fix replayed against the trace before merge.
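The three tiers described above amount to a routing decision on each proposed fix. The sketch below shows one possible shape for that gate. `RiskTier`, `route_fix`, and the returned labels are hypothetical, and escalating a MEDIUM fix whose verification fails is an assumption, since the text does not say what happens in that case.

```python
from enum import Enum
from typing import Callable


class RiskTier(Enum):
    SAFE = "safe"
    MEDIUM = "medium"
    DANGEROUS = "dangerous"


def route_fix(tier: RiskTier, verify: Callable[[], bool] = lambda: True) -> str:
    """Decide what happens to a proposed fix based on its risk tier."""
    if tier is RiskTier.SAFE:
        return "auto_applied"
    if tier is RiskTier.MEDIUM:
        # Assumption: a failed inline verification hands the fix to a human.
        return "auto_applied" if verify() else "escalated"
    # DANGEROUS fixes always go to a human, regardless of verification.
    return "escalated"


safe_result = route_fix(RiskTier.SAFE)
medium_failed = route_fix(RiskTier.MEDIUM, verify=lambda: False)
dangerous_result = route_fix(RiskTier.DANGEROUS)
```

Keeping the tier as an explicit enum rather than a score makes the gate auditable: each fix carries exactly one of three labels, and the routing rule for each is a single branch.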
Pricing
Team and Enterprise tiers ship the framework-native detector suites, the multi-agent failure-mode layer, and risk-gated healing. See the full plan grid at /pricing.