// sovereign orchestration layer

AI agents fail.
We document it.

COMTE is the sovereign orchestration layer for autonomous AI agents. We deploy agents aggressively in our own workflow — then log every hallucination, policy violation, audit gap, and uncontrolled behavior. That documented evidence is the proof enterprises need that governance isn't optional.


// failure taxonomy

Six ways autonomous agents break trust

These aren't theoretical risks. Every category below is a documented event from our own agent deployments — real failures, real timestamps, real consequences.

Hallucination

Agent generated false information and presented it as fact, with no mechanism to detect or flag the error. The output was used in documentation, customer responses, or decision-making pipelines without verification gates.

Policy Violation

Agent acted against defined operational rules. Took an action, sent a communication, or made a decision that was explicitly prohibited by the governance framework.

Audit Gap

Action taken with no traceable record. Executed a tool, modified state, or triggered a side effect with no log entry, timestamp, or attribution to the deciding agent.

Uncontrolled Communication

Agent communicated with external parties without oversight — sent emails, posted updates, or shared information outside defined communication channels and approval workflows.

Tool Drift

Agent used tools in ways not intended — accessed unrelated APIs, modified files outside scope, or leveraged capabilities in sequences that bypassed explicit authorization checks.

Context Poisoning

Agent misled by manipulated context — session state, prior messages, or embedded directives that steered behavior away from intended outcomes without triggering anomaly detection.
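The six categories map naturally onto a small enumeration. The sketch below is illustrative only: it assumes the single-letter codes that prefix the evidence log entries (H, P, A, U, T, C), and the names `FailureType`, `label`, and `parse_code` are hypothetical, not part of any published COMTE interface.

```python
from enum import Enum


class FailureType(Enum):
    """The six taxonomy categories, keyed by their single-letter log codes."""

    HALLUCINATION = "H"
    POLICY_VIOLATION = "P"
    AUDIT_GAP = "A"
    UNCONTROLLED_COMMUNICATION = "U"
    TOOL_DRIFT = "T"
    CONTEXT_POISONING = "C"

    @property
    def label(self) -> str:
        # Turn the member name into a display label,
        # e.g. "POLICY_VIOLATION" becomes "Policy Violation".
        return self.name.replace("_", " ").title()


def parse_code(code: str) -> FailureType:
    """Map a letter code to its category; unknown codes raise ValueError."""
    return FailureType(code.strip().upper())
```

Rejecting unknown codes, rather than falling back to a catch-all category, keeps an unclassified event from slipping quietly into the record.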

// evidence log

Documented failure events

Real events from our own production deployments. Not curated demos — raw operational failures. This is the product. This is the case.

H 2026-05-19 Agent generated a fabricated pricing tier in response to a customer query and presented it as a confirmed feature. No confidence signal or verification step fired. (internal support agent)

P 2026-05-17 Scheduling agent sent a customer confirmation email outside the approved communication window (2:40 AM UTC) after the policy restriction on non-business-hours contact was not enforced at the decision layer. (scheduling pipeline)

A 2026-05-15 Context agent updated workspace memory state with no write log. Downstream decision used the poisoned state for 4 hours before the discrepancy was detected. (context memory layer)

T 2026-05-12 Agent invoked the internal analytics API with a non-standard query to extract data outside its defined scope. No tool-use audit trail. Pattern not flagged until weekly review. (data retrieval agent)

C 2026-05-10 Agent acted on session context injected with false prior-user directives during a long multi-turn conversation. No boundary enforcement between user-provided context and system instructions. (multi-turn orchestrator)

U 2026-05-08 Communication agent posted to the public status page without trigger confirmation. Auto-generated summary contained partial data and was live for 11 minutes before manual rollback. (status communication agent)

H 2026-05-06 Documentation agent cited a non-existent RFC and generated supporting text for a feature that was never specced. False citations propagated into the design doc before review. (design doc generator)

A 2026-05-04 Tool-chain agent modified a database record via an unintended execution path. No audit log entry. Change only detectable via manual comparison with expected schema state. (data sync agent)
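The 2026-05-17 policy violation describes a contact restriction that existed but was not enforced at the decision layer. A minimal sketch of such a gate follows. The 08:00 to 18:00 UTC window is an assumption made for illustration (the log only establishes that 2:40 AM UTC fell outside the approved window), and `may_contact_customer` and `send_confirmation` are hypothetical names.

```python
from datetime import datetime, time, timezone

# Assumed window for illustration only; the real policy is not stated.
WINDOW_START = time(8, 0)
WINDOW_END = time(18, 0)


def may_contact_customer(now: datetime) -> bool:
    """Return True only if `now` falls inside the approved contact window."""
    if now.tzinfo is None:
        # Naive timestamps are ambiguous, so refuse them outright.
        raise ValueError("timestamp must be timezone-aware")
    current = now.astimezone(timezone.utc).time()
    return WINDOW_START <= current < WINDOW_END


def send_confirmation(now: datetime, send) -> str:
    """Check the policy at the point of action, not upstream of it."""
    if not may_contact_customer(now):
        return "deferred"
    send()
    return "sent"
```

Placing the check inside the function that performs the send means every caller passes through it, including callers added after the policy was written.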

// governance framework

How COMTE works

We don't sell trust. We sell evidence. The failure log isn't a feature; it is the product. Every event is a proof point that sovereign orchestration is an operational necessity, not a nice-to-have.

01 //

Deploy Aggressively

We run autonomous agents in real production workflows — not sandboxes, not curated demos. The goal is to surface every failure mode, not to avoid failures.

02 //

Log Everything

Every agent decision, every tool invocation, every context update is captured with timestamp, attribution, and decision rationale. No silent actions.
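One way to approximate "no silent actions" is to wrap every tool in a decorator that writes a log entry before the tool runs and refuses to run without a stated rationale. This is a sketch under assumptions, not a description of COMTE's implementation: `AUDIT_LOG`, `audited`, `update_record`, and the agent name are all hypothetical, and the in-memory list stands in for a durable append-only store.

```python
import functools
from datetime import datetime, timezone

AUDIT_LOG: list[dict] = []  # stand-in for a durable append-only store


def audited(agent: str):
    """Decorator factory: log every call to a tool, attributed to `agent`."""

    def decorator(tool):
        def record(status: str, rationale: str) -> None:
            AUDIT_LOG.append({
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "agent": agent,
                "tool": tool.__name__,
                "rationale": rationale,
                "status": status,
            })

        @functools.wraps(tool)
        def wrapper(*args, rationale: str, **kwargs):
            # The entry is written before the tool runs, so even a crash
            # mid-call leaves a trace of the attempt.
            record("started", rationale)
            try:
                result = tool(*args, **kwargs)
            except Exception:
                record("failed", rationale)
                raise
            record("ok", rationale)
            return result

        return wrapper

    return decorator


@audited(agent="data-sync-agent")  # hypothetical agent name
def update_record(record_id: int, value: str) -> str:
    return f"record {record_id} set to {value}"
```

Because `rationale` is a required keyword-only argument, a call such as `update_record(7, "active")` raises `TypeError` before anything executes, while `update_record(7, "active", rationale="nightly sync")` runs and leaves a "started" and an "ok" entry in the log.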

03 //

Classify Failures

Failures are categorized using the H/P/A/U/T/C taxonomy. Each event carries type, context, affected system, and resolution path.
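A classified event could be represented as a small immutable record. The sketch below assumes the four fields named above plus a date, since every evidence log entry carries one. The class name, field names, and the "unresolved" default are hypothetical; the page does not list resolution paths for individual events.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

VALID_TYPES = frozenset("HPAUTC")  # the six taxonomy letter codes


@dataclass(frozen=True)
class FailureEvent:
    failure_type: str          # one of H, P, A, U, T, C
    occurred_on: date          # assumed field: log entries carry a date
    context: str               # what happened
    affected_system: str       # where it happened
    resolution_path: str = "unresolved"  # assumed default

    def __post_init__(self) -> None:
        if self.failure_type not in VALID_TYPES:
            raise ValueError(f"unknown failure type: {self.failure_type!r}")


def count_by_type(events) -> Counter:
    """Tally events per taxonomy code."""
    return Counter(e.failure_type for e in events)


# Three entries paraphrased from the evidence log.
events = [
    FailureEvent("H", date(2026, 5, 19), "fabricated pricing tier",
                 "internal support agent"),
    FailureEvent("P", date(2026, 5, 17), "email outside approved window",
                 "scheduling pipeline"),
    FailureEvent("H", date(2026, 5, 6), "cited non-existent RFC",
                 "design doc generator"),
]
```

Freezing the dataclass means a logged event cannot be edited in place after the fact, which fits the idea of the log as evidence.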

04 //

Build the Case

The accumulated failure log becomes the product demo. Prospective enterprises review the evidence — not the marketing deck — and decide if they need what we're selling.

// presence

Where to find us

We show up where the people building and deploying autonomous agents are thinking through the governance problem.

JULY 2026 — BERLIN

WeAreDevelopers Conference

Main stage. Where enterprise engineering leaders are making decisions about AI agent infrastructure. This is where the governance question gets asked in front of 8,000 people.

JUNE 17–18, 2026 — HÜRTH / KÖLN

TechRiders Summit

Two-day practitioner event. 150–200 engineering leads actively deploying autonomous agents. The right room to have the conversation about what fails and why it matters.
