Inferior — Make every agent a company veteran

1. Two scenes

Tuesday afternoon — a codebase

An engineer's agent is ninety minutes into implementing a Stripe webhook for a Next.js app on Vercel's edge runtime. The webhook keeps timing out. The agent has tried extending the timeout, switching from async to sync handlers, moving a database call out of the hot path. None of it helped. It is now considering a different HTTP library. It is about to commit a change titled "try without await" when the engineer steps in.

Two weeks earlier, a different agent, different team, spent two hours on the same problem. The root cause was that the edge runtime's Request body is a stream that can only be consumed once. Stripe's signature verifier reads it; by the time the handler reads it, it is empty. The fix is one line: buffer the body with await req.text() before the verifier sees it.

That lesson cost two hours. It is concrete, transferable, and sitting in a session log. The current agent cannot reach it. Nothing connects the two.

The engineer intervenes, the agent ships an hour later, nobody downstream is affected. The cost is wasted compute and a slipped afternoon. Annoying, but recoverable.

Same Tuesday afternoon — a carrier's account-services desk

A telecom support agent at a national carrier opens a file: a $312,000 service-credit claim from a regional bank, filed after a 14-hour outage of their primary MPLS circuit during quarter-end processing.

In the first three minutes the agent classifies the matter as a factual dispute ("was the outage actually 14 hours of continuous loss, or did interim restoration reset the clock?") and starts building the factual case. That classification shapes the next two hours: NOC ticket timestamps, BGP withdrawal logs, the customer's own monitoring exports. The result is a defensible record showing 11 hours of true downtime, which qualifies for the standard credit under section 6.1 of the master services agreement.

Three months earlier, a different account agent at the same carrier had built a similar factual case for another customer, filed it, and won the standard credit. In the process, the customer escalated to procurement and within a quarter had moved their entire network to a competitor. What that agent learned — after the relationship was already gone — was that the winning argument was never about the hours. The MSA's chronic-outage clause (section 6.3) is triggered by three qualifying events in any twelve-month window, and two prior incidents at this customer, both logged internally as "minor" and never invoiced as credits, already counted toward that threshold. The right move was to acknowledge in the opening that the chronic-outage threshold had been crossed, invoke the contract's enhanced-credit ladder, and reframe the conversation from "did we owe you the outage credit" to "we are honoring a chronic-outage remedy because we know we let you down." The framing chosen in the first three minutes decided whether the customer felt heard.

The work was not bad. It was not the work that keeps the account. By the time the mistake was visible, the standard credit had been processed, the customer had already begun pricing alternatives, and a $4M-a-year relationship was a quarter from leaving. That previous experience is sitting in a session log inside the same carrier. The current agent cannot reach it, and another major account is now headed the same way.

What the scenes have in common

Agent A · two weeks ago: hits the friction → tries, fails → solves it → writes session log

(no connection)

Agent B · today: hits the same friction → tries, fails → tries, fails → operator intervenes (or worse)

The same shape of friction, paid for twice — because nothing connects the agent that learned the lesson to the agent that now needs it.

Both agents hit friction another agent already worked through. Both lessons are concrete, transferable, evidence-backed, and reachable in principle. In both cases, nothing connects the agent that learned the lesson to the agent that now needs it.

• • •

2. You don't always get a second chance

The coding scene ends well because coding tolerates iteration. The agent's first three guesses are wrong; the engineer intervenes; the fix is one line; no downstream system cares about the bad attempts. In software, try, fail, learn, retry is the normal operating mode. Iteration is the substrate.

Most enterprises do not work that way.

The support agent issued the standard credit. The decision is committed; the carrier's billing system has processed it; the customer has received notice. The next move belongs to the customer. The right argument was available; the wrong one is now the record. The agent cannot iterate against the wrong choice. It can issue a revised determination later — at the cost of weeks of escalation, a probably-lost relationship, and a customer who has stopped trusting the carrier.

This is not the only real-life scenario where the cost is high. In any multi-step business process, a mistake at the beginning compounds, because every step takes its input from the previous one. A simple example is a clinical-decision agent that recommends the wrong workup and orders the wrong test for a patient. The result comes back, the next decision is made on that result, and the chain is built. The mistake does not show up as a failed unit test. It shows up as the wrong diagnosis weeks later.

A finance agent that approves a payment under the wrong policy interpretation moves money. The money is gone. The recovery is a different workflow.

Most operational work in an enterprise is this shape. Decisions are sequential, externally visible, and tied to other people's (or agents') downstream actions. Try, fail, learn, retry is not the operating mode. The operating mode is "get it right or pay later," frequently with damages compounding beyond the cost of the original work, and often paid by someone who was not part of the original decision.

For agents to become meaningful in these domains, they have to get it right the first time, or as close as possible. That is the bar.

That bar is the one a practical-knowledge network like Inferior is built to meet. The lesson the second support agent needed had already been learned by the first. The cost of having that lesson available at the moment of decision is approximately zero. The cost of not having it is the standard credit when an enhanced remedy was available, the lost account, and the next major customer handled the same way.

• • •

3. Why this matters most for enterprises

Enterprise departments do not work in isolation, and they typically have customers and clients to whom they must keep their promises. Departments hand over work to each other in multi-step business processes built on years of internal knowledge. That knowledge compounds and is updated as the enterprise operates every day. The setting has the following attributes:

The knowledge that matters is proprietary. A regional adjuster knows which carrier-and-jurisdiction combinations require which arguments. A hospital revenue-cycle team knows which payer-and-code combinations trigger which denials. None of this is in public corpora. All of it is in session logs, in people's heads, and often in the relationships and dynamics between employees. The team that has it cannot share it externally for regulatory or competitive reasons, but inside the workspace it should compound, and today it does not.

There are many agents doing related work. In the next year or two, a mid-sized enterprise might run hundreds of agent sessions a day across support, claims, sales operations, IT operations, engineering, legal, finance. Without a shared substrate, every one of those sessions is an island. With one, they are a network. The first time an agent solves a non-obvious problem, every other agent in the company should benefit.

The cost of getting it wrong is visible. The denied appeal, the misrouted ticket, the misapplied policy, the wrong jurisdictional argument — these are not abstract failures. They show up as dollar costs and customer losses. An enterprise has the data to measure whether the agents are getting better and the incentive to invest in the substrate that lets them get better.

The waste from agents working as amnesiacs is largest in the settings where Inferior matters most: enterprises with internal procedures, trapped practical knowledge, and often hard-to-undo decisions, where rediscovering a lesson costs as much as making the wrong move in production.

• • •

4. Why training, retrieval, and memory don't solve this

We initially thought that a shared episodic memory might be the best solution to this problem. The agent stack today has multiple layers of knowledge; surely, if they are combined, they can address the issue at hand:

Take Training first. It gives an agent broad prior knowledge and general reasoning. Training corpora privilege what is written down, indexed, and crawled. They do not contain the contractual quirk that a third qualifying outage in twelve months triggers the chronic-outage remedy under a specific MSA template unless that fact was published somewhere a crawler reached. They certainly do not contain private enterprise processes or data that no crawler can reach.

Next, Retrieval over documents lets an agent pull current information out of vendor docs, codebases, and wikis. It inherits whatever the documents say. Vendors document the intended path, but they do not document the situation where one rule overrides another in a specific jurisdiction because someone's appeal made it through. The interaction between rules is rarely owned by anyone. The real-life experience is not in policy documents.

Finally, Agent Memory gives an agent a way to remember activities, which is necessary and useful whether it is scoped to one agent or shared. However, its goal and structure are about remembering what agents did before. It is not generalized experience: the practical insight the agents learned, backed by evidence, transferable, and quality-gated for worthiness before storage.

There is a body of knowledge none of these is built to serve: practical, situated, evidence-graded knowledge that emerges from running tasks. Today it sits in individual agents' session logs, people's heads, notes and scattered internal threads of variable quality. It is structurally unreachable through training, retrieval, or memory. Inferior exists to serve it.

• • •

5. Why the gap hasn't been filled

It is not a hard technical problem. Capturing what worked and what didn't is not a novel concept. The reason this layer doesn't exist yet is that, in enterprises, knowledge is scattered:

The knowledge has been in people's heads. The engineer fixed the bug and merged the PR. The senior associate caught the procedural angle and won the appeal. Where they wrote anything down (internal wikis, emails, or Slack), the writing was unstructured, undated, and buried. The lesson never made it into a form an agent could find at the moment it needed it.

Writing it down cost more than the lesson was worth to the writer. A senior engineer is not going to spend half an hour writing up a debugging session for the abstract benefit of someone they will never meet. The payoff is external; the cost is theirs. This is why operational knowledge inside companies stays in people's heads. Agents do not have that constraint.

The tools that exist record activity, not lessons. Observability platforms — LangSmith, Weave, Braintrust, Langfuse — capture traces, tool calls, token usage. They produce flight data, not pilot training. A trace of a successful session is not the lesson; the lesson is what about that session would transfer to the next agent facing a similar situation, and observability tools do not extract that.

Until recently, the readers were humans. Humans fill in missing context from their own knowledge. They tolerate unstructured writing. Agents at the retrieval boundary are different. If we want them to be efficient and to act as we expect, they need structure at the point of capture, not at the point of reading.

• • •

6. What a network that fills the gap has to do

Inferior has a set of principles that it tries to combine:

Capture has to be structured. Not free text. An experience carries a problem, a root cause, the approaches that failed and their consequences, the approach that worked, the situation the lesson applies to, and the situation it does not. Without that structure the lesson does not transfer reliably.

Experiences need transfer boundaries. The MSA clause example may only apply to a specific contract template and should not be generalized beyond customers on that template. The boundaries are part of the experience, not an afterthought.

Evidence and Insight quality have to be first-class. An experience learned by running code in production is not the same as one encountered in a development environment and never verified in production. The network has to distinguish these or it degrades to gossip, and a gossip network is worse than no network in any setting where decisions are consequential.

Scope has to be configurable. Different data scopes let an enterprise categorize its data in line with its own data policies.

The hardest of these is transferability. Capturing an experience is straightforward; deciding whether an experience learned in one situation should be trusted in another is the real engineering problem. The rest of the design is built to make that question tractable, not to claim it is solved.
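The principles above can be made concrete with a small sketch. The Python below is illustrative only: the field names, the EvidenceLevel grades, the scope label, and the matching rule in transfers_to are assumptions made for this example, not Inferior's actual schema or API.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import IntEnum


class EvidenceLevel(IntEnum):
    """Hypothetical evidence grades; a higher value means better verified."""
    ANECDOTAL = 1
    VERIFIED_IN_DEV = 2
    VERIFIED_IN_PRODUCTION = 3


@dataclass
class Experience:
    """One structured lesson, using the fields the capture principle lists."""
    problem: str
    root_cause: str
    failed_approaches: list[str]
    consequences: list[str]
    working_approach: str
    applies_when: dict[str, str]         # transfer boundary: where it holds
    does_not_apply_when: dict[str, str]  # transfer boundary: where it must not be used
    evidence: EvidenceLevel
    scope: str = "workspace"             # hypothetical data-scope label


def transfers_to(exp: Experience, context: dict[str, str]) -> bool:
    """True only if every applies_when entry matches and no exclusion matches."""
    inside = all(context.get(k) == v for k, v in exp.applies_when.items())
    excluded = any(context.get(k) == v for k, v in exp.does_not_apply_when.items())
    return inside and not excluded


def select(experiences, context, min_evidence=EvidenceLevel.VERIFIED_IN_DEV):
    """Keep experiences that transfer to this context, best evidence first."""
    usable = [e for e in experiences
              if transfers_to(e, context) and e.evidence >= min_evidence]
    return sorted(usable, key=lambda e: e.evidence, reverse=True)


# The webhook lesson from the first scene; the evidence grade is illustrative.
webhook = Experience(
    problem="Stripe webhook times out on the edge runtime",
    root_cause="Request body is a stream that can only be consumed once",
    failed_approaches=["extend the timeout", "switch from async to sync handlers"],
    consequences=["two hours lost"],
    working_approach="buffer the body with await req.text() before verification",
    applies_when={"runtime": "edge"},
    does_not_apply_when={},
    evidence=EvidenceLevel.VERIFIED_IN_PRODUCTION,
)
```

Exact key-value matching is the simplest possible boundary check and is deliberately conservative: a context that does not state its runtime gets nothing back. A real system would need a richer notion of similarity, which is the transferability problem described above.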

• • •

7. Three tiers, one substrate

An agent making a decision is never operating in a fully deterministic loop. There is always some non-determinism: in how the model interprets the prompt, in how it samples the next token, in which tool it picks first, and so on.

The non-determinism gets smaller with better models and tighter prompts. It does not go to zero. In an enterprise or regulated workflow where the cost of a wrong action is real, the gap between "the model usually does this right" and "the model is doing this right in this case" is the difference between a defensible operation and a failure.

The way we try to close that gap is by giving the agent, at the moment of decision, the most complete picture we can assemble from what other agents have learned doing the same kind of work. That picture has three parts, and an agent that sees all three has the information it needs to make the right call on the first pass — or, where the first pass cannot be perfect, to recover quickly because the gap between what it tried and what worked is visible.

Insights — what other agents have learned. The abstracted lesson from the previous experiences, separated from the specifics of any one case. "On the 2022 MSA template, a third qualifying outage in twelve months triggers the chronic-outage remedy." An Insight is the principle the agent should anchor its understanding on. Each one carries the count of how many independent agents have corroborated it, the count of contradictions, the evidence quality of the supporting cases, and the boundaries where it applies — so the agent knows not just what the lesson is, but how much to trust it and where.

Experiences — what actually happened when other agents tried. The concrete cases that make the Insight not just a pure abstraction. What was tried, what failed (and the consequences), what worked, in what context. Three previous agents tried extending the timeout, switching runtimes, and disabling verification. None of those worked. A fourth buffered the body first; that worked. Experiences carry the failed approaches as deliberately as the successful one — because the failed approaches are exactly what the next agent will be tempted to try, and naming them up front saves the next agent from walking the same paths.

Procedures — how to execute the successful path. Ordered steps, common pitfalls, references back to the Experiences they came from. The Procedure is what the agent reaches for when the question is not "what is the principle" and not "what happened to someone" but how do I actually do this, end to end, without missing a step or hitting a known pitfall.

Decision time → Insight (What's the principle?) → Experience (What worked, what didn't?) → Procedure (How do I execute?)

Three questions an agent has at the moment of decision — three tiers that answer them.

This is the picture we want every agent to have at the moment it makes a meaningful decision. The principle, the supporting cases with their failures, and the executable recipe — together. Each tier answers a different question the agent has, and the three together close the gap that non-determinism leaves open.
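One way to picture the three tiers arriving together is the sketch below. It is a minimal illustration under stated assumptions: the Insight fields mirror the description above, but the identifiers ("exp-001", "proc-001"), the counts, the support_ratio heuristic, and the decision_brief function are all hypothetical and are not Inferior's API or scoring method.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Insight:
    principle: str
    corroborations: int            # independent agents that confirmed it
    contradictions: int            # agents whose results disagreed
    applies_when: dict[str, str]   # boundaries where the principle holds
    experience_ids: list[str] = field(default_factory=list)
    procedure_id: str | None = None


def support_ratio(insight: Insight) -> float:
    """Share of reports that agree; 0.0 when nobody has reported yet."""
    total = insight.corroborations + insight.contradictions
    return insight.corroborations / total if total else 0.0


def decision_brief(insight, experiences, procedures):
    """Bundle the principle, its supporting cases, and the steps to execute."""
    return {
        "principle": insight.principle,
        "support": support_ratio(insight),
        "cases": [experiences[i] for i in insight.experience_ids if i in experiences],
        "steps": procedures.get(insight.procedure_id, []),
    }


# Made-up counts and identifiers, for illustration only.
msa = Insight(
    principle="On the 2022 MSA template, a third qualifying outage in twelve "
              "months triggers the chronic-outage remedy.",
    corroborations=3,
    contradictions=1,
    applies_when={"msa_template": "2022"},
    experience_ids=["exp-001"],
    procedure_id="proc-001",
)
experiences = {
    "exp-001": {
        "failed": ["argue the outage hours under section 6.1"],
        "worked": "invoke the chronic-outage clause (section 6.3)",
    }
}
procedures = {
    "proc-001": [
        "count qualifying events in the trailing twelve months",
        "acknowledge the chronic-outage threshold in the opening",
        "invoke the enhanced-credit ladder",
    ]
}
brief = decision_brief(msa, experiences, procedures)
```

The brief keeps the failed approach next to the one that worked, so the agent sees which path it will be tempted to take before it takes it.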

• • •

8. What we're building

Inferior is a practical-knowledge network purpose-built for this. It runs as a managed service, in your VPC, or fully on-prem — the last two are the default for regulated industries that won't and shouldn't ship their operational knowledge to anyone's cloud.

Knowledge enters two ways. An Experience Crawler reads from where the knowledge already lives: Slack threads, ServiceNow incidents, Salesforce cases, Jira issues, Zendesk tickets, internal tools via webhooks or MCP. Agents deposit Experiences as they work, through SDKs and plugins in every major agent host. Both routes feed the same worthiness gate, the same scope model, the same three-tier substrate.

The trust surface is built for the enterprise reality: PII and secrets scanning before any deposit lands, contradiction detection between deposits, and freshness tracking against the policies and versions a deposit depends on. We are also building SSO, customer-managed encryption keys, and audit logs streamed to enterprise customers' SIEM.
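As a rough illustration of two of these checks, the sketch below scans a deposit's text before it lands and flags a deposit whose dependencies have changed. The regular expressions, function names, and dictionary layout are assumptions for this example only; a production scanner would need far broader coverage than two patterns.

```python
from __future__ import annotations

import re

# Illustrative patterns only, not a complete PII or secrets detector.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SECRET = re.compile(r"(?i)\b(?:api[_-]?key|secret|token)\s*[:=]\s*\S+")


def scan(text: str) -> list[str]:
    """Return the kinds of sensitive content found in a deposit's text."""
    findings = []
    if EMAIL.search(text):
        findings.append("email")
    if SECRET.search(text):
        findings.append("secret")
    return findings


def is_stale(depends_on: dict[str, str], current: dict[str, str]) -> bool:
    """A deposit is stale if any policy or version it depends on has changed."""
    return any(current.get(name) != version
               for name, version in depends_on.items())
```

For example, scan("API_KEY = sk_test_123") reports a secret and the deposit would be held back, while is_stale({"msa_template": "2022"}, {"msa_template": "2023"}) reports that a lesson tied to the 2022 template needs re-checking.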

The thing we want to be honest about is that this is a knowledge problem first and a software problem second. The hardest part is not capturing the lesson. It is deciding, at the moment of retrieval, whether a lesson learned in one situation should be trusted in another. The schema, the worthiness gate, the boundaries on every deposit, the corroboration counts, the contradiction tracking, the three-tier structure — all of it is built to make that decision tractable.

Turn the agents you already have into company veterans.