Why API keys are the wrong auth primitive for AI agents

The key in the config file

You have built an agent that calls a real API. Maybe it books calendar slots, submits purchase orders, or triggers a deployment. At some point you needed a credential, and the service offered the path of least resistance: generate an API key, paste it in your environment, move on.

The agent works. The integration is alive. But something nags. That key has no expiry. It can do everything your account can do. If the agent misbehaves - or if the key leaks - the blast radius is your entire account. You put a note in the README to rotate it quarterly and tried not to think too hard about it.

This is not a story about poor security hygiene. It is a story about the wrong primitive. API keys were designed for a world where the caller is a service, the action is bounded by the service's own RBAC, and "authenticated" is close enough to "authorised." None of those assumptions hold when the caller is an autonomous agent acting on behalf of a human principal who is not present, may not have anticipated this specific action, and has no easy way to revoke what they already set in motion.

What a key actually says

An API key is a shared secret. When you present it, the server confirms: the caller possesses this secret. That is the full content of the authentication claim.

It does not say who the human principal is. It does not say what action they intended. It does not say how much they are willing to spend, which resources they've consented to touch, or when that consent expires. The key is bearer-based: whoever holds it, the server treats as authorised.

For a human developer querying their own data, that is fine. The developer decided to make the call; the call carries the key; the server executes. The chain of intent is short and fully in the developer's head.

For an autonomous agent, the chain is broken. The human set up the agent hours or days ago. They said "book me a hotel in Paris for next week" and closed the laptop. The agent is now acting - making calls, spending money, modifying state - and the only credential in the request is a secret that proves nothing about whether the principal anticipated this specific action at this specific moment.
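To make the point concrete, here is a minimal sketch of what a key check amounts to on the server side. The key store, account ID, and handler are hypothetical and not taken from any real service; the only purpose is to show that possession of the secret is the entire check.

```python
import hmac
from typing import Optional

# Hypothetical key store mapping secret -> account. Illustrative values only.
API_KEYS = {"sk_live_example": "acct_123"}


def authenticate(presented_key: str) -> Optional[str]:
    """Return the account that owns the key, or None if the key is unknown."""
    for known_key, account in API_KEYS.items():
        # Constant-time comparison of the shared secret. This is the whole claim.
        if hmac.compare_digest(presented_key, known_key):
            return account
    return None


def handle_request(presented_key: str, action: str, amount_eur: float) -> str:
    account = authenticate(presented_key)
    if account is None:
        return "401 unauthorized"
    # Nothing in the key constrains which action runs or how much it costs.
    return f"200 ok: {account} performed {action} for EUR {amount_eur:.2f}"
```

In this sketch, `handle_request("sk_live_example", "cancel_all_bookings", 99999.0)` is accepted just as readily as a single 180 euro booking, because the key says nothing about either.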

The OAuth interlude

OAuth 2.0 improves things. Scopes express coarse intent. Tokens expire. The authorization server can revoke them. Many teams reach for OAuth when they realise API keys are too coarse, and it is a genuine improvement.

But OAuth was designed around a human being present at the authorization step. The canonical flow: the user is redirected to the authorization server, they log in, they consent to a list of scopes, they are redirected back. The access token that results reflects a deliberate act by a present, identified human.

When you use OAuth in an agentic context, one of two things happens. Either you do a one-time consent and hand the agent a long-lived refresh token - in which case you are back to a long-lived bearer secret, with extra steps - or you use the client credentials grant, which drops the human consent step entirely and gives you service-account semantics: a machine credential with no human principal attached.

Scopes help, but they are coarse and static. Granting bookings:write authorises every booking action for as long as the token stays valid or can be refreshed. It says nothing about the specific room the principal wanted, the price they agreed to, or the fact that they expected a single booking, not forty. The token is still a bearer credential; it just has a list of verbs attached.
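A rough sketch of why a scope check cannot tell one booking from forty. The `AccessToken` class and `is_allowed` function are hypothetical stand-ins for whatever a resource server does internally; only the bookings:write scope comes from the text above.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessToken:
    scopes: frozenset
    expires_at: float  # Unix timestamp


def is_allowed(token: AccessToken, required_scope: str, now: float) -> bool:
    """A typical scope check: the token is unexpired and carries the verb."""
    return now < token.expires_at and required_scope in token.scopes


token = AccessToken(scopes=frozenset({"bookings:write"}),
                    expires_at=time.time() + 3600)
now = time.time()

# The principal expected one room under EUR 200. The agent submits forty requests.
requests = [("room-12", 180.0)] + [(f"suite-{i}", 950.0) for i in range(39)]

# The check never sees the room, the price, or the count, so all forty pass.
approved = [r for r in requests if is_allowed(token, "bookings:write", now)]
```

The check has no inputs through which the principal's actual intent could reach it, so tightening the scope string only narrows the category, never the invocation.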

Why the mental model breaks for agents

The fundamental mismatch is between the granularity of human intent and the granularity of the credentials we use to represent it.

When a human books a hotel room on a website, they consent to a specific action at a specific price at a specific moment. The credit card charge, the confirmation email, the booking reference - all of it is anchored to a deliberate act. The consent and the action are temporally and causally linked.

When an agent acts, the principal's consent happened earlier, upstream, at a different granularity. They said "book me somewhere nice under €200 a night" and walked away. Every specific action the agent takes from that point forward is an interpretation of that intent, not a fresh explicit consent. The key or token the agent carries was created before any of those specific actions were known.

This creates three gaps that keys and OAuth cannot close on their own:

  • The scope gap - the credential authorises a category of action, not the specific action being taken. The provider cannot tell whether this particular invocation was intended or whether the agent went off-script.
  • The spend gap - nothing in the credential tells the provider how much the principal is willing to pay. The provider charges what they charge; the agent's key still works. There is no ceiling the provider can enforce on the principal's behalf.
  • The attribution gap - if something goes wrong, the server logs show an authenticated request from a key or token. They do not show which principal authorised it, what they understood they were authorising, or whether the agent acted within or outside the principal's intent. Debugging and dispute resolution become archaeological exercises.

The prompt injection risk

These gaps matter more when the agent is adversarially targeted. A prompt injection in a retrieved document can instruct the agent to take actions the principal never intended. With a shared API key, the injected action looks identical to a legitimate one in the server logs. With a scoped mandate signed at intent time, an unexpected action either lacks a valid credential or falls outside the signed scope - both are detectable and rejectable.

What mandates are and why they exist

A mandate is a credential that is scoped to a specific action, signed by the principal at the moment of intent, and verifiable by the provider without a callback. It closes all three gaps at the protocol layer, not by convention.

The main difference from a key or token: a mandate carries the principal's commitment to a specific invocation, not a blanket authorisation over a category of actions. Before the agent calls book_room, it signs an IntentMandate that names the exact merchant and action. That mandate is short-lived - valid for seconds, not hours. The provider can verify it without touching any external service. If the agent calls anything outside the scope of the mandate, the call is rejected at the gateway before it reaches the provider.

Closing the scope gap

An IntentMandate binds to a specific merchant and a specific action. A mandate for book_room at Lux Hotels cannot be replayed against cancel_room or used against a different provider. Scope is not a string that can be interpreted broadly; it is a cryptographic commitment to a specific target. The provider does not need to trust the agent's self-reported intent - the intent is embedded in the credential and verified before the call proceeds.
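The binding described above might look something like the following. This is a sketch under stated assumptions, not the actual mandate format: the field names, the 30-second TTL, and the merchant ID "lux-hotels" are invented for illustration. HMAC is used as a stand-in signature purely so the example runs on the standard library; a real deployment would use a public-key signature so that the verifier holds no secret.

```python
import hashlib
import hmac
import json
import time

# ASSUMPTION: HMAC stand-in for the principal's signing key. A real mandate
# would be signed with a private key and verified with the public key.
PRINCIPAL_KEY = b"demo-principal-key"


def _sign(claims: dict) -> str:
    payload = json.dumps(claims, sort_keys=True).encode()
    return hmac.new(PRINCIPAL_KEY, payload, hashlib.sha256).hexdigest()


def sign_intent_mandate(merchant_id, action, ttl_seconds=30, now=None):
    """Bind one merchant, one action, and an expiry into a signed mandate."""
    issued = time.time() if now is None else now
    claims = {"type": "IntentMandate", "merchant": merchant_id,
              "action": action, "exp": issued + ttl_seconds}
    return {"claims": claims, "sig": _sign(claims)}


def verify_intent_mandate(mandate, merchant_id, action, now=None) -> bool:
    """Accept only an untampered, unexpired mandate for this exact target."""
    current = time.time() if now is None else now
    if not hmac.compare_digest(_sign(mandate["claims"]), mandate["sig"]):
        return False  # claims were altered after signing
    claims = mandate["claims"]
    return (claims["merchant"] == merchant_id
            and claims["action"] == action
            and current < claims["exp"])
```

A mandate signed for book_room fails verification when presented for cancel_room, when presented to another merchant, or after its expiry. Rewriting the action inside the claims breaks the signature, which is what distinguishes this from a scope string.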

Closing the spend gap

The mandate flow includes a price lock step. The agent signs an IntentMandate, the provider returns a CartMandate committing to a specific price, and the agent signs a PaymentMandate confirming that exact total before the transaction executes. The principal's spend cap is set once in the wallet config; the gateway enforces it on every call. The model cannot exceed it regardless of what the provider charges or how many retries occur.

This is not a soft guideline enforced by hoping the agent behaves. It is a hard ceiling enforced by the gateway, independent of the model's judgment. The provider knows the charge is authorised. The principal knows the charge cannot exceed what they agreed to.
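The price lock and cap can be sketched as a small state machine. The class shapes below are hypothetical, signatures are omitted to keep the focus on the arithmetic, and the cap is assumed to be cumulative across calls (one reading of "regardless of how many retries occur"); the real wallet config may define it differently.

```python
from dataclasses import dataclass
from decimal import Decimal


class MandateRejected(Exception):
    """Raised by the gateway when a payment falls outside what was authorised."""


@dataclass(frozen=True)
class CartMandate:
    merchant: str
    action: str
    total: Decimal  # the price the provider has committed to


@dataclass(frozen=True)
class PaymentMandate:
    cart: CartMandate
    confirmed_total: Decimal  # the total the agent signed off on


class Gateway:
    def __init__(self, spend_cap: Decimal):
        self.spend_cap = spend_cap  # ASSUMPTION: cumulative cap from wallet config
        self.spent = Decimal("0")

    def settle(self, payment: PaymentMandate) -> Decimal:
        # Price lock: the confirmed total must equal the provider's quote.
        if payment.confirmed_total != payment.cart.total:
            raise MandateRejected("confirmed total differs from locked price")
        # Hard ceiling: checked here, not left to the model's judgment.
        if self.spent + payment.confirmed_total > self.spend_cap:
            raise MandateRejected("spend cap exceeded")
        self.spent += payment.confirmed_total
        return self.spent
```

With a cap of 200 and a locked price of 180, the first settlement succeeds and an identical retry is rejected, because the running total would reach 360. The agent's reasoning never enters the check.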

Closing the attribution gap

Every mandate is signed with the principal's key. Every invocation that succeeds has a verified mandate attached. The audit record shows not just that an authenticated request arrived, but which principal authorised it, to which action, at which price, and when - in a hash-chained log that neither party can alter after the fact.

When something goes wrong - a runaway loop, a confused agent, an injected prompt - the audit trail answers the question that server logs cannot: was this action inside or outside the principal's authorised mandate?
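For readers unfamiliar with hash chaining, a minimal sketch of the idea follows. The record fields are hypothetical and the real log format is not specified in this article; the point is only that each entry commits to the one before it, so a later edit is detectable.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry's predecessor


def _entry_hash(prev_hash: str, record: dict) -> str:
    body = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append_entry(log: list, record: dict) -> dict:
    """Append a record whose hash covers both its content and the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev_hash, "record": record,
             "hash": _entry_hash(prev_hash, record)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash from the start; any altered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if _entry_hash(prev_hash, entry["record"]) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True
```

Changing a price in an early entry changes that entry's recomputed hash, which no longer matches the stored one, so `verify_chain` returns False. For neither party to be able to rewrite the whole chain, both need to hold (or have countersigned) the latest hash; this sketch leaves that part out.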

The properties you get that keys cannot give you

Working through the comparison concretely:

  • Action confinement - the credential is bound to one action at one provider. It cannot be repurposed, replayed, or scope-escalated. A key can be used for any action the account is permitted to take.
  • Time confinement - the mandate expires in seconds. A stolen or intercepted credential is useless after the TTL. A key is typically valid until manually rotated, which in practice means indefinitely.
  • Principal identity - the mandate is signed by the principal's private key. The provider knows who authorised the action, not just which service account called the API. A key identifies a service credential; a mandate identifies a human principal.
  • Spend enforcement - the gateway enforces the principal's cap at call time. An API key cannot carry spend semantics; with a key, any cap exists only in the provider's own rate-limiting logic, if at all.
  • Price lock - the CartMandate step commits the provider to an exact price before the agent confirms. With a key, the provider charges what they charge when the action executes; the principal has no leverage over the final amount.
  • Tamper-evident audit - the hash-chained invocation log means both parties can prove what happened and neither can alter it retroactively. Application logs from key-based calls are mutable and carry no cryptographic proof of authorisation.

What this means in practice

None of this requires rewriting your agent. The wallet sidecar is a signing service the agent calls before each transactional action. It takes the merchant ID and action SKU from the provider's spec, mints a short-lived JWT, and hands it back. The agent includes that JWT in the invocation call. The gateway verifies it before the call reaches the provider.
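As a sketch of that sidecar step, the following mints and verifies a short-lived JWT using only the standard library. The claim names (`merchant_id`, `action_sku`), the 30-second TTL, and the function names are assumptions for illustration, not the sidecar's actual API. HS256 is used so the example is self-contained; it implies a key shared between sidecar and gateway, whereas a mandate signed by the principal's private key would use an asymmetric algorithm.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def mint_mandate_jwt(key, merchant_id, action_sku, ttl_seconds=30, now=None):
    """Sidecar side: sign the merchant, action, and a short expiry."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}  # ASSUMPTION: HS256 for the demo only
    claims = {"merchant_id": merchant_id, "action_sku": action_sku,
              "iat": issued, "exp": issued + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims))
    sig = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(sig)}"


def verify_mandate_jwt(key, token, merchant_id, action_sku, now=None) -> bool:
    """Gateway side: check signature, then target and expiry, before forwarding."""
    current = time.time() if now is None else now
    try:
        header_b64, claims_b64, sig_b64 = token.split(".")
        signing_input = f"{header_b64}.{claims_b64}".encode("ascii")
        presented = _b64url_decode(sig_b64)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, presented):
        return False
    claims = json.loads(_b64url_decode(claims_b64))
    return (claims["merchant_id"] == merchant_id
            and claims["action_sku"] == action_sku
            and current < claims["exp"])
```

The agent's part is then two calls: ask the sidecar for a token naming the merchant and action, and attach it to the invocation. A production verifier would also check the header's `alg` value rather than assuming it, which this sketch skips.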

From the agent's perspective, the flow is not dramatically more complex than calling with a key. The meaningful difference is in what the credential carries: not "I hold a secret that grants access to this API," but "the principal has authorised this specific action at this moment, up to this price, and I can prove it."

For free, read-only actions like search, the flow is even simpler: sign an IntentMandate, attach it to the call. No payment step, no CartMandate. The mandate still enforces action confinement and time-bounding.

The deeper shift

The reason this matters beyond security hygiene is that the trust properties of a credential determine what authority businesses will grant to agents.

A business will give an agent a read key without much hesitation. It might give a write key with some internal review. It will not give an agent the authority to commit purchase orders, book venues, or trigger payments unless it has a way to verify that every such action was explicitly authorised by a human principal with a traceable, time-bounded, spend-capped credential that the agent cannot exceed or repurpose.

API keys do not provide those guarantees. Mandates do. As the actions agents are asked to take move from "fetch data" toward "commit to things that cost money and have legal consequences," the credential needs to carry more than a shared secret. It needs to carry the principal's verifiable intent.

That is not a product distinction or a framework preference. It is a property that emerges from the structure of the credential itself - and it is why keys, however convenient, are the wrong primitive for this class of problem.

Further reading

The full mandate flow - from discovery and intent signing through price lock and settlement - is covered in "How AI agents can book hotels without scraping". For the relationship between MCP's tool-calling layer and the trust layer that sits above it, see "MCP gives agents tools. What gives agents trust?"