Managing Secrets in Agentic AI Systems: Don't Let Your Agent Leak Your Keys

AI agents are doing real work now. They're querying databases, calling APIs, deploying code, and sending emails. Every one of those actions requires credentials. And if you're passing those credentials through the LLM's context window, you have a security hole the size of a barn door.

This isn't a theoretical risk. Prompt injection attacks can trick an agent into dumping its context. Tool call logs capture every argument. Sub-agents inherit secrets they don't need. The moment a credential touches the LLM, it becomes an exfiltration target.

Let's fix that.

Why "Just Pass the API Key" Doesn't Work

The naive approach looks like this: stuff credentials into the system prompt or tool arguments and let the agent use them.

# DO NOT DO THIS
system_prompt = f"""
You are a helpful assistant.
Use this password for database access: {DB_PASSWORD}
Use this key for Stripe: {STRIPE_KEY}
"""

This fails in several ways:

Prompt injection: A malicious document, email, or web page can instruct the agent to repeat its system prompt. If your credentials are in there, they're gone.
Logging exposure: Most agent frameworks log tool calls and their arguments. If credentials are arguments, they end up in plaintext in your observability stack.
Context window persistence: The secret sits in the conversation context for the entire session. Every subsequent tool call, every sub-agent delegation, every response carries the risk of leaking it.
No revocation: If the agent session is compromised, you can't revoke access to one session without rotating the credential everywhere.

The Fix: Opaque Handles + Trusted Runtime

The core idea is secret indirection. The LLM never sees real credentials. It only sees symbolic names, like "DB_PROD" or "STRIPE_LIVE", that mean nothing without the resolution layer.

Here's the architecture:

┌─────────────────────────┐     ┌──────────────────────────────┐
│   UNTRUSTED (LLM)       │     │   TRUSTED (your runtime)     │
│                         │     │                              │
│  "use DB_PROD to run    │────▶│  resolve DB_PROD → vault     │
│   SELECT count(*)..."   │     │  check policy                │
│                         │◀────│  execute with real creds     │
│  receives: {rows: [...]}│     │  return sanitized result     │
└─────────────────────────┘     └──────────────────────────────┘

The trust boundary is between the LLM and the tool runtime. The LLM is treated as an untrusted caller that can reference secrets by name but never access them directly.

Step 1: Define Tools with Handles, Not Credentials

When you register tools for your agent, the schema exposes handles:

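tools = [
    {
        "name": "query_db",
        "description": "Run a SQL query using a named credential handle.",
        "parameters": {
            "type": "object",
            "properties": {
                # The model chooses a label; it never sees a connection string
                "credential_handle": {
                    "type": "string",
                    "enum": ["DB_PROD", "DB_STAGING"]
                },
                "query": {
                    "type": "string"
                }
            },
            "required": ["credential_handle", "query"]
        }
    }
]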

The LLM picks from a list of labels. It has no idea what the actual connection string or password is.
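
One caveat: the enum only describes what the model is asked to produce. The runtime should not assume the model obeyed it. Below is a minimal sketch of a server-side check, assuming a hypothetical ALLOWED_HANDLES set and validate_handle helper (both are illustrative names, not part of any particular agent framework):

```python
# Hypothetical allowlist; it mirrors the enum in the tool schema.
ALLOWED_HANDLES = {"DB_PROD", "DB_STAGING"}


def validate_handle(args: dict) -> str:
    """Return the handle if it is a known label, otherwise raise ValueError."""
    handle = args.get("credential_handle")
    # The isinstance check runs first, so unhashable values never reach the set lookup.
    if not isinstance(handle, str) or handle not in ALLOWED_HANDLES:
        # repr() keeps unexpected model output inert in the error message.
        raise ValueError(f"Unknown credential handle: {handle!r}")
    return handle
```

Running this before any policy or vault lookup means a hallucinated or injected handle is rejected early, and the lookup layers only ever see known labels.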

Step 2: Resolve in the Trusted Runtime

Your tool execution layer sits between the LLM and the real world. It's the only component that touches actual credentials:

def execute_query_db(args: dict, agent_context: AgentContext):
    handle = args["credential_handle"]

    # Enforce per-agent policy first, so an unauthorized request never touches the vault
    if not policy.is_allowed(agent_context.agent_id, handle):
        raise PermissionError(f"Agent not authorized for {handle}")

    # Resolve handle → real credential from vault
    creds = vault.get_secret(handle)

    # Execute with real credentials
    conn = connect(host=creds["host"], password=creds["password"])
    result = conn.execute(args["query"])

    # Return sanitized result: no connection metadata
    return {"rows": result.fetchall()}

The LLM sends {"credential_handle": "DB_PROD", "query": "SELECT count(*) FROM users"}. The runtime resolves DB_PROD to real credentials, executes, and returns only the query results. The password never appears in any LLM-visible context.
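
Error paths need the same care as the happy path, because driver exceptions often embed hostnames, usernames, or whole connection strings. Here is one way to sketch that, assuming a hypothetical run_sanitized wrapper around whatever tool function the runtime calls (the wrapper and its return shape are illustrative, not a framework API):

```python
import logging
import uuid

logger = logging.getLogger("tool_runtime")


def run_sanitized(tool_fn, *args, **kwargs) -> dict:
    """Run a tool and return only LLM-safe output, even on failure."""
    try:
        return {"ok": True, "result": tool_fn(*args, **kwargs)}
    except PermissionError as exc:
        # Policy denials in this sketch mention only the handle name, so they are safe to show.
        return {"ok": False, "error": str(exc)}
    except Exception as exc:
        # Keep the message server-side: it may contain a DSN or host.
        # Only the exception type is logged here, alongside a correlation id.
        error_id = uuid.uuid4().hex[:8]
        logger.error("tool failure %s: %s", error_id, type(exc).__name__)
        return {"ok": False, "error": f"Tool failed (ref {error_id})"}
```

The model gets a reference id it can report back, and an operator can match that id against the runtime's own logs.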

Step 3: Scope and Short-Live Everything

Don't give agents long-lived root credentials. Apply the principle of least privilege:

Mint per-task credentials. When an agent session starts, issue a short-lived token scoped to exactly what that task needs. Revoke it when the task completes.
Use scoped tokens. Instead of a full AWS root key, issue an STS token with a 15-minute TTL scoped to a single S3 bucket.
Separate read and write. If an agent only needs to query data, the credential should not permit writes.

# Issue a scoped, short-lived credential for each agent task
task_creds = vault.issue_temporary(
    base_secret="DB_PROD",
    scope="read_only",
    ttl_minutes=15,
    allowed_tables=["users", "orders"]
)
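
To make the expiry and scope semantics concrete, here is a small in-memory sketch. TaskCredential and issue_task_credential are hypothetical names invented for illustration; a real vault would mint the credential and enforce these checks on its side rather than in application code.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TaskCredential:
    token: str
    handle: str
    scope: str
    allowed_tables: frozenset
    expires_at: float

    def permits(self, table: str, write: bool, now: Optional[float] = None) -> bool:
        """True only if the credential is unexpired and covers this access."""
        now = time.time() if now is None else now
        if now >= self.expires_at:
            return False  # TTL elapsed
        if write and self.scope == "read_only":
            return False  # read-only scope never permits writes
        return table in self.allowed_tables


def issue_task_credential(handle: str, scope: str, ttl_minutes: int,
                          allowed_tables: Iterable[str],
                          now: Optional[float] = None) -> TaskCredential:
    now = time.time() if now is None else now
    return TaskCredential(
        token=secrets.token_urlsafe(32),  # random per-task token
        handle=handle,
        scope=scope,
        allowed_tables=frozenset(allowed_tables),
        expires_at=now + ttl_minutes * 60,
    )
```

The now parameter exists only so the expiry logic can be exercised deterministically; production code would rely on the clock.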

Multi-Agent Delegation: The Credential Propagation Problem

Things get harder when agents spawn sub-agents. If Agent A can access DB_PROD and it delegates a subtask to Agent B, should Agent B inherit that access?

No. Never propagate credentials across agent boundaries.

Use a broker pattern: a trusted orchestrator holds credentials and executes privileged actions on behalf of sub-agents. Sub-agents request actions through the broker, which applies its own policy checks.

Agent A (orchestrator)
  │
  ├── Sub-Agent B: "summarize user data"
  │     └── Calls broker: query_db(DB_PROD, "SELECT ...")
  │           └── Broker checks: is B allowed? → yes (read-only)
  │
  └── Sub-Agent C: "send report email"
        └── Calls broker: send_email(SMTP_HANDLE, ...)
              └── Broker checks: is C allowed? → yes (send only)

Each sub-agent authenticates independently with the broker. No credential is ever passed from parent to child.
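
A minimal sketch of the broker idea, assuming a hypothetical Broker class with a per-agent grant table. How a sub-agent proves its identity is out of scope here: the sketch takes agent_id as already authenticated, and the tool functions are stubs.

```python
class Broker:
    """Holds the tools (and, behind them, the credentials); sub-agents only call in."""

    def __init__(self, grants: dict, tools: dict):
        self._grants = grants  # agent_id -> set of (tool_name, handle) pairs
        self._tools = tools    # tool_name -> callable(handle, **kwargs)

    def call(self, agent_id: str, tool: str, handle: str, **kwargs):
        # Each request is checked against the caller's own grants,
        # not those of the agent that spawned it.
        if (tool, handle) not in self._grants.get(agent_id, set()):
            raise PermissionError(f"{agent_id} may not call {tool} with {handle}")
        return self._tools[tool](handle, **kwargs)


# Stub tools standing in for real privileged operations.
broker = Broker(
    grants={
        "sub-agent-b": {("query_db", "DB_PROD")},
        "sub-agent-c": {("send_email", "SMTP_HANDLE")},
    },
    tools={
        "query_db": lambda handle, query: {"rows": [[1]]},
        "send_email": lambda handle, to: {"sent": True},
    },
)
```

With this setup, sub-agent B can run its query but gets a PermissionError if it tries send_email, even though its parent could do both.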

Audit Everything, Redact the Secrets

Every tool invocation should be logged for audit:

{
    "timestamp": "2026-02-12T10:30:00Z",
    "agent_id": "agent-abc123",
    "tool": "query_db",
    "credential_handle": "DB_PROD",
    "query": "SELECT count(*) FROM users",
    "result_rows": 1,
    "status": "success"
}

Notice what's logged: the handle name, the query, the result count, but never the actual credential. Your audit trail tells you exactly what happened and who did it, without becoming a credential dump.
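
As a belt-and-braces measure, the runtime can also scrub known secret values from string fields before a record is written, in case one slips into a query or argument. A sketch under that assumption, with audit_record and its secret_values parameter as hypothetical names:

```python
import json
from datetime import datetime, timezone


def _scrub(value, secret_values):
    """Replace any known secret substring in a string field."""
    if isinstance(value, str):
        for secret in secret_values:
            if secret:  # skip empty strings, which would match everywhere
                value = value.replace(secret, "[REDACTED]")
    return value


def audit_record(agent_id, tool, args, result_rows, status, secret_values=()):
    """Build one JSON audit line containing the handle, never the credential."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "agent_id": agent_id,
        "tool": tool,
        "credential_handle": args.get("credential_handle"),
        "query": args.get("query"),
        "result_rows": result_rows,
        "status": status,
    }
    # Scrub before serializing so JSON escaping cannot hide a match.
    return json.dumps({k: _scrub(v, secret_values) for k, v in record.items()})
```

This only catches exact matches of values the runtime already knows about, so it complements the handle design rather than replacing it.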

When Agents Need to Share Secrets with Humans

Sometimes the output of an agentic workflow is a credential itself. Maybe your agent provisioned a new database and needs to hand the connection string to a developer. Or it generated an API key for a new service.

This is where ephemeral, one-time secret sharing comes in. Instead of having the agent print the credential in chat or write it to a log, it can create a self-destructing link:

def deliver_credential_to_human(secret: str, recipient: str):
    # Create a one-time link that self-destructs after viewing
    link = snappwd.create(secret, views=1, expires="1h")

    # Send only the link, not the secret
    notify(recipient, f"Your credentials are ready: {link}")

The credential is encrypted client-side, stored ephemerally, and destroyed after the first access. No plaintext in Slack. No passwords in email threads. The agent never needs to expose the raw secret in any observable channel.
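
To illustrate just the view-once and expiry semantics, here is an in-memory sketch. OneTimeStore is a hypothetical class written for this example; it is not how snappwd or any real service is implemented, and it omits encryption entirely.

```python
import secrets
import time
from typing import Optional


class OneTimeStore:
    def __init__(self):
        self._items = {}  # token -> (secret, expires_at)

    def create(self, secret: str, ttl_seconds: int = 3600,
               now: Optional[float] = None) -> str:
        """Store the secret and return an unguessable token to embed in a link."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(24)
        self._items[token] = (secret, now + ttl_seconds)
        return token

    def redeem(self, token: str, now: Optional[float] = None) -> Optional[str]:
        """Return the secret at most once; None if unknown, used, or expired."""
        now = time.time() if now is None else now
        item = self._items.pop(token, None)  # pop makes the read destructive
        if item is None:
            return None
        secret, expires_at = item
        return secret if now < expires_at else None
```

A second redeem call with the same token returns None, which is the property that keeps a leaked link from being replayed after the recipient has used it.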

The Checklist

If you're building an agentic system that uses credentials, run through this:

[ ] No secrets in prompts. LLMs never see real credentials, only opaque handles.
[ ] Vault-backed resolution. A trusted runtime resolves handles to real credentials at execution time.
[ ] Per-agent policy enforcement. Each agent is authorized for specific credential handles, not all of them.
[ ] Short-lived, scoped tokens. Credentials expire and are limited to the minimum required permissions.
[ ] No credential propagation. Sub-agents authenticate independently through a broker.
[ ] Audit logging with redaction. Log every tool call and credential handle usage, but never log the credential itself.
[ ] Ephemeral delivery for human handoff. Use one-time links when agents need to pass secrets to people.

The Bottom Line

The LLM is an untrusted caller. Treat it like a frontend browser: it can request actions, but it never holds the keys to the database. Your tool runtime is the backend that resolves credentials, enforces policy, and executes privileged operations.

Get this boundary right and your agentic system can do powerful things without becoming your biggest security liability.