Managing Secrets in Agentic AI Systems
AI agents need credentials to act on your behalf, but putting secrets in prompts is a disaster waiting to happen. Here's the architecture for doing it right.
AI agents are doing real work now. They're querying databases, calling APIs, deploying code, and sending emails. Every one of those actions requires credentials. And if you're passing those credentials through the LLM's context window, you have a security hole the size of a barn door.
This isn't a theoretical risk. Prompt injection attacks can trick an agent into dumping its context. Tool call logs capture every argument. Sub-agents inherit secrets they don't need. The moment a credential touches the LLM, it becomes an exfiltration target.
Let's fix that.
Why "Just Pass the API Key" Doesn't Work
The naive approach looks like this: stuff credentials into the system prompt or tool arguments and let the agent use them.
# DO NOT DO THIS
system_prompt = f"""
You are a helpful assistant.
Use this password for database access: {DB_PASSWORD}
Use this key for Stripe: {STRIPE_KEY}
"""
This fails in several ways:
- Prompt injection: A malicious document, email, or web page can instruct the agent to repeat its system prompt. If your credentials are in there, they're gone.
- Logging exposure: Agent frameworks and tracing tools commonly log tool calls and their arguments. If credentials are arguments, they end up in plaintext in your observability stack.
- Context window persistence: The secret sits in the conversation context for the entire session. Every subsequent tool call, every sub-agent delegation, every response carries the risk of leaking it.
- No revocation: If the agent session is compromised, you can't revoke access to one session without rotating the credential everywhere.
The Fix: Opaque Handles + Trusted Runtime
The core idea is secret indirection. The LLM never sees real credentials. It only sees symbolic names, like "DB_PROD" or "STRIPE_LIVE", that mean nothing without the resolution layer.
Here's the architecture:
┌─────────────────────────┐     ┌──────────────────────────────┐
│ UNTRUSTED (LLM)         │     │ TRUSTED (your runtime)       │
│                         │     │                              │
│ "use DB_PROD to run     │────▶│ resolve DB_PROD → vault      │
│ SELECT count(*)..."     │     │ check policy                 │
│                         │◀────│ execute with real creds      │
│ receives: {rows: [...]} │     │ return sanitized result      │
└─────────────────────────┘     └──────────────────────────────┘
The trust boundary is between the LLM and the tool runtime. The LLM is treated as an untrusted caller that can reference secrets by name but never access them directly.
Step 1: Define Tools with Handles, Not Credentials
When you register tools for your agent, the schema exposes handles:
tools = [
    {
        "name": "query_db",
        "parameters": {
            "credential_handle": {
                "type": "string",
                "enum": ["DB_PROD", "DB_STAGING"]
            },
            "query": {
                "type": "string"
            }
        }
    }
]
The LLM picks from a list of labels. It has no idea what the actual connection string or password is.
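An enum in the schema is a hint to the model, not a guarantee: depending on the framework, a model may still emit a value outside the list. A minimal sketch of re-validating the handle in the runtime before dispatch, assuming a hypothetical `ALLOWED_HANDLES` table and `validate_tool_call` helper that mirror the schema above:

```python
# Hypothetical allow-list mirroring the "enum" in the tool schema.
ALLOWED_HANDLES = {
    "query_db": {"DB_PROD", "DB_STAGING"},
}


def validate_tool_call(tool_name: str, args: dict) -> str:
    """Return the requested handle if it is a known label for this tool."""
    allowed = ALLOWED_HANDLES.get(tool_name)
    if allowed is None:
        raise ValueError(f"Unknown tool: {tool_name}")
    handle = args.get("credential_handle")
    if handle not in allowed:
        # Reject anything the schema did not offer, including missing handles.
        raise ValueError(f"Unknown credential handle for {tool_name}: {handle!r}")
    return handle
```

This only confirms the label is one the runtime knows about. Whether a given agent may use it is a separate policy decision.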
Step 2: Resolve in the Trusted Runtime
Your tool execution layer sits between the LLM and the real world. It's the only component that touches actual credentials:
# `vault`, `policy`, `connect`, and `AgentContext` stand in for your own components.
def execute_query_db(args: dict, agent_context: AgentContext):
    handle = args["credential_handle"]

    # Enforce per-agent policy before touching the vault
    if not policy.is_allowed(agent_context.agent_id, handle):
        raise PermissionError(f"Agent not authorized for {handle}")

    # Resolve handle → real credential from vault
    creds = vault.get_secret(handle)

    # Execute with real credentials
    conn = connect(host=creds["host"], password=creds["password"])
    result = conn.execute(args["query"])

    # Return sanitized result with no connection metadata
    return {"rows": result.fetchall()}
The LLM sends {"credential_handle": "DB_PROD", "query": "SELECT count(*) FROM users"}. The runtime resolves DB_PROD to real credentials, executes, and returns only the query results. The password never appears in any LLM-visible context.
The handle itself is not the security boundary. It is just a label. The boundary is the policy check in the runtime:
| Question | Example policy |
|---|---|
| Which agent is calling? | agent_id must be assigned to the project |
| Which handle is requested? | DB_STAGING_READONLY allowed, DB_PROD_ADMIN denied |
| What action is requested? | SELECT allowed, DROP denied |
| What data may return? | Result rows allowed, connection metadata redacted |
| Does this need approval? | Production writes require human confirmation |
Without that policy layer, handles become security theater. The LLM cannot see the password, but it can still ask the runtime to misuse it.
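The questions in the table can be sketched as a single check that runs before any handle is resolved. This is a minimal illustration, not a production policy engine: the `POLICIES` table, the `HandlePolicy` record, and the agent and handle names are all hypothetical, and classifying a query by its first keyword is deliberately naive (a multi-statement query would slip past it). A real deployment would pair this with a proper SQL parser and database-level grants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandlePolicy:
    """Hypothetical policy record for one (agent, handle) pair."""
    allowed_actions: frozenset = frozenset()
    needs_approval: frozenset = frozenset()


# Illustrative policy table; agent IDs and handle names are made up.
POLICIES = {
    ("agent-abc123", "DB_STAGING_READONLY"): HandlePolicy(
        allowed_actions=frozenset({"SELECT"}),
    ),
    ("agent-abc123", "DB_PROD"): HandlePolicy(
        allowed_actions=frozenset({"SELECT", "INSERT", "UPDATE"}),
        needs_approval=frozenset({"INSERT", "UPDATE"}),
    ),
}


def sql_action(query: str) -> str:
    """Naive classification: the first keyword of the statement, upper-cased."""
    words = query.strip().split(None, 1)
    return words[0].upper() if words else ""


def check_policy(agent_id: str, handle: str, query: str, approved: bool = False) -> str:
    """Return the action if it is allowed, otherwise raise PermissionError."""
    policy = POLICIES.get((agent_id, handle))
    if policy is None:
        raise PermissionError(f"{agent_id} is not authorized for {handle}")
    action = sql_action(query)
    if action not in policy.allowed_actions:
        raise PermissionError(f"{action or 'empty query'} is not allowed on {handle}")
    if action in policy.needs_approval and not approved:
        raise PermissionError(f"{action} on {handle} requires human approval")
    return action
```

The default is deny: an agent with no entry for a handle gets nothing, and the `approved` flag is expected to come from the trusted runtime, never from the model's own output.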
Step 3: Scope and Short-Live Everything
Don't give agents long-lived root credentials. Apply the principle of least privilege:
- Mint per-task credentials. When an agent session starts, issue a short-lived token scoped to exactly what that task needs. Revoke it when the task completes.
- Use scoped tokens. Instead of a long-lived AWS access key, issue temporary STS credentials with a 15-minute TTL, scoped to a single S3 bucket.
- Separate read and write. If an agent only needs to query data, the credential should not permit writes.
For human-operated keys outside the agent runtime, pair this with standard API key security practices for storage, rotation, and exposure prevention.
# Issue a scoped, short-lived credential for each agent task
task_creds = vault.issue_temporary(
    base_secret="DB_PROD",
    scope="read_only",
    ttl_minutes=15,
    allowed_tables=["users", "orders"]
)
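The `vault.issue_temporary` call above is illustrative rather than a specific product's API. To make the idea concrete, here is a rough standard-library sketch of what a TTL-bound, scoped credential could look like on the runtime side. `TemporaryCredential`, `issue_temporary`, and the optional `now` parameter (used to make expiry testable) are assumptions for this example.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporaryCredential:
    token: str
    base_handle: str
    scope: str
    allowed_tables: tuple
    expires_at: float  # seconds since the epoch

    def is_valid(self, now=None) -> bool:
        """True until the TTL has elapsed."""
        current = time.time() if now is None else now
        return current < self.expires_at

    def permits(self, table: str, write: bool = False) -> bool:
        """Check table and read/write scope; expiry is checked separately."""
        if write and self.scope == "read_only":
            return False
        return table in self.allowed_tables


def issue_temporary(base_handle, scope, ttl_minutes, allowed_tables, now=None):
    """Mint a random token bound to a handle, a scope, and an expiry time."""
    issued_at = time.time() if now is None else now
    return TemporaryCredential(
        token=secrets.token_urlsafe(32),
        base_handle=base_handle,
        scope=scope,
        allowed_tables=tuple(allowed_tables),
        expires_at=issued_at + ttl_minutes * 60,
    )
```

The runtime would check both `is_valid()` and `permits()` on every call, so an expired or out-of-scope token fails closed even if the task never explicitly revokes it.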
Multi-Agent Delegation: The Credential Propagation Problem
Things get harder when agents spawn sub-agents. If Agent A can access DB_PROD and it delegates a subtask to Agent B, should Agent B inherit that access?
No. Never propagate credentials across agent boundaries.
Use a broker pattern: a trusted orchestrator holds credentials and executes privileged actions on behalf of sub-agents. Sub-agents request actions through the broker, which applies its own policy checks.
Agent A (orchestrator)
│
├── Sub-Agent B: "summarize user data"
│   └── Calls broker: query_db(DB_PROD, "SELECT ...")
│       └── Broker checks: is B allowed? → yes (read-only)
│
└── Sub-Agent C: "send report email"
    └── Calls broker: send_email(SMTP_HANDLE, ...)
        └── Broker checks: is C allowed? → yes (send only)
Each sub-agent authenticates independently with the broker. No credential is ever passed from parent to child.
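A minimal sketch of the broker, under two assumptions: the `agent_id` passed to `call` has already been authenticated by the runtime (it is not a value the model supplies), and the stub tools stand in for real vault-backed implementations. The `Broker` class and the agent names are hypothetical.

```python
class Broker:
    """Holds per-agent grants and executes tools on behalf of sub-agents."""

    def __init__(self, grants, tools):
        # grants: {agent_id: {handle: set of tool names}}
        # tools: {tool_name: callable(handle, **kwargs)}
        self._grants = grants
        self._tools = tools

    def call(self, agent_id, tool_name, handle, **kwargs):
        allowed = self._grants.get(agent_id, {}).get(handle, set())
        if tool_name not in allowed:
            raise PermissionError(f"{agent_id} may not call {tool_name} with {handle}")
        # The tool resolves the handle inside the trusted runtime;
        # the sub-agent only ever receives the tool's result.
        return self._tools[tool_name](handle, **kwargs)


# Stub tools standing in for real vault-backed implementations.
def query_db(handle, query):
    return {"rows": [[42]]}


def send_email(handle, to, body):
    return {"sent": True, "to": to}


broker = Broker(
    grants={
        "sub-agent-b": {"DB_PROD": {"query_db"}},
        "sub-agent-c": {"SMTP_HANDLE": {"send_email"}},
    },
    tools={"query_db": query_db, "send_email": send_email},
)
```

With this setup, `broker.call("sub-agent-b", "query_db", "DB_PROD", query="SELECT ...")` succeeds, while the same sub-agent asking for `send_email` with `SMTP_HANDLE` is refused. Grants belong to each sub-agent individually, so nothing is inherited from the parent.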
Audit Everything, Redact the Secrets
Every tool invocation should be logged for audit:
{
    "timestamp": "2026-02-12T10:30:00Z",
    "agent_id": "agent-abc123",
    "tool": "query_db",
    "credential_handle": "DB_PROD",
    "query": "SELECT count(*) FROM users",
    "result_rows": 1,
    "status": "success"
}
Notice what's logged: the handle name, the query, the result count, but never the actual credential. Your audit trail tells you exactly what happened and who did it, without becoming a credential dump.
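One way to keep credentials out of the audit trail even when a tool is misused is to redact by argument name before the record is written. The sketch below assumes a hypothetical `audit_record` helper and a `SENSITIVE_KEYS` list you would extend for your own tools. Unlike the flat record above, it nests tool arguments under `args` so they cannot collide with the record's own fields.

```python
import json
from datetime import datetime, timezone

# Argument names treated as sensitive in this sketch; extend for your own tools.
SENSITIVE_KEYS = {"password", "token", "api_key", "secret", "authorization"}


def audit_record(agent_id, tool, args, result_rows, status):
    """Build a JSON audit line, replacing sensitive argument values."""
    safe_args = {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in args.items()
    }
    record = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "agent_id": agent_id,
        "tool": tool,
        "args": safe_args,
        "result_rows": result_rows,
        "status": status,
    }
    return json.dumps(record)
```

Name-based redaction only catches secrets that arrive under a known key. A secret pasted into a free-text argument such as `query` would still be logged, which is one more reason to keep credentials out of tool arguments entirely.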
When Agents Need to Share Secrets with Humans
Sometimes the output of an agentic workflow is a credential itself. Maybe your agent provisioned a new database and needs to hand the connection string to a developer. Or it generated an API key for a new service.
This is where ephemeral, one-time secret sharing comes in. Instead of having the agent print the credential in chat or write it to a log, it can create a self-destructing link through a local client or trusted delivery service:
import subprocess


def deliver_credential_to_human(secret: str, recipient: str):
    # Local CLI encrypts before upload and returns a one-time link.
    result = subprocess.run(
        ["snappwd", "put"],
        input=secret,
        text=True,
        capture_output=True,
        check=True,
    )
    link = result.stdout.strip()
    # notify() is a placeholder for your own messaging helper.
    notify(recipient, f"Your credentials are ready: {link}")
The credential is encrypted before upload, stored ephemerally, and destroyed after the first access. No plaintext in Slack. No passwords in email threads. The agent response contains only a delivery link.
For especially sensitive outputs, add a policy gate before delivery:
- A human must approve the link creation.
- The link expires unread after 15 to 60 minutes.
- A passphrase is sent through a different channel.
- The credential is rotated after the recipient stores it.
The Checklist
If you're building an agentic system that uses credentials, run through this:
- No secrets in prompts. LLMs never see real credentials, only opaque handles.
- Vault-backed resolution. A trusted runtime resolves handles to real credentials at execution time.
- Per-agent policy enforcement. Each agent is authorized for specific credential handles, not all of them.
- Short-lived, scoped tokens. Credentials expire and are limited to the minimum required permissions.
- No credential propagation. Sub-agents authenticate independently through a broker.
- Audit logging with redaction. Log every tool call and credential handle usage, but never log the credential itself.
- Ephemeral delivery for human handoff. Use one-time links when agents need to pass secrets to people.
- Human approval for high-impact actions. Production writes, key creation, and credential export should not be fully autonomous by default.
- Output filtering. Scan tool results for tokens, private keys, connection strings, and authorization headers before they enter the LLM context.
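The output filtering item can be sketched with a handful of regular expressions applied to tool results before they are returned to the model. The patterns below are illustrative assumptions, not a complete detector: they cover PEM private key blocks, bearer tokens, URLs with embedded passwords, and AWS-style access key IDs, and will miss many other formats. Treat a filter like this as a backstop, not a substitute for keeping secrets out of tool results in the first place.

```python
import re

# Illustrative patterns only; real scanners use much larger rule sets.
SECRET_PATTERNS = [
    # PEM-encoded private key blocks
    re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
    # Bearer tokens, as found in Authorization headers
    re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE),
    # scheme://user:password@ prefix of a connection string
    re.compile(r"\b[a-z][a-z0-9+.-]*://[^\s:/@]+:[^\s@/]+@", re.IGNORECASE),
    # AWS-style access key IDs
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
]


def redact_secrets(text: str) -> str:
    """Replace anything matching a known secret pattern with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

For a connection string, only the `scheme://user:password@` prefix is replaced, so the host and database name remain visible to the model while the password does not.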
The Bottom Line
The LLM is an untrusted caller. Treat it like a frontend browser: it can request actions, but it never holds the keys to the database. Your tool runtime is the backend that resolves credentials, enforces policy, and executes privileged operations.
Get this boundary right and your agentic system can do powerful things without becoming your biggest security liability.