Securing AI Agent Interactions: Why Your Healthcare AI Needs Token Delegation, Not Just Authentication
AI Security Series #33
When a human user authenticates to your electronic health record system, they prove who they are, and the system grants access based on their role and permissions. The security model is well-understood: identity verification, role-based access control, audit logging. But when an AI agent authenticates on behalf of that user to access patient data, extract information, and make recommendations—how does the downstream system know the agent is authorized to act for that user? How do you prevent credential replay attacks where a compromised agent token gets used by an attacker? How do you ensure the agent can only access exactly what it needs for its specific task, not everything the user could access?
Traditional authentication models break down with agentic AI because they were designed for direct human-to-system interactions, not for chains of delegation where AI agents operate on behalf of users through multiple system hops. IBM's Grant Miller recently published guidance on securing agentic AI flows that healthcare organizations should treat as required reading. The technical patterns he describes—token delegation, actor-plus-subject credentials, token exchange at each hop, and last-mile secret vaulting—aren't just best practices. They're the minimum security architecture needed to deploy agentic AI systems in healthcare environments that handle protected health information and must maintain proper audit trails.
The Four Core Risks in Agentic AI Flows
Agentic AI systems introduce security challenges that don't exist in traditional application architectures. Understanding these risks requires recognizing that AI agents aren't just another API client—they're autonomous actors making decisions, accessing resources, and taking actions on behalf of users in ways that may not be fully deterministic or predictable.
Credential Replay Attacks
In a credential replay attack, an attacker intercepts or extracts a legitimate authentication token and uses it to impersonate the original user or agent. This risk is amplified in agentic systems because tokens often flow through multiple components—from the user's session to the orchestration layer, through the AI model's processing, potentially into the model's context window, and finally to downstream services.
The most insidious variant occurs when tokens leak through LLM prompts. An agent might receive a user token, process it alongside other context, and inadvertently include it in a prompt sent to a language model. If that prompt gets logged, cached, or used for model training, the token becomes exposed. An attacker who gains access to prompt logs or model training data can extract those tokens and replay them against protected systems.
Healthcare environments make this particularly dangerous because a single compromised token might grant access to thousands of patient records. A physician's authentication token, replayed by an attacker, could pull entire patient databases if the downstream systems don't distinguish between the legitimate physician accessing records through approved applications and an attacker replaying that physician's credentials through unauthorized channels.
Rogue Agents
A rogue agent attack involves an unauthorized agent attempting to spoof a legitimate agent's identity to gain system access. This could be a malicious agent injected into the system by an attacker, or a legitimate agent that's been compromised and repurposed for unauthorized activities.
The challenge is that many current architectures treat agents as trusted components once they're deployed. If Agent A is authorized to access patient scheduling systems and Agent B is authorized to access clinical documentation, what prevents a compromised Agent A from claiming to be Agent B and gaining access to clinical notes? Without robust agent authentication and verification at each interaction point, the system has no way to distinguish between legitimate agent behavior and spoofed identity claims.
Healthcare organizations running multiple AI agents for different workflows—one for prior authorization, another for clinical documentation, another for patient communication—need mechanisms to ensure each agent can only access systems appropriate to its designated function. Agent identity must be verifiable, persistent across the agent's lifecycle, and validated at every system boundary.
Impersonation Through Unauthorized Delegation
Impersonation occurs when an agent attempts to act on behalf of a user without verifiable proof that the user actually delegated that authority. This is subtly different from credential replay—the agent might have its own legitimate credentials, but it's claiming to act for a user who never authorized that action.
Consider a scenario where an AI agent processes a request from a healthcare administrator to pull quality metrics across all providers. The agent has legitimate credentials to access the analytics system. But when it starts pulling individual patient records to compute those metrics, is it actually authorized to act on behalf of the administrator for that purpose? Did the administrator explicitly delegate patient data access, or is the agent assuming that authority based on its interpretation of the request?
Without formal delegation mechanisms, systems have no way to verify that an agent's claim to be acting on a user's behalf is legitimate. This creates both security risks—unauthorized data access—and compliance risks. HIPAA requires tracking who accessed patient data and for what purpose. If an agent accesses records claiming to act for a user who never authorized that access, the audit trail is fundamentally broken.
Over-Permissioning
Over-permissioning means granting an agent more access than necessary for its specific task. This violates the principle of least privilege and creates unnecessary blast radius if the agent is compromised or behaves unexpectedly.
A common pattern in current agentic implementations is to grant the agent the full permissions of the user on whose behalf it's acting. If a physician user has access to all patient records in their department, the AI agent operating for that physician gets the same access. But the agent's specific task—perhaps summarizing today's patient encounters—only requires access to a small subset of those records.
Healthcare organizations already struggle with over-permissioning in traditional access control systems. Agentic AI amplifies the problem because agents often need to traverse multiple systems, and it's tempting to grant broad permissions rather than carefully scoping access for each task. But that approach means a compromised agent or an agent that behaves unexpectedly due to a prompt injection attack can access far more data than necessary for its intended function.
Identity and Authentication: Establishing Who's Who
Securing agentic flows starts with robust identity for both users and agents. Every actor in the system—human users, AI agents, and downstream services—needs a verifiable identity that can be authenticated at each interaction point.
For users, this typically means integration with an enterprise identity provider using protocols like OAuth 2.0 or SAML. Healthcare organizations likely already have these systems in place for clinician and administrative access. The key is ensuring that user identity flows through the entire agentic chain, not just the initial authentication to the user-facing application.
Agent identity is where many current implementations fall short. Agents need their own identity credentials separate from user credentials. An AI agent should authenticate to systems using its own identity, not by borrowing or replaying user tokens. This means provisioning unique identities for each agent, managing those identities throughout the agent's lifecycle, and revoking them when agents are decommissioned or compromised.
In healthcare contexts, agent identity becomes part of the audit trail. When reviewing who accessed a patient record, you need to know not just that "Dr. Smith accessed this record" but that "Dr. Smith accessed this record through the Clinical Summary Agent, which was operating under specific delegation rules for patient encounter summaries." Both the user identity and the agent identity matter for compliance and security monitoring.
The technical implementation typically involves issuing each agent a client credential—a client ID and secret or a certificate-based credential—that it uses to authenticate to an authorization server. That authentication proves the agent's identity independently of any user. Then, through delegation mechanisms, the agent can demonstrate that it's authorized to act on behalf of a specific user for a specific purpose.
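The agent's own authentication step maps directly onto the OAuth 2.0 client-credentials grant. A minimal sketch of the request an agent would POST (over TLS) to the authorization server's token endpoint — the client ID, secret, and scope values here are hypothetical placeholders, and in production the secret would come from a vault, never from code:

```python
import urllib.parse

def build_client_credentials_request(client_id: str, client_secret: str, scope: str) -> bytes:
    """Build the form-encoded body for an OAuth 2.0 client-credentials grant.

    POSTing this body to the authorization server's token endpoint returns
    an access token representing the agent's own identity, independent of
    any user session.
    """
    return urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }).encode()

body = build_client_credentials_request(
    "clinical-summary-agent",   # hypothetical agent client ID
    "s3cr3t-from-vault",        # never hard-coded in real deployments
    "read:encounters",
)
```

The returned token proves only the agent's identity; delegation to act for a specific user is layered on separately, as described next.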
Delegation: Proving Authority to Act on Behalf of Users
Delegation is the mechanism by which a user authorizes an agent to act on their behalf. It's not enough for an agent to have credentials—it needs proof that the user granted it authority to use those credentials in a specific context.
OAuth 2.0 provides delegation patterns through mechanisms like the authorization code flow or the token exchange specification (RFC 8693). When properly implemented, these patterns produce tokens that represent both the subject (the user) and the actor (the agent). A downstream service receiving such a token can verify both that the user authorized the access and that the specific agent is legitimately acting on the user's behalf.
In practical terms, this means tokens should contain claims identifying both the user and the agent. A token might indicate "subject: dr.smith@hospital.org, actor: clinical-summary-agent, scope: read:patient-encounters." That token proves Dr. Smith authorized the clinical summary agent to read patient encounters, and it proves the clinical summary agent is the entity presenting the token. If a different agent tries to use that token, the actor claim won't match, and the access should be rejected.
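RFC 8693 expresses the acting party through the `act` claim. A downstream service can enforce the actor check with a few lines; this sketch assumes the token has already been cryptographically verified and decoded into a claims dictionary:

```python
def verify_actor(claims: dict, presenting_agent: str) -> bool:
    """Verify a delegation token is being presented by the agent it names.

    RFC 8693 places the acting party in the `act` claim. If the presenter
    doesn't match, the token was replayed or borrowed by another agent
    and must be rejected.
    """
    return (
        "sub" in claims  # a delegating user must exist for the audit trail
        and claims.get("act", {}).get("sub") == presenting_agent
    )

claims = {
    "sub": "dr.smith@hospital.org",
    "act": {"sub": "clinical-summary-agent"},
    "scope": "read:patient-encounters",
}
```

A token stolen from the clinical summary agent fails this check the moment any other agent presents it, which is exactly the rogue-agent defense described earlier.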
For healthcare organizations, delegation tokens solve the audit trail problem. When you review access logs, you can see exactly which agent accessed data, on whose behalf, and for what purpose. This granularity is essential for HIPAA compliance and for investigating potential security incidents or privacy violations.
Delegation also enables user control and transparency. Users can see which agents are authorized to act on their behalf, revoke those authorizations, and understand what actions agents have taken in their name. This is particularly important in healthcare, where clinicians are personally responsible for actions taken under their credentials, even if those actions were technically performed by an AI agent.
Token Exchange at Each Hop
As an agentic flow moves through multiple systems—from the orchestration layer to the AI model, from the model to downstream services—tokens should be exchanged at each hop rather than simply forwarded. Token exchange means trading an incoming token for a new outgoing token that's specifically scoped for the next step in the flow.
This pattern prevents several attacks. If a token is compromised at one stage of the flow, it's only valid for that specific stage, not for upstream or downstream systems. An attacker who intercepts a token between the agent and a downstream API can't use that token to access the orchestration layer or other services. The token is scoped only for the specific API interaction where it was captured.
Token exchange also enforces least privilege at each stage. When the clinical summary agent needs to access patient scheduling data, it exchanges its broad authorization token for a narrowly scoped token that only permits reading today's schedule for a specific provider. That scoped token is what gets sent to the scheduling API. If the scheduling API is compromised or if the agent's prompt is manipulated to attempt unauthorized access, the scoped token limits what can be reached.
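The narrowing step uses the RFC 8693 token-exchange grant. A sketch of the request body an agent would send to the authorization server at each hop — the token string and audience URL are placeholders:

```python
import urllib.parse

def build_token_exchange_request(subject_token: str, audience: str, scope: str) -> bytes:
    """Build an RFC 8693 token-exchange request body.

    Trades the incoming token for a new one narrowed to a single downstream
    audience and scope; the broader original token never leaves this hop.
    """
    return urllib.parse.urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": subject_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "audience": audience,  # one specific service, not "everything"
        "scope": scope,        # narrowed for this hop only
    }).encode()

body = build_token_exchange_request(
    "eyJ...incoming-token",                  # placeholder for the broader token
    "https://scheduling.hospital.example",   # hypothetical scheduling API
    "read:schedule:today",
)
```

If the narrowed token is captured between the agent and the scheduling API, it is useful only against that API, for that scope, until it expires.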
Implementing token exchange requires coordination between the orchestration layer, the AI agents, and downstream services. Each component needs to understand token exchange protocols and have access to an authorization server that can issue appropriately scoped tokens. This adds complexity compared to simply forwarding tokens through the system, but it's essential for security in multi-hop agentic flows.
For healthcare organizations, token exchange creates clear security boundaries. Each system hop is a point where authorization is re-evaluated, scopes are narrowed, and access decisions are made. This provides multiple opportunities to detect and block unauthorized access attempts rather than relying on a single authentication decision at the entry point that grants broad access throughout the entire flow.
Scope Restriction: Implementing Least Privilege for Agents
Scope restriction means limiting each token to the specific permissions needed for its intended use, nothing more. This is the technical implementation of the principle of least privilege in agentic systems. Instead of granting an agent "full access to all patient records," you grant "read access to patient encounter summaries for the current shift, limited to patients assigned to the requesting provider."
OAuth 2.0 scopes provide the mechanism for this. A scope is a string that defines a permission boundary—for example, "read:encounters," "write:clinical-notes," or "admin:user-management." When an agent requests a token, it specifies the scopes it needs. The authorization server evaluates whether the agent is allowed to request those scopes given the user's permissions and the agent's designated function. If approved, the token includes only those specific scopes.
Downstream services then enforce scope restrictions. When the agent presents its token to access patient encounters, the API checks that the token contains the "read:encounters" scope before granting access. If the agent attempts to write clinical notes using a token scoped only for reading, the API rejects the request even though the token is otherwise valid.
Healthcare organizations should define granular scopes that align with clinical workflows and regulatory requirements. Scopes might distinguish between reading encounter summaries versus reading full clinical notes, between accessing current patients versus historical records, or between viewing data for analysis versus exporting data for external use. The more granular the scopes, the more precisely you can limit agent permissions to match their specific functions.
Scope restriction also provides a natural point for business logic and policy enforcement. Before issuing a token with specific scopes, the authorization server can check organizational policies, patient consent directives, or regulatory requirements. A request for a scope that would violate a patient's consent restrictions or an organizational policy can be denied at token issuance time, before the agent ever attempts to access the data.
Secure Communication: Protecting Tokens in Transit
Even with sophisticated token delegation and scoping, tokens can be compromised if they're transmitted over insecure channels. TLS (Transport Layer Security) is the baseline requirement—all communication between agents, orchestration layers, AI models, and downstream services must occur over encrypted channels.
For particularly sensitive environments, mutual TLS (mTLS) provides additional security. In mTLS, both the client and the server present certificates and verify each other's identity before establishing the connection. This prevents man-in-the-middle attacks where an attacker interposes themselves between the agent and a downstream service, intercepting tokens as they flow through.
Healthcare organizations should treat mTLS as required rather than optional for agentic AI systems accessing protected health information. The additional complexity of certificate management and distribution is justified by the security benefit, particularly given the high value of healthcare data and the regulatory penalties for breaches.
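On the service side, requiring a client certificate is a one-line policy decision in most TLS stacks. A sketch using Python's standard `ssl` module — the certificate file paths are hypothetical, so the loading calls are shown but commented out:

```python
import ssl

def make_mtls_server_context(cert_file: str, key_file: str, ca_file: str) -> ssl.SSLContext:
    """Build a server-side TLS context that REQUIRES a client certificate.

    Connections from agents that cannot present a certificate signed by
    the organization's CA are refused during the handshake, before any
    application code runs.
    """
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED  # this setting is what makes TLS "mutual"
    # With real certificate files on disk, load them here:
    # ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    # ctx.load_verify_locations(cafile=ca_file)
    return ctx

ctx = make_mtls_server_context("server.pem", "server.key", "hospital-ca.pem")
```

Because the refusal happens at the handshake, a rogue agent without a provisioned certificate never reaches the authorization logic at all.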
Secure communication also means protecting tokens at rest. Any component that temporarily stores tokens—whether in memory, on disk, or in a cache—must encrypt those tokens. An attacker who gains filesystem access to an orchestration server or agent runtime should not be able to extract usable tokens from memory dumps or temporary files.
In practice, this means using encrypted storage for any token caching, implementing secure memory management practices in agent code, and ensuring that logging and debugging output doesn't accidentally capture tokens. Healthcare organizations should audit their agent implementations specifically for token handling—reviewing how tokens are received, stored, used, and eventually discarded, and ensuring each stage follows secure practices.
Last-Mile Security: Vault-Based Credential Management
The last mile of agentic security involves how agents access credentials for the tools and services they invoke. The tempting but insecure approach is to embed long-lived API keys, database passwords, or service account credentials directly in the agent's configuration or in MCP server code. This creates multiple problems: credentials become difficult to rotate, they're exposed to anyone who can read the configuration, and they persist longer than necessary for the agent's task.
The secure alternative is vault-based credential management. A credential vault—such as HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault—stores sensitive credentials and provides temporary access through short-lived tokens or dynamic credential generation. When an agent needs to access a downstream tool, it requests temporary credentials from the vault rather than using long-lived static credentials.
This pattern dramatically reduces exposure. If an agent or MCP server is compromised, the attacker gains access only to the temporary credentials active at that moment, not to long-lived secrets that could be used for persistent access. When the temporary credentials expire—typically within minutes or hours—the attacker's access is cut off even if the compromise persists.
Vault-based management also simplifies credential rotation. Instead of updating credentials in dozens of agent configurations and redeploying agents when keys need rotation, you update credentials in the vault and agents automatically receive new credentials on their next request. This makes security best practices like regular credential rotation actually feasible in production environments.
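The core mechanic — short-lived credentials issued on demand — can be illustrated without any specific vault product. This in-memory sketch is a toy model of the lease pattern that HashiCorp Vault and AWS Secrets Manager implement for real; the TTL value is illustrative:

```python
import secrets
import time
from dataclasses import dataclass

@dataclass
class Lease:
    credential: str
    expires_at: float

def issue_lease(ttl_seconds: int = 300) -> Lease:
    """Issue a fresh random credential that self-expires after ttl_seconds.

    Real vaults work the same way: every request yields a new short-lived
    secret instead of a static long-lived key, so a captured credential
    is worthless minutes later.
    """
    return Lease(secrets.token_urlsafe(32), time.time() + ttl_seconds)

def is_valid(lease: Lease) -> bool:
    return time.time() < lease.expires_at

lease = issue_lease(ttl_seconds=300)  # five-minute lease for one task
```

An attacker who dumps an agent's memory gets only the lease active at that instant, and the clock is already running on it.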
For healthcare organizations, vault-based credential management provides additional audit benefits. Every credential request is logged in the vault, creating a detailed audit trail of which agents accessed which credentials and when. This audit data can be correlated with downstream access logs to verify that credential usage matches expected patterns and to investigate anomalies.
Implementing vault-based credential management requires integrating agent code with vault APIs and designing workflows that request credentials dynamically rather than relying on static configuration. Healthcare organizations may need to update or wrap existing MCP servers to support vault integration if they don't already support it. The implementation effort is justified by the security improvement, especially for agents that access highly sensitive systems such as clinical applications or patient databases.
Healthcare-Specific Implementation Considerations
Healthcare environments impose additional requirements beyond general agentic security best practices. HIPAA's minimum necessary standard means access should be limited not just by role but by specific clinical need. An agent summarizing patient encounters should access only the encounters it's summarizing, not all encounters for those patients or all patients in the department.
This requires fine-grained scoping beyond what many general-purpose OAuth implementations provide out of the box. Healthcare organizations may need to extend standard OAuth scopes with custom claims that capture patient identifiers, date ranges, or specific data elements. A scope might look like "read:encounters:patient-12345:date-range-2026-04-01-to-2026-04-07" rather than simply "read:encounters."
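A service enforcing such a scope needs to decompose it into checkable parts. A sketch of a parser for the hypothetical scope format above — the format itself is illustrative, not a standard:

```python
from datetime import date

def parse_encounter_scope(scope: str) -> dict:
    """Decompose a fine-grained scope like
    'read:encounters:patient-12345:date-range-2026-04-01-to-2026-04-07'
    into parts the resource server can enforce individually."""
    action, resource, patient, window = scope.split(":", 3)
    start_s, end_s = window.removeprefix("date-range-").split("-to-")
    return {
        "action": action,
        "resource": resource,
        "patient_id": patient.removeprefix("patient-"),
        "start": date.fromisoformat(start_s),
        "end": date.fromisoformat(end_s),
    }

parsed = parse_encounter_scope(
    "read:encounters:patient-12345:date-range-2026-04-01-to-2026-04-07"
)
```

With the scope parsed, the API can refuse any query touching a different patient or a date outside the window, which is how the minimum necessary standard becomes machine-enforceable rather than policy-on-paper.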
Break-the-glass scenarios—where clinicians need emergency access outside normal authorization rules—create additional complexity for delegated access. If a clinician invokes emergency access to view a patient record, should AI agents acting on their behalf inherit that emergency access? Healthcare organizations need clear policies on how emergency access propagates to agents and how it's logged and reviewed.
Patient consent directives add another layer. If a patient has restricted access to certain portions of their record, those restrictions must be enforced when agents access data on behalf of clinicians. The authorization system needs to check not just the clinician's role and the agent's scopes but also the specific patient's consent status before issuing tokens or allowing access.
Integration with existing healthcare identity systems—Active Directory, LDAP, or specialized healthcare identity providers—requires careful planning. Agents need identities that can be provisioned, managed, and decommissioned through the same processes that govern clinician and administrative accounts. Healthcare IT teams should extend their identity lifecycle management practices to cover agent identities rather than treating agents as a separate, parallel identity system.
Monitoring and Incident Response for Agentic Systems
Even with robust authentication, delegation, and credential management, healthcare organizations need monitoring systems designed specifically for agentic behavior. Traditional security monitoring focuses on human users accessing systems during business hours from expected locations. Agents operate differently—they make many rapid requests, work 24/7, and access multiple systems in quick succession as they complete tasks.
Monitoring should track both agent behavior and delegation patterns. Anomaly detection should flag agents that suddenly access unusual data, that attempt actions outside their designated scopes, or that exhibit behavior inconsistent with their historical patterns. Delegation monitoring should alert when new delegation relationships are established, when scopes are broadened beyond typical values, or when tokens are used from unexpected network locations or for longer than typical task durations.
Healthcare organizations should establish baselines for normal agent behavior—what systems each agent typically accesses, how much data it processes, how long tasks usually take, what error rates are typical. Deviations from these baselines trigger investigation. An agent that suddenly starts accessing ten times more patient records than usual, or that begins attempting operations outside its designated scopes, may be compromised or responding to a prompt injection attack.
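Even a crude statistical baseline catches the scenario described above. A minimal sketch using a z-score over daily access counts — the numbers are hypothetical, and production systems would use richer features (scopes requested, error rates, task durations) and better models:

```python
import statistics

def is_anomalous(history: list[int], today: int, threshold: float = 3.0) -> bool:
    """Flag an agent whose record-access count deviates sharply from baseline.

    A simple z-score test: how many standard deviations is today's count
    from the historical mean?
    """
    mean = statistics.mean(history)
    stdev = statistics.stdev(history) or 1.0  # guard against zero variance
    return abs(today - mean) / stdev > threshold

baseline = [40, 38, 45, 42, 39, 41, 44]  # hypothetical daily access counts
normal = is_anomalous(baseline, 43)      # within ordinary variation
spike = is_anomalous(baseline, 430)      # 10x jump: possible compromise or injection
```

The ten-times spike trips the alarm; the day-to-day wobble does not, which keeps the alert queue useful.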
Incident response procedures need updates to account for agentic systems. When a potential security incident involves an agent, response teams need to quickly identify which users delegated authority to that agent, what data the agent accessed, what downstream systems it touched, and what temporary credentials were issued during the incident window. This requires correlation across multiple log sources—authorization server logs, agent operational logs, downstream API access logs, and vault access logs.
Healthcare organizations should also plan for agent revocation scenarios. If an agent is compromised or behaves unexpectedly, you need the ability to immediately revoke all its credentials, invalidate its outstanding tokens, and remove its delegation authorizations without affecting other agents or disrupting ongoing clinical workflows. This requires centralized agent identity management and real-time token revocation capabilities.
Common Implementation Mistakes to Avoid
Healthcare organizations deploying agentic AI systems should watch for several common security mistakes that undermine even well-designed architectures:
Treating agent credentials as static configuration: Embedding API keys or service account passwords in agent code or configuration files creates long-lived credentials that are difficult to rotate and easily exposed. Every agent credential should have a lifecycle—issued when needed, rotated regularly, revoked when no longer required.
Forwarding user tokens directly to downstream services: When an agent receives a user token, it should exchange that token for an agent-specific token rather than forwarding the user token directly. Direct forwarding means downstream services can't distinguish between the user accessing directly and the agent acting on the user's behalf, breaking audit trails and preventing per-agent policy enforcement.
Using overly broad scopes to avoid integration complexity: It's tempting to grant agents broad "read all" or "write all" scopes rather than carefully defining granular scopes for specific operations. This creates unnecessary security risk. If scope definition seems too complex, that's often a sign that the agent's responsibilities aren't clearly defined. Fix the design rather than working around it with broad permissions.
Failing to implement token expiration and rotation: Long-lived tokens are security liabilities. Every token should have an expiration time appropriate to its use case—minutes for narrowly scoped operational tokens, hours at most for broader authorization tokens. Healthcare organizations should never issue tokens that don't expire or that expire only after days or weeks.
Assuming agents are trusted once deployed: Agents can be compromised through supply chain attacks, configuration errors, or runtime exploitation. Every agent interaction should be authenticated and authorized as if the agent might be compromised. Trust but verify doesn't apply—verify every time.
Neglecting audit logging for agent actions: Healthcare regulations require detailed audit trails for access to protected health information. Agent actions must be logged with sufficient detail to reconstruct who accessed what data, on whose behalf, for what purpose, and with what outcome. Logging the agent's actions without logging the user delegation context creates incomplete audit trails that may not satisfy regulatory requirements.
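Several of these mistakes can be caught by a single validation routine run at every service boundary. A sketch combining the expiration, actor, scope, and audit-context checks, again assuming the token has already been cryptographically verified and decoded:

```python
import time

def validate_token(claims: dict, presenting_agent: str, needed_scope: str,
                   now=None) -> list:
    """Return the reasons a token must be rejected (empty list = accept).

    Runs on EVERY request — per zero trust, a successful request thirty
    seconds ago earns no credit for this one.
    """
    now = time.time() if now is None else now
    problems = []
    if claims.get("exp", 0) <= now:
        problems.append("token expired")
    if claims.get("act", {}).get("sub") != presenting_agent:
        problems.append("actor mismatch")
    if needed_scope not in claims.get("scope", "").split():
        problems.append("missing scope")
    if "sub" not in claims:
        problems.append("no delegating user for audit trail")
    return problems

claims = {
    "sub": "dr.smith@hospital.org",
    "act": {"sub": "clinical-summary-agent"},
    "scope": "read:encounters",
    "exp": time.time() + 600,  # ten-minute lifetime, not days
}
ok = validate_token(claims, "clinical-summary-agent", "read:encounters")
```

Returning the full list of failures, rather than the first one, also gives incident responders more signal when a rejected request turns out to be an attack.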
Building Toward Zero-Trust Agent Architectures
The security patterns IBM describes—token delegation, actor-plus-subject credentials, token exchange, scope restriction, and vault-based secrets—are components of a broader zero-trust architecture adapted for agentic systems. Zero trust means never assuming that being inside the network perimeter or having authenticated once grants ongoing trust. Every request is authenticated, every access decision is made based on current context, and trust is continually re-evaluated.
For agentic systems in healthcare, zero-trust principles mean that even if an agent successfully completed a task thirty seconds ago, its next request is treated with the same scrutiny. The agent must re-authenticate, provide current delegation proof, and demonstrate that its requested access aligns with its designated function and the user's current authorization. Previous successful access doesn't create implicit trust for future requests.
Implementing zero-trust for agents requires moving beyond perimeter security—firewalls, VPNs, network segmentation—toward continuous verification at every system interaction. This is particularly important for healthcare organizations that may run agents in cloud environments, hybrid architectures, or as part of vendor-provided services where traditional network perimeter controls don't apply.
Zero-trust agent architectures also need microsegmentation—isolating agents from each other and from other system components so that a compromise in one agent doesn't automatically grant access to others. In practice, this means running agents in separate containers or VMs with minimal network connectivity, implementing strict egress filtering so agents can only reach explicitly authorized services, and using service mesh architectures that enforce authentication and authorization at the network layer independent of application-level controls.
The Path Forward: Making Secure Agent Architecture the Default
The security patterns IBM describes should become the default architecture for any agentic AI system deployed in healthcare, not optional enhancements or aspirational best practices. Healthcare organizations have both regulatory obligations and ethical responsibilities to protect patient data and ensure that AI agents operating on behalf of clinicians maintain the same security and accountability standards as direct human access.
This requires investment in infrastructure—authorization servers, credential vaults, token exchange services, monitoring and logging systems designed for agentic behavior. It also requires updates to security policies, incident response procedures, and compliance frameworks to account for delegation relationships and agent identity.
Healthcare IT teams should start by auditing existing AI agent deployments against these security principles. How are agents currently authenticating? Do they have their own identities or are they borrowing user credentials? Are delegation relationships explicit and verifiable, or are agents simply trusted to act appropriately? Are tokens scoped to specific operations or do they grant broad access? Are credentials managed securely or embedded in configuration?
For most healthcare organizations, this audit will reveal gaps. The work then becomes systematically closing those gaps—implementing proper agent identity management, adding delegation mechanisms, refining scope definitions, integrating with credential vaults, and updating monitoring to track agent behavior patterns. This isn't quick work, but it's necessary work if agentic AI systems are going to meet healthcare security and compliance requirements.
The alternative—deploying agents without these security controls—creates unacceptable risk. A compromised agent with broad access to clinical systems could extract massive amounts of protected health information, modify clinical records, or disrupt patient care. The regulatory penalties for such breaches are severe, but the patient harm and loss of trust are potentially worse. Healthcare organizations have an obligation to implement security architecture that prevents these scenarios rather than hoping they won't occur.
IBM's guidance provides a roadmap. The patterns they describe aren't theoretical—they're based on proven security protocols like OAuth 2.0, established practices like credential vaulting, and architectural principles like least privilege and zero trust that have demonstrated effectiveness in other high-security contexts. Adapting these patterns to agentic AI systems in healthcare is feasible. It requires thoughtful implementation and organizational commitment, but the security outcomes justify the effort.
Healthcare organizations deploying AI agents should treat these security patterns as requirements, not recommendations. Agent authentication, delegation tokens, scope restriction, token exchange, secure communication, and vault-based credential management aren't optional security layers you add if you have extra budget or time. They're the minimum architecture needed to responsibly deploy autonomous systems that access protected health information and make decisions affecting patient care.
This is entry #33 in the AI Security series. For related coverage, see Context Engineering for Agentic AI and OWASP Top 10 for AI Agents.