State-backed hackers are already using AI to accelerate their attack chains. Google's Threat Intelligence Group reported this week that North Korean group UNC2970 used Gemini to profile defense industry targets, while Iranian APT42 crafted tailored social engineering personas.
The most concerning finding? A malware family called HONESTCUE that calls the Gemini API to generate its own attack code at runtime—fileless, polymorphic, and invisible to traditional detection.
These aren’t theoretical threats. They’re production attacks happening right now. And they expose a fundamental gap in how we think about AI agent security: we’re verifying who agents are, but not what they’re trying to do.
The Identity Problem in Agentic AI
Traditional identity management was built for humans. A user authenticates, receives permissions, and operates within those boundaries. The system trusts that the authenticated user will act according to their intentions. This model breaks down with AI agents for a simple reason: agents don’t just execute instructions—they interpret them.

When a clinician asks an AI assistant to “summarize this patient’s recent lab results,” the agent must decide which labs to include, how far back to look, what format to use, and what level of detail is appropriate. Each decision is an interpretation. And each interpretation creates an opportunity for the agent’s actions to drift from the user’s actual intent.
This interpretive gap is exactly what prompt injection exploits. An attacker doesn’t need to steal credentials or compromise the agent’s identity. They just need to influence the agent’s interpretation of what it should do. The agent remains “authenticated” while pursuing goals the user never intended.
Introducing Multi-Layered AI Identity
Securing AI agents requires identity verification at multiple layers, not just the agent itself. I’ve been developing a framework called Multi-Layered AI Identity that addresses this challenge through four distinct verification layers (a minimal code sketch follows the list):
- Agent Identity: Verify the agent instance itself—is this an authorized agent with valid, non-expired credentials from the identity provider?
- Tool Identity: Verify the tools being invoked—is this API or service in our approved registry, and does the agent have permission to use it for this purpose?
- Data Identity: Verify data provenance and classification—where did this data come from, what’s its sensitivity level, and does the agent have access rights?
- Intent Identity: Verify that the agent’s actions align with the user’s original request—is what the agent is doing consistent with what it was asked to do?
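To make the layers concrete, here is a minimal sketch of what a single verification pass might look like. Everything here (AgentContext, the approved-tool registry, the keyword-overlap stand-in for intent checking) is an illustrative assumption, not a reference implementation:

```python
import time
from dataclasses import dataclass

# Hypothetical names throughout; a sketch of the four checks, not a standard.

@dataclass
class AgentContext:
    agent_id: str
    credential_expiry: float      # epoch seconds, issued by the identity provider
    requested_tool: str
    data_classification: str      # e.g. "public", "deidentified", "phi"
    user_request: str             # the original request (intent baseline)
    planned_action: str           # what the agent intends to do next

APPROVED_TOOLS = {"ehr_query", "lab_summary"}            # Tool Identity registry
ALLOWED_CLASSIFICATIONS = {"public", "deidentified"}     # Data Identity policy

def verify_layers(ctx: AgentContext) -> dict[str, bool]:
    """Run all four identity checks and report which ones pass."""
    return {
        # Layer 1: Agent Identity - credentials are valid and unexpired
        "agent": ctx.credential_expiry > time.time(),
        # Layer 2: Tool Identity - the invoked tool is in the approved registry
        "tool": ctx.requested_tool in APPROVED_TOOLS,
        # Layer 3: Data Identity - the data's classification is permitted
        "data": ctx.data_classification in ALLOWED_CLASSIFICATIONS,
        # Layer 4: Intent Identity - crude stand-in: the planned action must
        # share vocabulary with the original request (real systems need
        # semantic comparison, not keyword overlap)
        "intent": bool(set(ctx.planned_action.lower().split())
                       & set(ctx.user_request.lower().split())),
    }

ctx = AgentContext(
    agent_id="agent-01",
    credential_expiry=time.time() + 3600,
    requested_tool="lab_summary",
    data_classification="deidentified",
    user_request="summarize this patient's recent lab results",
    planned_action="summarize lab results from the last 90 days",
)
print(verify_layers(ctx))   # all four checks pass for this context
```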
The first three layers are familiar territory. We know how to issue credentials, maintain registries, and classify data. But Intent Identity is different. It’s the layer the industry hasn’t fully addressed yet—and it’s the one that matters most for detecting prompt injection, goal drift, and behavioral manipulation.
Why Intent Identity Is Different
Here’s the key insight: the first three identity layers are checkpoint-based. You verify Agent Identity when the agent instantiates. You verify Tool Identity before each tool invocation. You verify Data Identity before accessing protected resources. These are discrete verification events.

Intent Identity, by contrast, must be continuous. It runs throughout the agent’s entire execution loop—planning, execution, reflection, decision—validating at every iteration that the agent’s actions still align with the original user request.
This distinction matters because prompt injection attacks don’t compromise credentials or tools. They redirect the agent’s goals mid-execution. An agent that passed all checkpoint verifications at instantiation can still be manipulated by malicious content it encounters while processing a task. By the time it takes an unauthorized action, it’s already authenticated, its tools are approved, and its data access is legitimate. The only thing that changed was its intent.
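A minimal sketch of that continuous check, assuming a hypothetical agent interface (plan_next_action / execute / reflect / halt) and a deliberately crude keyword-overlap alignment test:

```python
# Sketch: continuous intent verification inside the agent execution loop.
# The agent interface and the alignment test are illustrative assumptions.

def capture_intent(user_request: str) -> set[str]:
    # Baseline captured once, at instantiation: here just the request's vocabulary
    return set(user_request.lower().split())

def intent_aligned(action: str, baseline: set[str]) -> bool:
    # Stand-in for semantic comparison: the planned action must overlap the baseline
    return bool(set(action.lower().split()) & baseline)

def run_with_intent_checks(agent, user_request: str, max_steps: int = 20) -> None:
    baseline = capture_intent(user_request)
    for step in range(max_steps):
        action = agent.plan_next_action()        # planning phase
        if action is None:                       # task complete
            return
        # Continuous check: unlike Agent/Tool/Data Identity, this runs on
        # every iteration, because intent can be hijacked mid-execution.
        if not intent_aligned(action, baseline):
            agent.halt(reason=f"intent drift at step {step}: {action!r}")
            raise RuntimeError("agent halted: planned action deviates from request")
        agent.execute(action)                    # execution phase
        agent.reflect()                          # reflection phase
```

The point is the placement of the check, inside the loop rather than at instantiation; a production system would replace the keyword test with a semantic comparison.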
What the GTIG Findings Tell Us
Google’s threat intelligence report provides concrete examples of why Intent Identity matters:

HONESTCUE Malware: This malware calls the Gemini API with hardcoded prompts to generate C# code for its second stage, then compiles and executes it in memory. The prompt itself isn’t malicious—“Write a complete, self-contained C# program...”—but the intent is. Traditional content filtering won’t catch this. Intent verification—understanding what the code will actually do in context—might.
APT31’s Persona Fabrication: The Chinese threat actor prompted Gemini by claiming to be “a security researcher who is trialling out the hexstrike MCP tooling.” The stated intent (security research) masked the actual intent (reconnaissance for targeted attacks). Checkpoint-based identity verification sees a valid user with valid access. Intent verification might detect the mismatch between claimed purpose and query patterns.
Reconnaissance to Targeting Pipeline: GTIG observed threat actors using Gemini for OSINT synthesis and target profiling, then immediately using that research in phishing campaigns. The “routine professional research” and “malicious reconnaissance” are functionally identical queries—the difference is intent, which manifests in downstream actions.
Implementing Intent Identity
The industry is beginning to recognize this problem, though the solutions are still emerging. Several approaches are gaining traction:

Intent Baseline Capture: At agent instantiation, capture a structured representation of the user’s original request—not just the text, but the parsed intent, expected actions, and acceptable scope. This becomes the baseline against which all subsequent actions are validated.
Execution Loop Monitoring: Observe the agent’s planning, execution, and reflection phases. Compare each planned action to the intent baseline. Flag deviations before they execute, not after.
Behavioral Drift Detection: Track patterns across the execution loop. Sudden changes in tool selection, data access patterns, or output format may indicate goal hijacking, even if each individual action passes checkpoint verification.
Intent Provenance Chain: Maintain an immutable link from every agent action back to the originating user request. This supports both real-time verification and post-incident forensics.
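One way these pieces could fit together is sketched below: a structured intent baseline plus a hash-linked provenance chain that ties every action back to it. Field names and the hashing scheme are assumptions for illustration, not a standard:

```python
import hashlib
import json
import time
from dataclasses import dataclass, field

# Sketch: intent baseline + append-only provenance chain (illustrative only).

@dataclass
class IntentBaseline:
    request_text: str                 # the user's original words
    expected_actions: list[str]       # parsed scope, e.g. ["query_labs", "summarize"]
    acceptable_tools: set[str]

@dataclass
class ProvenanceChain:
    baseline: IntentBaseline
    entries: list[dict] = field(default_factory=list)

    def record(self, action: str, tool: str) -> dict:
        """Append an action, hash-linked to the previous entry and the baseline."""
        prev_hash = self.entries[-1]["hash"] if self.entries else "genesis"
        body = {
            "ts": time.time(),
            "action": action,
            "tool": tool,
            "request": self.baseline.request_text,   # every action traces to the request
            "prev": prev_hash,
        }
        body["hash"] = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        self.entries.append(body)
        return body

    def in_scope(self, action: str, tool: str) -> bool:
        """Real-time check: does this action fall within the captured baseline?"""
        return (action in self.baseline.expected_actions
                and tool in self.baseline.acceptable_tools)
```

The in_scope check supports real-time enforcement, while the hash chain preserves the immutable link needed for post-incident forensics.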
The common thread across these approaches is AI Observability—the ability to trace what an agent is doing, why it’s doing it, and whether that aligns with what it was asked to do. Without comprehensive observability, Intent Identity is impossible.
Practitioner Notes
Why Intent Identity Matters for Healthcare
In healthcare, the stakes for Intent Identity failures are exceptionally high. Consider an AI agent with access to EHR data:
- A clinician asks for “patients who might benefit from the new cardiac screening program”
- The agent interprets this as permission to query the entire patient database
- A prompt injection in a patient note redirects the query to export data externally
- The agent has valid credentials, authorized tools, and appropriate data access—but its intent has been hijacked
Intent Identity verification would catch this by comparing the export action against the original screening request and flagging the deviation.
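A minimal illustration of that check, with the baseline and action names invented for this scenario:

```python
# Sketch of the EHR scenario above: the baseline permits screening-related
# actions; an injected export action fails the intent check. Hypothetical names.

SCREENING_BASELINE = {
    "request": "patients who might benefit from the new cardiac screening program",
    "allowed_actions": {"query_patients", "filter_by_criteria", "summarize_cohort"},
    "allowed_destinations": {"internal_report"},
}

def check_action(action: str, destination: str) -> bool:
    ok = (action in SCREENING_BASELINE["allowed_actions"]
          and destination in SCREENING_BASELINE["allowed_destinations"])
    if not ok:
        print(f"FLAGGED: '{action}' -> '{destination}' deviates from the screening request")
    return ok

check_action("query_patients", "internal_report")      # passes
check_action("export_records", "external_endpoint")    # flagged: intent hijacked
```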
HIPAA Implications
The HIPAA Security Rule’s audit control requirement (45 CFR § 164.312(b)) already mandates logging who accessed what. For AI agents, this must extend to why they accessed it. Intent provenance—linking every data access to an originating user request—isn’t just good security; it’s foundational for demonstrating minimum necessary compliance when agents have broad access.
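As a sketch of what extending “who accessed what” to “why” could look like in an audit record (field names are illustrative, not a HIPAA-mandated schema):

```python
import json
import time

# Illustrative audit record: standard who/what fields plus the intent fields
# (originating request, purpose) that make minimum-necessary review possible.
audit_entry = {
    "timestamp": time.time(),
    "actor": "agent:clinical-summary-assistant",     # who
    "resource": "ehr:patient/12345/labs",            # what
    "action": "read",
    "originating_user": "dr.smith",                  # why: who asked
    "originating_request": "summarize this patient's recent lab results",
    "request_id": "req-8f3a",                        # links back to the intent baseline
}
print(json.dumps(audit_entry, indent=2))
```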
Vendor Evaluation Questions

When evaluating AI-enabled healthcare tools, ask about Intent Identity specifically:
- How do you capture and preserve the original user intent?
- Can you demonstrate traceability from any agent action back to the user request that authorized it?
- What happens when agent behavior deviates from expected patterns?
- How quickly can you detect and halt a goal-hijacked agent?
- What observability data is available for incident investigation?
The Architecture Connection
Intent Identity doesn’t exist in isolation. It connects to the broader AI agent security architecture through several key components:

AI Gateway: The gateway inspects inputs for prompt injection attempts, but it should also maintain the intent baseline and validate that agent responses align with it. This is where Intent Identity enforcement happens for external-facing agents.
Identity Provider + Secrets Vault: While these handle Agent Identity (credentials, authentication), they should also store the intent context associated with each agent session. When an agent requests elevated access, the IdP can verify whether that access aligns with the original intent.
Observability Layer: AI Observability is the infrastructure that makes Intent Identity possible. Without comprehensive traces of the execution loop—what the agent planned, what it executed, how it reflected—you can’t verify intent alignment. The observability layer is the Intent Identity layer.
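To picture what that trace might contain, here is a sketch in which each phase of the execution loop emits a structured event tagged with its intent-alignment result. The event fields and in-memory trace are assumptions; a real deployment would ship these events to a tracing backend:

```python
import json
import time
from typing import Optional

# Sketch: execution-loop events tagged with intent-alignment results.
# All field names are illustrative assumptions.

TRACE: list[dict] = []

def emit(phase: str, detail: str, aligned: Optional[bool] = None) -> None:
    TRACE.append({
        "ts": time.time(),
        "phase": phase,              # "plan" | "execute" | "reflect"
        "detail": detail,
        "intent_aligned": aligned,   # filled in for planned/executed actions
    })

emit("plan", "query recent lab results", aligned=True)
emit("execute", "ehr_query(patient=12345, window=90d)", aligned=True)
emit("reflect", "results summarized; no follow-up actions needed")

print(json.dumps(TRACE, indent=2))
```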
What Comes Next
The Multi-Layered AI Identity framework, with Intent Identity as its continuous verification layer, provides a conceptual model for securing AI agents beyond traditional checkpoint-based approaches. But the implementation details matter enormously.

In upcoming posts, I’ll explore the architecture in more depth: how the identity layers interact, where the control points sit, and what the agent lifecycle looks like from instantiation to decommissioning. I’ll also be releasing architectural diagrams that map these concepts to deployable patterns.
The threat landscape is evolving faster than our security models. State-backed actors are already using AI to accelerate attacks. The question isn’t whether we need Intent Identity verification—it’s how quickly we can implement it.
Want to Learn More?
Primary Sources
- GTIG AI Threat Tracker: Distillation, Experimentation, and (Continued) Integration of AI for Adversarial Use (February 2026)
- The Hacker News: Google Reports State-Backed Hackers Using Gemini AI for Recon and Attack Support
Industry Frameworks & Research
- OWASP Agentic Security Initiative (ASI) Threats Taxonomy
- Lakera AI: Agentic AI Threats - Memory Poisoning & Long-Horizon Goal Hijacks
- Partnership on AI: Prioritizing Real-Time Failure Detection in AI Agents (PDF)
- AWS Security Blog: The Agentic AI Security Scoping Matrix
- Palo Alto Networks: What Is Agentic AI Security?
- Microsoft Security Blog: From Runtime Risk to Real-Time Defense - Securing AI Agents
Intent Verification & Agent Identity
- Acuvity: The Agent Integrity Framework - The New Standard for Securing Autonomous AI
- Zenity: AI Detection and Response (AIDR) - Intent-Aware Detection
- DRIFT: Dynamic Rule-based Isolation Framework for Trustworthy Agentic Systems
- Dock.io: AI Agent Digital Identity Verification
- Okta: AI Agent Identity for Enterprise Security at Scale
Drift Detection & Continuous Monitoring
- Adopt AI: Agent Drift Detection
- Noma Security: AI Agent Risk Management - A Practical Guide
- Reco AI: The Rise of Agentic AI Security - Protecting Workflows, Not Just Apps
- Obsidian Security: Prompt Injection Attacks - The Most Common AI Exploit in 2025