Comment-and-Control: GitHub-Integrated AI Agents Vulnerable to Credential Theft

A new class of prompt injection attack targeting AI agents integrated with GitHub Actions can steal API keys and access tokens without requiring external command-and-control infrastructure. Security researchers from Johns Hopkins University successfully hijacked Anthropic's Claude Code Security Review, Google's Gemini CLI Action, and Microsoft's GitHub Copilot Agent—receiving bug bounties from all three vendors—but none issued public advisories or assigned CVEs, leaving users pinned to vulnerable versions potentially unaware of the risk.

The Attack Pattern

Researcher Aonan Guan discovered that AI agents integrated with GitHub Actions follow a predictable flow: they read GitHub data (pull request titles, issue bodies, and comments), process it as part of task context, and then take actions based on those inputs. By injecting malicious instructions into this data stream, attackers can hijack agent behavior and exfiltrate credentials through GitHub itself.

The attack operates entirely within GitHub's infrastructure. An attacker submits a pull request with malicious instructions embedded in the PR title, such as directing Claude Code Security Review to execute a whoami command using its bash tool and return the results as a "security finding." When the GitHub Action triggers automatically, the AI agent processes the malicious instruction as legitimate context and executes it, posting the output in a PR comment. The attacker reads the comment containing leaked credentials, edits the PR title back to something benign like "fix typo," closes the PR, and deletes the bot's message—erasing evidence of the attack.

Vendor Responses and Disclosure Gaps

Anthropic paid Guan a $100 bounty in November 2025 after he demonstrated credential theft via Claude Code Security Review (CVSS score upgraded from 9.3 to 9.4). The company updated its documentation with a warning that the action "is not hardened against prompt injection attacks and should only be used to review trusted PRs," recommending users enable GitHub's "Require approval for all external contributors" setting.

Google paid a $1,337 bounty for a similar vulnerability in Gemini CLI Action, which the researchers exploited by injecting a fake "Trusted Content Section" that overrode Gemini's safety instructions, causing it to post its API key as an issue comment. The team received credits including Guan, Neil Fendley, Zhengyu Liu, Senapati Diwangkara, and Yinzhi Cao.

Microsoft's GitHub Copilot Agent proved the most challenging target, requiring an HTML comment (invisible in rendered Markdown) containing malicious instructions. After initially dismissing it as a "known issue" they "were unable to reproduce," GitHub paid a $500 bounty in March 2026. The attack bypassed three defense layers: environment filtering, secret scanning, and network firewalls.

The critical issue: none of the vendors published public security advisories or assigned CVE identifiers. "If they don't publish an advisory, those users may never know they are vulnerable—or under attack," Guan told The Register in an exclusive interview. Users pinned to specific versions of these GitHub Actions have no mechanism to learn of the vulnerability unless they happen to review updated documentation.

Technical Mechanics

The researchers dubbed this attack pattern "comment-and-control"—a play on "command-and-control" that reflects how the entire operation runs inside GitHub without requiring external C2 infrastructure. The attack differs critically from classic indirect prompt injection, which is reactive (attacker plants payload and waits for victim to process it). Comment-and-control is proactive: GitHub Actions workflows fire automatically on PR or issue events, meaning simply opening a PR or filing an issue can trigger the AI agent without any victim interaction.

The Copilot attack represents a partial exception: a victim must assign the issue to Copilot, but because the malicious instructions hide inside an HTML comment, the assignment happens without the victim ever seeing the payload. This demonstrates that even agents with prompt-injection prevention built into the model can be compromised through creative attack vectors.

Credentials exposed through these attacks include Anthropic and Gemini API keys, GitHub access tokens, and any repository or organization secrets accessible to the GitHub Actions runner environment. In healthcare development contexts, this could include API keys for FHIR servers, database credentials with PHI access, cloud infrastructure credentials (AWS, Azure, GCP), or third-party vendor API tokens for EHR integration platforms.

Healthcare Development Implications

Healthcare organizations increasingly use AI-assisted development tools for building EHR integrations, FHIR API implementations, HL7 interface testing, and medical device firmware. These workflows often involve GitHub Actions with access to sensitive credentials that could facilitate HIPAA breaches if stolen.

Consider a healthcare software team using Claude Code Security Review to automatically analyze pull requests for a patient portal codebase. An external contributor submits a PR with malicious instructions injected into the title. The GitHub Action fires automatically, Claude processes the instruction, and API keys for the production FHIR server get posted in a PR comment. The attacker harvests the credentials, edits the PR title, closes the PR, and deletes the comment—all before the development team's morning standup. The stolen FHIR server credentials now provide unauthorized access to patient health records.

The supply chain risk extends beyond direct organizational repositories. Open-source healthcare libraries commonly used across the industry (FHIR parsing libraries, HL7 interface code, DICOM processing tools) accept external contributions. A compromised dependency used by dozens of healthcare organizations creates a multiplier effect for credential theft and potential PHI exposure.

Regulatory compliance adds another layer of concern. HIPAA breach notification requirements trigger if credentials with PHI access are stolen, even if no evidence exists of actual PHI exfiltration. Organizations must conduct forensic analysis to determine the scope of potential access, document the timeline, and potentially notify affected individuals and HHS. Audit trail requirements mean development teams must maintain detailed logs of who accessed what systems and when—but GitHub Actions logs may not capture credential theft via AI agent prompt injection in sufficient detail to satisfy compliance requirements.

Defensive Measures

The researchers recommend treating prompt injection as "phishing for machines" and applying need-to-know principles to AI agents. If a code review agent doesn't require bash execution, don't provide that tool. Use allowlists to restrict agent access to only required resources. If the agent's job is summarizing issues, it doesn't need credentials for GitHub write access.

For GitHub Actions specifically, enable "Require approval for all external contributors" on all repositories to prevent automatic workflow execution on untrusted PRs. Review which secrets and tokens each workflow can access, implementing credential separation by sensitivity level (development vs. production, read-only vs. write access). Limit tool access in the GitHub Actions environment to the minimum required for each agent's specific function.
Healthcare organizations should conduct an immediate audit of which GitHub Actions use AI agents, catalog which credentials those workflows can access, and implement approval workflows for AI agent code reviews. Separate credentials by function and sensitivity, ensuring production FHIR server credentials aren't accessible to development environment workflows. Implement detective controls for credential exfiltration via GitHub, such as monitoring for unusual API access patterns or unexpected data retrieval from patient databases.

Longer-term measures include establishing organizational policy on AI agent usage in healthcare development, incorporating AI agent security into vendor risk assessments, training development teams on prompt injection attack patterns specific to healthcare contexts, and building incident response playbooks for credential theft scenarios involving AI agents.

Broader Attack Surface

Guan indicated the vulnerability pattern likely affects other agents integrated with GitHub beyond the three demonstrated exploits. Potential targets include Slack bots with GitHub integration that post PR summaries to channels, Jira agents that automatically create tickets from GitHub issues, email automation agents that send deployment notifications, and deployment automation agents with access to production infrastructure credentials.

The fundamental issue is architectural: AI agents process user-controlled data (PR titles, issue bodies, comments) as trusted context without clear separation between system instructions and untrusted input. Traditional input validation and sanitization approaches designed for SQL injection or XSS don't translate cleanly to natural language processing contexts where the distinction between "data" and "instruction" is fuzzy.

The absence of public advisories and CVE assignments means organizations relying on vulnerability scanning tools and security bulletins for patch management will miss these issues entirely. Development teams must proactively review GitHub Actions configurations and AI agent integrations rather than waiting for security advisories that may never arrive.