The AI Gateway Everyone Uses Just Got Backdoored: LiteLLM and the Healthcare Supply Chain Risk

On March 24, 2026, LiteLLM — the Python library with 95 million monthly downloads that powers nearly every major AI agent framework — was compromised in a sophisticated supply chain attack. For roughly five hours, two backdoored versions sat on PyPI ready to steal credentials, spread through Kubernetes clusters, and install persistent backdoors on any system that ran pip install. If your healthcare organization uses AI development tools, LLM orchestration frameworks, or AI agents, there's a significant chance LiteLLM was somewhere in your dependency tree, even if you never installed it directly.

The attack wasn't random. It was the calculated next step in a month-long campaign by the TeamPCP threat actor group, who systematically compromised Trivy vulnerability scanner, hijacked Checkmarx KICS, and used stolen credentials from each breach to unlock the next target. The healthcare implications are severe: LiteLLM routes API calls to Claude, GPT-4, Gemini, and other LLMs, which means the malware had access to the exact credentials healthcare organizations use to process patient data through AI systems.

What LiteLLM Is and Why It Matters

LiteLLM is an open-source gateway library that provides a unified API for routing requests across multiple LLM providers. Instead of writing separate integration code for OpenAI, Anthropic, Google, and Azure, developers use LiteLLM as a single interface that handles authentication, rate limiting, fallback logic, and cost tracking across all providers.

The library has become infrastructure for the AI development ecosystem. Major frameworks that depend on LiteLLM include CrewAI (multi-agent orchestration), Browser-Use (browser automation agents), Opik (LLM evaluation), DSPy (prompt optimization), Mem0 (agent memory systems), Instructor (structured output), Guardrails (LLM safety), and Camel-AI (autonomous agents). If your developers are building AI-powered tools for healthcare, they're almost certainly using one of these frameworks, which means LiteLLM is a transitive dependency even if it never appears in their requirements.txt file.

For healthcare organizations, this positioning makes LiteLLM a particularly valuable target. The library handles API keys for services that process PHI, manages authentication tokens for cloud resources, and operates in development environments where access to production systems is often just a credential rotation away. A backdoor in LiteLLM doesn't just compromise development machines — it provides attackers with the exact credentials needed to access patient data through AI systems.

The Attack Timeline: From Trivy to LiteLLM

Understanding the LiteLLM compromise requires following the full attack chain that TeamPCP executed over nearly a month.

February 28: An autonomous bot exploited a workflow vulnerability in Aqua Security's Trivy vulnerability scanner, stealing a Personal Access Token (PAT). Aqua remediated the surface-level damage but left residual access.

March 19: Attackers used the stolen Trivy credentials to rewrite Git tags in the trivy-action GitHub repository, pointing existing version tags to malicious releases containing credential harvesters.

March 23: The same infrastructure compromised Checkmarx KICS, registering the checkmarx.zone domain to impersonate the security company and exfiltrate more credentials.

March 24, 10:39 UTC: LiteLLM's CI/CD pipeline ran the compromised trivy-action as part of its build process. The malicious action exfiltrated the PYPI_PUBLISH token from the GitHub Actions environment.

March 24, 10:39-10:52 UTC: Using the stolen PyPI credentials, attackers published two backdoored versions (1.82.7 and 1.82.8) directly to PyPI, bypassing the normal GitHub release workflow. Neither version has a corresponding Git tag in the official repository.

March 24, ~14:00 UTC: Community members discovered the compromise when litellm was pulled in as a transitive dependency by an MCP plugin running in Cursor, causing RAM exhaustion. The malware's .pth injection mechanism created an accidental fork bomb that crashed developer machines.

March 24, ~16:00 UTC: PyPI administrators quarantined the entire litellm package, blocking all downloads and removing the malicious versions. The compromised versions were available for approximately five hours.

The attack demonstrates a critical failure mode in supply chain security: each compromised system yielded credentials that unlocked the next target. The initial Trivy breach occurred on February 28, but incomplete incident response left residual access that enabled the entire month-long campaign.

The Three-Stage Attack: What the Malware Did

The backdoored LiteLLM versions deployed a sophisticated multi-stage payload that operated in three phases.

Stage 1: Credential Harvesting

The malware systematically collected sensitive data from infected systems, targeting:

- SSH private keys and configs
- AWS, GCP, and Azure credentials
- Kubernetes service account tokens and kubeconfig files
- Database passwords and connection strings
- .env files containing API keys
- Git credentials and commit history
- Cryptocurrency wallet files
- Shell history
- Cloud metadata service credentials

The harvester also executed commands to dump environment variables and query cloud instance metadata endpoints (IMDS on AWS, container credential services on GCP/Azure). For healthcare organizations running AI workloads in cloud environments, this meant the malware had access to credentials that could reach EHR integration APIs, FHIR servers, patient data storage, and any other cloud resources accessible from the development environment.

Stage 2: Kubernetes Lateral Movement

If a Kubernetes service account token was present, the malware attempted aggressive lateral movement: reading all cluster secrets across all namespaces, deploying privileged alpine:latest pods into the kube-system namespace on every node, and mounting the host filesystem into each pod to install persistent backdoors.

The Kubernetes propagation was entirely dependent on finding automatically mounted service account tokens, which Kubernetes injects into every pod by default unless explicitly disabled with automountServiceAccountToken: false. Most workloads don't need these tokens, but their presence in the environment provides attackers with cluster-wide access.

For healthcare organizations running AI services in Kubernetes, this lateral movement could provide access to PHI processing pipelines, machine learning training data, model serving infrastructure, and monitoring systems containing patient information. The privileged pod deployment to every node represents a cluster-wide compromise from a single infected developer machine.

Stage 3: Persistence

The malware installed a systemd user service named "System Telemetry Service" at ~/.config/systemd/user/sysmon.service with a Python backdoor at ~/.config/sysmon/sysmon.py. This backdoor periodically polls a command-and-control server to fetch and execute additional payloads, maintaining long-term access even after the initial infection is cleaned.

Version 1.82.8 goes further: a .pth file in site-packages re-launches the malware on every Python interpreter startup. This makes it significantly more dangerous than version 1.82.7 — any Python script, test runner, or tool invoked in an environment where litellm is installed silently triggers the credential harvester in the background.
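The .pth trick is worth understanding because it abuses a legitimate CPython feature rather than an exploit: site.py executes any line in a site-directory .pth file that begins with import. A minimal, harmless demonstration of the same hook (the file name and flag variable are ours):

```python
import site
import sys
import tempfile
from pathlib import Path

# site.py executes .pth lines that begin with "import" when a site
# directory is processed -- the same hook the 1.82.8 payload abused.
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "demo.pth").write_text(
        "import sys; sys.demo_pth_ran = True\n"
    )
    site.addsitedir(d)  # processes demo.pth; the import line executes here

print(getattr(sys, "demo_pth_ran", False))  # → True
```

In a real infection the directory is site-packages itself, so the hook fires on every interpreter start without any call to addsitedir.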

The Exfiltration Infrastructure

All collected data was encrypted using AES-256-CBC with a random session key, which was then encrypted with a hardcoded 4096-bit RSA public key. The encrypted archive was exfiltrated via HTTPS POST to models.litellm.cloud — a domain registered just hours before the attack and not part of legitimate LiteLLM infrastructure.

Only the attacker holds the RSA private key, which means only they can decrypt the stolen data. The encryption means we cannot know definitively what data was exfiltrated from compromised systems, but sources told BleepingComputer that roughly 500,000 exfiltration events occurred, many of them duplicates from the same systems.

The exfiltration domain choice is particularly insidious. models.litellm.cloud follows the naming convention of legitimate LiteLLM infrastructure (litellm.ai), making it difficult to distinguish in network traffic logs without close inspection. Healthcare security teams performing incident response should specifically search logs for POST requests to this domain.
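As a starting point for that log review, a small scanner along these lines can flag matching lines (the log format and paths are placeholders — adapt it to your proxy or egress logs):

```python
import re
from pathlib import Path

# Exfiltration domain from the incident; matched case-insensitively so
# mixed-case log formats are still caught.
PATTERN = re.compile(r"models\.litellm\.cloud", re.IGNORECASE)

def scan_log(path: Path) -> list[str]:
    """Return every log line that references the exfiltration domain."""
    return [
        line.rstrip()
        for line in path.read_text(errors="replace").splitlines()
        if PATTERN.search(line)
    ]
```

Run it across all proxy and egress logs covering March 24; note that a hit on the legitimate litellm.ai domain should not match, which is why the pattern anchors the full models.litellm.cloud hostname.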

What This Means for Healthcare

The LiteLLM compromise exposes multiple critical vulnerabilities in how healthcare organizations approach AI development security.

The Transitive Dependency Problem

Most affected organizations never explicitly installed LiteLLM. The package was pulled in automatically as a transitive dependency by AI agent frameworks, MCP servers, LLM orchestration tools, and development utilities. This makes discovery significantly harder — security teams can't simply search for litellm in their requirements files.

The healthcare angle: if your developers are building AI tools for clinical decision support, patient communication, medical coding, or administrative automation, they're almost certainly using frameworks that depend on LiteLLM. The question isn't "do we use LiteLLM" but rather "which of our AI development tools pulled it in as a dependency?"

Organizations need dependency tree analysis tools that map transitive dependencies and alert on high-risk packages in the supply chain. Tools like pip-audit, safety, or Snyk can identify vulnerable packages in the full dependency graph, not just direct dependencies.
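As a quick first pass before reaching for those tools, the standard library's package metadata is enough to answer "which installed package pulled litellm in?" — a sketch under that assumption (the function name is ours):

```python
from importlib.metadata import distributions

def find_dependents(target: str) -> list[str]:
    """Return installed distributions that declare `target` as a requirement."""
    target = target.lower().replace("_", "-")
    dependents = []
    for dist in distributions():
        for req in dist.requires or []:
            # Requirement strings look like "litellm>=1.79.2; extra == 'proxy'"
            name = req.split(";")[0].split()[0]
            # Strip version specifiers and extras from the requirement name
            for sep in "><=!~[(":
                name = name.split(sep)[0]
            if name.strip().lower().replace("_", "-") == target:
                dep_name = dist.metadata["Name"]
                if dep_name:
                    dependents.append(dep_name)
                break
    return sorted(set(dependents))

if __name__ == "__main__":
    hits = find_dependents("litellm")
    print(hits or "no installed package requires litellm")
```

This only inspects the current environment; pip-audit and the other tools mentioned above cover the full resolved graph, including packages not yet installed locally.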

CI/CD Pipelines as High-Value Targets

The most severe exposures occurred in CI/CD pipelines, which often hold the most privileged credentials: AWS deployment keys with production access, organization-wide API tokens for GitHub, Docker registry authentication, cloud provider service accounts with broad permissions, and database credentials for automated testing.

Healthcare CI/CD environments processing AI model deployments may have access to PHI for model training, FHIR server endpoints for integration testing, EHR API credentials for automated validation, and production deployment keys for updates to clinical systems.

Any CI/CD job that ran pip install (with or without --upgrade) during the exposure window could have pulled the compromised litellm as a transitive dependency. Organizations should audit their CI/CD job logs for March 24, 2026, between 09:00 and 16:00 UTC to identify potentially affected builds.

The HIPAA Breach Question

If the malware executed in an environment with access to PHI or credentials that could reach PHI, this may constitute a HIPAA breach requiring notification. The key factors for determining breach status include whether API keys for LLM services processing PHI were present in the environment, whether cloud credentials with access to patient data storage were harvested, whether Kubernetes service accounts could reach services handling PHI, and whether exfiltrated .env files contained database connection strings to systems with patient information.

The challenge is determining what was actually exfiltrated versus what was accessible to the malware. The RSA encryption means only the attacker can decrypt the stolen data, so organizations cannot definitively know what was taken. Under HIPAA's risk assessment requirement, if the environment had access to PHI and the malware executed, the conservative assumption is that PHI access credentials were compromised.

The Vendor Risk Extension

Healthcare organizations that use third-party AI vendors for clinical decision support, medical coding, patient communication, or administrative automation should be asking those vendors whether they were affected by the LiteLLM compromise. Any vendor using Python-based AI development tools during the exposure window is a potential victim.

The supply chain extends beyond direct technical dependencies to vendor relationships. If a healthcare AI vendor was compromised, patient data processed by their systems may have been accessible through stolen credentials. Business Associate Agreements should specify incident notification requirements for supply chain compromises affecting vendor infrastructure.

Defense: What Actually Protected Organizations

Post-incident analysis revealed a clear pattern: organizations with specific defensive practices were completely protected, while those relying on default configurations were vulnerable.

Lockfiles With Hash Pinning

Organizations using poetry.lock, uv.lock, or pip-tools with --generate-hashes were completely protected. These lockfiles pin dependencies to exact versions with cryptographic hash verification, ensuring you install exactly the artifact you reviewed rather than whatever happens to be on PyPI at request time.

A requirement like litellm>=1.79.2 means "give me any version at or above 1.79.2" — which during the attack window resolved to the compromised release. A lockfile entry of litellm==1.82.6 with its SHA-256 hash means "only install this specific artifact" — immune to supply chain substitution.

Healthcare development teams should mandate hash-pinned lockfiles for all projects handling PHI or credentials that could reach PHI. The marginal effort of maintaining lockfiles is minimal compared to the incident response cost of a supply chain breach.
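In pip-tools terms, the protection looks like the fragment below (the digest is elided here — pip-compile fills in the real one from the artifact on PyPI):

```text
# requirements.txt, generated with: pip-compile --generate-hashes requirements.in
litellm==1.82.6 \
    --hash=sha256:...
```

Installing with pip install --require-hashes -r requirements.txt makes pip refuse any artifact whose digest differs from the recorded one, which is exactly the substitution the attackers relied on.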

Kubernetes Service Account Token Restrictions

The lateral movement attack was completely dependent on finding Kubernetes service account tokens automatically mounted into pods. Organizations that set automountServiceAccountToken: false on pod specs blocked this attack vector entirely.

Most healthcare AI workloads don't need Kubernetes service account tokens. Model serving, data processing pipelines, and batch inference jobs typically authenticate to external services using application credentials, not the pod's service account. The default behavior of mounting these tokens is a security anti-pattern that provides attackers with cluster-wide access for free.

Kubescape's C-0034 control flags pods with automatically mounted service account tokens. Healthcare organizations running AI workloads in Kubernetes should audit for this configuration and disable service account token mounting except where explicitly required.
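The fix itself is one line in the pod spec; a sketch with illustrative workload and image names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference-worker              # hypothetical AI workload
spec:
  automountServiceAccountToken: false # no API token is mounted into the pod
  containers:
    - name: app
      image: registry.example.com/inference:1.0  # placeholder image
```

The same field can be set on the ServiceAccount object to cover every pod using it; a pod-level setting takes precedence over the service-account-level one.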

Scoped Secrets in CI/CD

Some organizations discovered that their GitHub Actions workflows had API keys defined as workflow-level environment variables, making them available to every step including pip install. The defensive pattern is scoping secrets to specific steps that need them.

Instead of defining secrets at workflow scope where they're visible to all steps, define them at step scope where only the deployment action can access them. This limits the blast radius when a supply chain attack compromises a build dependency.
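A sketch of the difference in GitHub Actions terms (job structure and script paths are illustrative):

```yaml
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    # No workflow- or job-level `env:` block -- secrets are not ambient.
    steps:
      - uses: actions/checkout@v4
      - run: pip install -r requirements.txt  # runs with no secrets in scope
      - run: pytest                           # same: nothing here to steal
      - name: Deploy
        run: ./scripts/deploy.sh
        env:  # step scope: only this step can read these values
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
```

Had the compromised litellm been pulled in during the pip install step above, the process would have found no cloud credentials in its environment to harvest.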

The Response and Recovery

LiteLLM maintainers and PyPI administrators responded quickly once the compromise was discovered. PyPI quarantined the entire package within hours, removing the malicious versions and blocking new downloads. The LiteLLM team paused all new releases, conducted a full security audit of versions 1.78.0 through 1.82.6, verified that the main branch contained no malicious code, and rebuilt their CI/CD pipeline with isolated environments and stronger security gates.

Version 1.83.0 was released on March 25 using the new CI/CD v2 pipeline with enhanced security controls. The compromised versions (1.82.7 and 1.82.8) have been permanently removed from PyPI, and pip install litellm now resolves to version 1.82.6 (the last known-clean release before the attack) by default.

For affected organizations, the recovery checklist includes:

- Verifying whether versions 1.82.7 or 1.82.8 were installed, by checking pip show litellm, inspecting package manager caches, and auditing CI/CD job logs for March 24.
- Checking for persistence indicators, including ~/.config/sysmon/sysmon.py, ~/.config/systemd/user/sysmon.service, and Kubernetes pods named node-setup-*.
- Rotating all credentials accessible from affected environments, including API keys, cloud credentials, database passwords, and SSH keys.
- Auditing Kubernetes cluster secrets for unauthorized access if lateral movement occurred.
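The first two checklist items can be partially scripted; this sketch checks the installed version and the published filesystem indicators (paths come from the incident details above — pip caches and the node-setup-* pods still need separate checks):

```python
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path

COMPROMISED = {"1.82.7", "1.82.8"}
INDICATORS = [
    Path.home() / ".config/sysmon/sysmon.py",
    Path.home() / ".config/systemd/user/sysmon.service",
]

try:
    installed = version("litellm")
except PackageNotFoundError:
    installed = None  # litellm not present in this environment

bad_version = installed in COMPROMISED
persistence = [p for p in INDICATORS if p.exists()]

print(f"litellm installed:       {installed}")
print(f"compromised version:     {bad_version}")
print(f"persistence indicators:  {persistence or 'none found'}")
```

Run it on every developer machine and CI runner image; a clean result here does not rule out exposure in containers or virtual environments that were since deleted, so the CI log audit remains essential.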

The credential rotation requirement is particularly painful but critical. Incomplete credential rotation after a supply chain breach enables cascading attacks, as TeamPCP demonstrated by using stolen Trivy credentials to compromise LiteLLM three weeks later.

Looking Forward: This Campaign Isn't Over

Endor Labs' assessment concludes that this campaign is "almost certainly not over." TeamPCP has demonstrated a consistent pattern: each compromised environment yields credentials that unlock the next target. Over five days, they crossed five supply chain ecosystems: GitHub Actions, Docker Hub, npm, OpenVSX, and now PyPI.

The attacker group has shown capability to maintain persistent access over weeks, iterate on attack techniques based on what works, move laterally across completely different technology stacks, and use stolen credentials to unlock new targets in an expanding blast radius.

Healthcare organizations should assume that any credentials accessible from systems compromised in February-March 2026 during the Trivy or LiteLLM breaches may be used in future attacks. The window for rotating credentials is now, before those stolen credentials enable the next compromise in the chain.

The Bigger Picture: AI Supply Chain as Attack Surface

The LiteLLM compromise highlights a fundamental tension in healthcare AI development: the same packages that make AI development faster and more accessible also create concentrated points of failure in the supply chain.

LiteLLM has 95 million monthly downloads because it solves a real problem — unified access to multiple LLM providers without writing provider-specific integration code. But that central position in the ecosystem means a compromise affects nearly every AI development team simultaneously. The convenience of pip install litellm becomes a security liability when the package infrastructure is compromised.

For healthcare organizations building AI-powered clinical tools, administrative automation, or patient-facing services, this creates a risk management challenge. Teams need the velocity that open-source frameworks provide, but each dependency increases the attack surface. The solution isn't to avoid dependencies — that's impractical — but to implement defensive practices that limit blast radius when (not if) a compromise occurs.

Hash-pinned lockfiles, scoped secrets in CI/CD, disabled service account token mounting in Kubernetes, and aggressive credential rotation aren't new security practices. What's changed is the consequence of skipping them. In an era where supply chain attacks can compromise a package with 95 million monthly downloads within hours, basic defensive hygiene isn't optional anymore — it's the minimum bar for operating in the healthcare AI supply chain.


This is entry #32 in the AI Security series. For related coverage, see UK Government Reality-Checks Claude Mythos: Why Healthcare's Cyber Basics Just Became Non-Negotiable.


Key Links