Your AI Pair Programmer Is Your Newest Supply Chain Risk—Slopsquatting, Hallucinated Packages, and the LiteLLM Wake-Up Call

I had a moment last week that genuinely unsettled me.

I was building a new accessibility audit component for a side project. Asked my AI coding assistant to help wire up some SVG parsing logic. It confidently suggested I install a package called svg-accessibility-parser. Looked legit—reasonable name, plausible API. I was halfway through npm install when something felt off. I checked npm. The package does not exist.

My AI hallucinated a dependency name. And I almost installed it without thinking.

This Has a Name Now: Slopsquatting

Turns out researchers have been studying this exact pattern. It is called slopsquatting—a cousin of typosquatting, but instead of betting on human typos, attackers bet on machine hallucinations.

Here is how it works:

  1. LLMs predict statistically likely next tokens, so when you ask for help, they sometimes suggest package names that sound right but do not actually exist
  2. A USENIX study found that roughly 20% of AI-generated code samples reference non-existent packages, and 43% of hallucinated names are reproduced consistently—meaning they are predictable
  3. Attackers study which fake names appear frequently, register them on npm/PyPI, and wait

This is not theoretical. A security researcher at Lasso Security documented that AI models repeatedly hallucinated a Python package called huggingface-cli. He registered it as an empty package on PyPI. 30,000+ authentic downloads in three months. Another hallucinated package—react-codeshift—spread across 237 GitHub repositories through AI-generated agent skills without a human ever reviewing the install command.

The LiteLLM Attack Made It Real

If slopsquatting is the slow-burn threat, the LiteLLM supply chain attack from March 2026 was the five-alarm fire.

Quick recap for anyone who missed it:

  • LiteLLM is an AI proxy library downloaded roughly 3.4 million times per day
  • Threat actors (TeamPCP) compromised LiteLLM’s CI/CD pipeline by poisoning a Trivy GitHub Action used in their security scanning workflow
  • They exfiltrated the PyPI publish token from the GitHub Actions runner
  • Published two backdoored versions (1.82.7 and 1.82.8) that were live for about 40 minutes
  • The malicious payload (a .pth file that executes on every Python process startup; see the sketch after this list) could exfiltrate SSL/SSH keys, cloud credentials, Kubernetes configs, crypto wallets, API keys—basically everything
  • Over 40,000 downloads of the compromised version before PyPI quarantined it
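
For anyone who has not dug into that payload mechanism: CPython's site module reads every .pth file in site-packages at interpreter startup, and any line that begins with "import" is executed as Python code. Here is a minimal, defensive sketch (mine, not from the incident writeups) for auditing an environment for such startup hooks. Most hits will be legitimate things like editable installs, but each one is code that runs in every Python process.

```python
# Sketch: list .pth lines in site-packages that execute code at interpreter
# startup, the same hook the backdoored LiteLLM versions abused. Illustrative only.
import site
from pathlib import Path

site_dirs = site.getsitepackages() + [site.getusersitepackages()]
for sp in site_dirs:
    for pth in Path(sp).glob("*.pth"):
        for lineno, line in enumerate(pth.read_text().splitlines(), start=1):
            # The site module executes any .pth line that starts with "import".
            if line.startswith(("import ", "import\t")):
                print(f"{pth}:{lineno}: {line.strip()}")
```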

The irony? They compromised the security scanner to compromise the package. And LiteLLM is present in 36% of cloud environments.

AI Agents Make This Worse

Here is what really worries me. Research analyzing 117,000+ dependency changes across thousands of GitHub repos found that AI agents choose versions with known CVEs 50% more often than humans. And the vulnerable versions they pick tend to require larger, more disruptive upgrades to fix.

Now combine that with autonomous coding agents that install dependencies, run builds, and open PRs without human involvement. The result is software that:

  • Hallucinates package names 20% of the time
  • Picks vulnerable versions when the package does exist
  • Operates with enough permissions to execute arbitrary code

And we are handing it commit access.

What I Changed After My Scare

I am not going to pretend I have this figured out. But after my svg-accessibility-parser moment, I made a few changes:

  1. I verify every AI-suggested dependency manually. Yes, every single one. I open the npm/PyPI page, check the repo, check the download count, check the last publish date (a scripted version of this check is sketched after this list)
  2. I added a lockfile diff review step to our team’s PR process—any new dependency addition gets flagged
  3. I pinned all our GitHub Actions to specific commit SHAs instead of tags (the LiteLLM attack exploited unpinned Trivy)
  4. I run npm audit / pip-audit as a blocking CI step, not just an informational one
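
The manual check in step 1 is also easy to script. Here is a rough sketch of what I mean, using the public npm registry and download-stats endpoints; the 1,000-download threshold is arbitrary, and the output is a prompt to look closer, not a verdict.

```python
# Rough sketch: does an npm package exist, when was it last published,
# and does anyone actually download it? Thresholds are arbitrary.
import sys
import json
import urllib.request
from urllib.error import HTTPError

def check_npm_package(name: str) -> None:
    try:
        with urllib.request.urlopen(f"https://registry.npmjs.org/{name}") as resp:
            meta = json.load(resp)
    except HTTPError as err:
        if err.code == 404:
            print(f"{name}: DOES NOT EXIST on npm - possible hallucination")
            return
        raise

    latest = meta["dist-tags"]["latest"]
    last_publish = meta["time"][latest]

    with urllib.request.urlopen(
        f"https://api.npmjs.org/downloads/point/last-month/{name}"
    ) as resp:
        downloads = json.load(resp).get("downloads", 0)

    print(f"{name}: latest {latest}, published {last_publish}, "
          f"{downloads} downloads last month")
    if downloads < 1000:
        print("  -> low download count; verify the repo and maintainers by hand")

if __name__ == "__main__":
    for pkg in sys.argv[1:]:
        check_npm_package(pkg)
```

Something like python check_dep.py svg-accessibility-parser would have caught my hallucinated package before I got anywhere near npm install.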

But honestly? I still feel like we are applying band-aids to a structural problem. The tools we use to write code are now introducing attack surface that our existing security processes were not designed for.

Questions I Am Sitting With

  • How are your teams handling AI-suggested dependencies? Is anyone doing systematic verification?
  • Should package registries (npm, PyPI) build hallucination-aware defenses? Like flagging recently-registered packages that match known hallucination patterns?
  • Are we going to need a fundamentally different dependency governance model for AI-assisted development?
  • For those using autonomous coding agents (Cursor, Devin, etc.)—what guardrails do you have on dependency installation?

I keep thinking about how my design systems work depends on dozens of npm packages. If even one of them gets compromised through a supply chain attack or a slopsquatting registration… the blast radius is not just my side project. It is every product team consuming our component library.

Would love to hear how others are thinking about this. Especially from folks managing larger engineering orgs—is this on your risk radar yet?

Maya, thank you for writing this up. This is exactly the kind of concrete, experience-driven post that makes abstract security risks tangible.

I want to address your question about whether this is on leadership’s risk radar. At my organization, it absolutely is now—but it was not six months ago, and I suspect most engineering orgs are still in that earlier phase.

The Governance Gap Is Real

The fundamental issue is that our dependency governance models were designed for a world where humans chose every package. The assumption was: a developer evaluates a library, checks its maintenance status, maybe reviews the source, and makes a deliberate decision. Code review catches anything unusual.

AI-assisted development breaks every one of those assumptions. The “evaluation” is delegated to a model that has no concept of trustworthiness. The “review” happens on generated code where reviewers are already primed to trust the output because the tests pass.

What We Implemented

After the LiteLLM incident, we did a rapid security review and implemented three organizational changes:

1. Approved dependency allowlists. We maintain a curated list of pre-approved packages for each language ecosystem. Any dependency not on the list requires a security review before merge. This is not new—many enterprises do this—but the AI era makes it non-negotiable rather than nice-to-have.

2. CI pipeline verification for AI-generated PRs. We tag PRs that originate from AI coding agents and route them through an additional verification step that specifically checks for new dependencies, version changes, and any modifications to CI/CD configuration files. The LiteLLM attack vector (compromising CI to compromise packages) was a wake-up call about how much trust we place in our build pipeline.

3. Quarterly dependency audits with AI-specific criteria. Beyond the standard vulnerability scanning, we now audit for: packages added in the last 90 days with low download counts, packages whose names closely resemble popular libraries, and any dependency pulled in by AI tooling that was not in our previous baseline.
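
For the lookalike criterion in item 3, the core check is small enough to sketch with just the Python standard library. The popular-package list and the similarity threshold below are placeholders rather than our production values; we seed the real list from registry download data.

```python
# Sketch: flag new dependency names that closely resemble popular packages.
# The POPULAR list and the 0.84 threshold are placeholders.
import difflib

POPULAR = ["requests", "urllib3", "boto3", "numpy", "pandas", "cryptography"]

def suspicious_lookalikes(dependencies, threshold=0.84):
    flagged = []
    for dep in dependencies:
        for popular in POPULAR:
            score = difflib.SequenceMatcher(None, dep.lower(), popular).ratio()
            # Flag near-misses, but not exact matches to the real package.
            if dep.lower() != popular and score >= threshold:
                flagged.append((dep, popular, round(score, 2)))
    return flagged

if __name__ == "__main__":
    new_deps = ["request", "cryptograpy", "httpx"]  # e.g. parsed from a lockfile diff
    for dep, popular, score in suspicious_lookalikes(new_deps):
        print(f"{dep} looks like {popular} (similarity {score}) - review before merge")
```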

The Harder Conversation

The uncomfortable truth is that these measures slow things down. And in a competitive market where engineering velocity is a strategic advantage, that creates real tension. I have had direct conversations with my CEO about why our deployment frequency might decrease slightly as we implement these controls.

My position: the cost of a supply chain breach—regulatory exposure, customer trust erosion, remediation effort—dwarfs the cost of slightly slower shipping. LiteLLM is present in 36% of cloud environments. Imagine if that payload had been a ransomware dropper instead of a credential stealer.

But I also recognize that “slow down and verify everything” is not a sustainable answer at scale. We need better tooling, better registry defenses, and better AI models that understand the concept of package provenance. Until then, organizational discipline is what we have.

This thread is hitting close to home. We are in financial services, so supply chain security is not just an engineering concern—it is a compliance and regulatory one.

When the LiteLLM news broke, our CISO walked into my office and asked me a question I did not have a good answer for: “How many of our AI-assisted PRs introduced new dependencies in the last quarter, and did any of them bypass our standard review?” We did not know. We had no way to distinguish AI-assisted dependency additions from human ones.

The Financial Services Angle

In our world, every third-party dependency is a potential audit finding. We are subject to OCC guidance on third-party risk management, and our examiners are starting to ask questions about AI-assisted development specifically. The regulatory framework has not caught up yet, but the questions are getting sharper.

What keeps me up at night is not a single compromised package. It is the cascade effect Maya described. A poisoned dependency in a shared internal library propagates across every service that consumes it. In a financial services context, that could mean a credential stealer running inside systems that process payment transactions. The blast radius is not a PR revert—it is a potential data breach notification to regulators and customers.

What We Do Differently (and What Still Scares Me)

We already had strict dependency management before AI tools entered the picture:

  • All dependencies must come from an internal artifact repository (we mirror npm/PyPI and scan before allowing packages through)
  • New dependency requests go through a security architecture review
  • We use Dependabot, with auto-merge disabled

But @cto_michelle’s point about AI-generated PRs needing different review criteria is spot on. We added a rule: any PR that adds a new dependency must include a comment explaining why that specific package was chosen and what alternatives were considered. If the developer cannot articulate that, it is a signal that the choice was AI-delegated without validation.

The part that still scares me: our junior engineers are the heaviest AI tool users, and they have the least experience evaluating whether a package is trustworthy. The 50% higher CVE rate for AI-selected versions Maya cited does not surprise me at all. The model optimizes for “works” not for “safe.”

One Practical Suggestion

For teams not ready for full allowlisting (which is a significant operational investment), I would suggest starting with a “new dependency” Slack notification. We built a simple GitHub Action that posts to a security channel whenever a lockfile changes. Low effort, high visibility. It does not block anything, but it creates social accountability—people know their dependency choices are being observed.
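
If you want to prototype the idea without building a full Action, the underlying logic is a short script run by CI whenever the lockfile changes. A minimal sketch for npm lockfiles follows; the file path and the SLACK_WEBHOOK_URL secret are placeholders, and our internal version differs in the details.

```python
# Minimal sketch: post newly added npm dependencies to a Slack channel.
# Intended to run from CI on pushes that touch package-lock.json.
import json
import os
import subprocess
import urllib.request

def deps_from_lockfile(text: str) -> set[str]:
    lock = json.loads(text)
    # npm v7+ lockfiles list every installed package under "packages".
    return {name.removeprefix("node_modules/")
            for name in lock.get("packages", {}) if name}

old = subprocess.run(
    ["git", "show", "HEAD~1:package-lock.json"],
    capture_output=True, text=True,
).stdout or "{}"
new = open("package-lock.json").read()

added = deps_from_lockfile(new) - deps_from_lockfile(old)
if added:
    payload = {"text": "New dependencies added: " + ", ".join(sorted(added))}
    req = urllib.request.Request(
        os.environ["SLACK_WEBHOOK_URL"],  # Slack incoming-webhook URL from a CI secret
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```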

Would be curious to hear from @product_david on this too. From a product perspective, how do you balance the velocity gains from AI coding tools against the security overhead they create? Because in my experience, that tension is where most organizations stall out.

@eng_director_luis asking the right question. Let me give you the honest product perspective, because I think there is a disconnect happening in many organizations right now.

The Velocity Pressure Is Not Going Away

I will be direct: when I am evaluating AI coding tools for our engineering team, supply chain security is not the first thing I think about. My board asks about time-to-market. My customers ask about feature delivery. My competitors are shipping faster because they adopted AI tooling earlier.

That is not an excuse—it is the reality of how product decisions get made. And if security teams want engineering orgs to take this seriously, they need to frame it in terms that resonate at the executive level.

Here is what resonates: the LiteLLM attack drew more than 40,000 downloads of the backdoored versions in roughly 40 minutes. If one of those downloads landed in our production environment, that is not a security incident—it is a business continuity event. Customer data exposed. Trust destroyed. Revenue impact measured in months, not days.

Where I Have Changed My Thinking

I used to push back on anything that slowed our deployment pipeline. After reading about the LiteLLM compromise chain (attacker poisons security tool → steals CI credentials → publishes malicious package → executes on install), I realized something: the attack surface has moved upstream of where most product teams think about risk.

We think about production security—WAFs, encryption, access controls. But the new attack surface is the development environment itself. The CI pipeline. The package manager. The AI assistant suggesting code before a human even evaluates it.

So I have started including “development supply chain risk” as a line item in our product risk assessments. It sits alongside market risk and competitive risk. That framing has been effective at getting budget allocated for the kind of tooling @cto_michelle described.

The Uncomfortable Tradeoff

But I want to push back slightly on the “slow down” narrative. The answer cannot be “verify everything manually” because that scales linearly with the number of AI-assisted contributions. And the whole point of AI tooling is to scale output beyond what manual processes can handle.

What we actually need:

  1. Automated verification that runs at AI speed, not human speed. If the AI suggests a package, an automated system should immediately check: does it exist? What is its provenance? Does it have known vulnerabilities? This should happen in the IDE, before the code even reaches a PR (a minimal sketch of such a check follows this list)
  2. Product decisions about acceptable risk. Not every project needs the same level of dependency governance. An internal tool has different risk tolerance than a payment processing service. We should be explicit about those tiers rather than applying one-size-fits-all controls.
  3. Registry-level defenses as Maya suggested. npm and PyPI should be building hallucination detection. If 43% of hallucinated names are predictable and consistent, the registries have a data-driven opportunity to flag or block suspicious registrations proactively.
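
To make item 1 concrete: the check itself is not exotic. Below is a minimal sketch of a "verify at suggestion time" step for Python packages, using the public PyPI JSON API and the OSV.dev vulnerability query API. The package names in the example are illustrative, and wiring this into an IDE or pre-commit hook is left out.

```python
# Sketch: does a suggested package exist, and does the pinned version have
# known vulnerabilities? Uses the public PyPI JSON API and the OSV.dev query API.
import json
import urllib.request
from urllib.error import HTTPError

def exists_on_pypi(name: str) -> bool:
    try:
        with urllib.request.urlopen(f"https://pypi.org/pypi/{name}/json"):
            return True
    except HTTPError as err:
        if err.code == 404:
            return False
        raise

def known_vulns(name: str, version: str) -> list[str]:
    query = {"package": {"name": name, "ecosystem": "PyPI"}, "version": version}
    req = urllib.request.Request(
        "https://api.osv.dev/v1/query",
        data=json.dumps(query).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return [v["id"] for v in json.load(resp).get("vulns", [])]

if __name__ == "__main__":
    # Illustrative inputs: a made-up name and an old requests release.
    for name, version in [("definitely-not-a-real-package-xyz", "1.0"),
                          ("requests", "2.19.0")]:
        if not exists_on_pypi(name):
            print(f"{name}: not on PyPI - likely hallucinated, do not install")
        elif vulns := known_vulns(name, version):
            print(f"{name}=={version}: known vulnerabilities {vulns}")
        else:
            print(f"{name}=={version}: exists, no known vulnerabilities in OSV")
```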

I do not think we solve this with process alone. We need the ecosystem to evolve. But in the meantime, the lockfile-change notification that Luis described is a great low-friction starting point.

This is an incredible thread. I want to bring in the people and process angle because I think we are dancing around something important: this is fundamentally a training and culture problem, not just a tooling problem.

The Junior Engineer Amplification Effect

@eng_director_luis touched on this and I want to expand on it. At my org, I am scaling an engineering team from 25 to 80+. Many of our newer hires have never worked in a world without AI coding assistants. For them, npm install whatever-the-AI-said is the default workflow. They do not have the muscle memory of manually researching packages because they never had to develop it.

This is not a criticism of junior engineers. It is a failure of how we onboard and train. We teach people how to use the AI tools but we do not teach them how to be skeptical of the AI tools. We celebrate velocity without building the instincts for verification.

I ran an informal experiment last month. I asked five of our newer engineers to walk me through how they evaluate a dependency suggestion from their AI assistant. Three of them said they check if the tests pass after installation. One said they look at the import and see if the IDE throws errors. Only one mentioned checking the package’s npm page or GitHub repo.

Four out of five were evaluating trustworthiness based on whether the code ran, not whether the package was legitimate. That is exactly the gap that slopsquatting exploits.

What I Am Building Into Our Engineering Culture

Instead of just adding process gates (which people will route around if they do not understand why), we are investing in education:

1. “Dependency Day” during onboarding. Every new engineer spends a half-day learning about supply chain attacks, reviewing real incidents (LiteLLM, the event-stream compromise, the ua-parser-js hijack), and practicing dependency evaluation. They literally walk through the exercise of verifying an AI-suggested package.

2. Blameless dependency postmortems. When we catch an unnecessary or risky dependency in review, we do not just block it—we use it as a teaching moment. The engineer who suggested it walks the team through what happened and what they would do differently. No blame, just learning.

3. AI skepticism as a promotion criterion. This is new and maybe controversial: we added “demonstrates appropriate skepticism of AI-generated outputs” to our engineering competency matrix. If you want to move from mid-level to senior, you need to show that you can critically evaluate AI suggestions, not just consume them.

The Systemic Issue

@product_david is right that automated verification at AI speed is the long-term answer. But I want to name something: the reason we are in this position is that AI coding tools were adopted for their productivity benefits without a corresponding investment in the security implications. The tools shipped without guardrails and organizations adopted them without updating their threat models.

This is not unique to AI—it happens with every new technology. But the speed and scale of AI adoption means the gap between capability and governance is wider than anything I have seen in 16 years in this industry.

Maya’s original question—“is this on your risk radar?”—is the right question. And the honest answer for most organizations is probably: it is now, but it was not three months ago, and we are still figuring out what to do about it.