The Leopard's Head
On May 19, one stolen login pushed 637 poisoned software versions into the world's computers in twenty-two minutes. A London guild solved this exact problem in the year 1300, and the answer was not 'trust the silversmith.'
On May 19, somebody logged into a single npm account and, over the next twenty-two minutes, published 637 malicious versions across 317 software packages. I wish the attack had been at least a little bit interesting, but it wasn't.
They logged in with valid credentials, the registry said welcome back, and an automated script did the rest. The poisoned packages included echarts-for-react, a charting wrapper that pulls well over a million downloads a week, along with a pile of the @antv data-visualization libraries that sit quietly underneath dashboards at companies that have never once heard the name AntV. The payload was a 498-kilobyte obfuscated script that went looking for everything worth stealing: AWS keys, Kubernetes service-account tokens, GitHub tokens, npm tokens, SSH keys, and the local vaults of 1Password and Bitwarden. If your project carried "echarts-for-react": "^3.0.6" in its package.json, that innocent little caret resolved you to the malicious 3.2.7 on the next clean install. You did not have to do anything wrong; you only had to have done everything normal.
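To see why the caret matters, here is a minimal sketch of how a caret range picks a version. It is a simplification, not npm's actual resolver: it assumes plain x.y.z versions with a major version of 1 or higher, and it ignores prereleases and lockfiles. The function names and the published list are made up for illustration; only 3.0.6 and 3.2.7 come from the story above.

```typescript
// Simplified caret resolution for plain x.y.z versions with major >= 1.
// Real npm ranges also cover 0.x versions, prereleases, and build metadata.
type Version = [number, number, number];

function parse(v: string): Version {
  const [major, minor, patch] = v.split(".").map(Number);
  return [major, minor, patch];
}

// Negative if a < b, zero if equal, positive if a > b.
function compare(a: Version, b: Version): number {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// "^3.0.6" accepts anything >= 3.0.6 with the same major version.
function satisfiesCaret(version: string, base: string): boolean {
  const v = parse(version);
  const b = parse(base);
  return v[0] === b[0] && compare(v, b) >= 0;
}

// Return the highest published version the range allows.
function resolve(range: string, published: string[]): string | undefined {
  const matching = range.startsWith("^")
    ? published.filter((v) => satisfiesCaret(v, range.slice(1)))
    : published.filter((v) => v === range); // no caret: exact match only
  matching.sort((x, y) => compare(parse(x), parse(y)));
  return matching[matching.length - 1];
}

// Hypothetical list of published versions, for illustration only.
const published = ["3.0.6", "3.1.0", "3.2.7", "4.0.0"];

const caret = resolve("^3.0.6", published); // "3.2.7": newest version under 4.0.0
const pinned = resolve("3.0.6", published); // "3.0.6": exact pin stays put
console.log({ caret, pinned });
```

The caret is a standing instruction to accept whatever the account publishes next under the same major version, which is exactly the instruction the attacker needed. An exact pin or a committed lockfile narrows that window for the packages it covers, though neither tells you whether the pinned version itself was honest.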
This one is called Mini Shai-Hulud, and it is the small, fast cousin of the Shai-Hulud worm that tore through npm last September, the self-replicating one that used each maintainer's stolen token to poison the next maintainer's packages and backdoored hundreds of them before anyone could react. The mechanism never changes; it's always one account with all the trust, and the account belongs to a person.
Everybody in software has seen the xkcd. All of modern digital infrastructure drawn as a teetering tower of blocks, the whole thing balanced on one tiny load-bearing piece labeled "a project some random person in Nebraska has been thanklessly maintaining since 2003." We laughed because it was true, but it stopped being funny the moment somebody noticed that the person in Nebraska also has an npm token, and that the token is the actual load-bearing piece: phish the human holding it and you win. We built a trillion-dollar industry on a trust model that reduces, when you say it plainly, to "the package is fine because Dave uploaded it and Dave seems nice."
There are two popular responses to this. The first says the answer is memory-safe languages, rewrite the world in Rust, and a lot of that is genuinely good engineering, but it is repairing the wrong floor of the building. No amount of memory safety protects you from a process that had the password. The second response is more sophisticated and much closer to right: provenance, signing, software bills of materials, attestation, the whole supply-chain-security apparatus. Verify what you install instead of trusting where it came from. On paper this seems great!
The problem comes from who holds the stamp. Most of these schemes end up living as a feature of the same registry that earns its numbers by making publishing as frictionless as possible, which means the body certifying the package and the body that profits from a flood of packages are the same body. That is not verification; that is self-attestation with extra steps.
The year 1300
A silver spoon has exactly the same trust problem as an npm package. You cannot tell by looking whether it is sterling or whether the maker quietly cut the silver with something cheaper, and by the time you find out, the maker is three towns away and so is your money. The medieval answer was not "trust the silversmith." It was also, and this is the part we keep skipping, not "make all the silver in one royal workshop." In 1300, a statute of Edward I required that every article of silver meet the sterling standard, 92.5 percent, and be tested by independent guardians of the craft who struck it with a leopard's head if it passed. In 1363 they added the maker's mark, so the object carried the identity of who made it, permanently, stamped into the metal. By 1478 the testing was consolidated at Goldsmiths' Hall in London, which is where the word hallmark comes from. The mark struck at the hall.
In a lot of ways, this is the same problem we have today, and the same solution. The object carries its own provenance, struck into it, so the proof travels with the thing and not with some database you have to phone at install time. The assayer is independent of both the maker and the seller, and is paid to be right rather than to move volume. And the standard is a published number, 92.5, not a vibe about whether the silversmith seems trustworthy. Seven hundred years ago, a guild of fiercely competing London metalworkers agreed to submit to an outside examiner with a stamp, because every honest maker understood that a market where buyers cannot verify quality is a market that eventually charges everyone the fraud discount. Daniel Stenberg, who has maintained curl for more than twenty-five years and has watched more of this go wrong than almost anyone alive, said it this month: the industry has to move from trust to verification. He is describing the leopard's head, and we have just not struck it yet.
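For the software-minded, here is a rough sketch of what a struck mark could look like, using Node's built-in crypto module. Everything in it is an assumption for illustration: the assayer key pair, the strike and bearsHallmark functions, and the stand-in package bytes are hypothetical, and no existing registry or signing scheme is being described. The point is only the shape: the mark is computed over the exact bytes that were examined, and checking it needs nothing but those bytes, the mark, and the assayer's public key.

```typescript
import { createHash, generateKeyPairSync, sign, verify } from "node:crypto";

// Hypothetical assayer key pair. In a real scheme the private key would live
// with a body independent of both the maintainer and the registry, and only
// the public key would be distributed.
const assayer = generateKeyPairSync("ed25519");

// Striking the mark: sign the SHA-256 digest of the exact bytes examined.
function strike(tarball: Buffer): Buffer {
  const digest = createHash("sha256").update(tarball).digest();
  return sign(null, digest, assayer.privateKey); // Ed25519 takes no digest name
}

// Checking the mark offline: no login, no registry call, no maintainer token.
function bearsHallmark(tarball: Buffer, mark: Buffer): boolean {
  const digest = createHash("sha256").update(tarball).digest();
  return verify(null, digest, assayer.publicKey, mark);
}

// Stand-in bytes for a package tarball, for illustration only.
const examined = Buffer.from("contents of the package the assayer tested");
const swapped = Buffer.from("contents published later with a stolen login");

const mark = strike(examined);
const genuineOk = bearsHallmark(examined, mark);
const swappedOk = bearsHallmark(swapped, mark);
console.log({ genuineOk, swappedOk });
```

A stolen publishing credential can upload new bytes, but in this sketch it cannot make them carry the mark, because the signing key never belonged to the account that was phished.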
In our case, the code lives in a million repositories on a thousand machines and is totally decentralized. But the trust lives in one account protected by one password belonging to one tired volunteer, which is about as centralized as a thing can get. We spent a decade congratulating ourselves on the first fact and ignoring the second, and Mini Shai-Hulud is what the second fact looks like when somebody finally reads it back to us at machine speed.
I wrote last week about cloud egress and the company store, about the man behind the counter who shrugs when the price of flour goes up because nobody in the conversation is allowed to be responsible for it. The package registry is the same counter. When the poisoned version lands in your build, you call your vendor, who points at the dependency, which points at the maintainer, who points at the phishing email, and everyone is technically blameless while your AWS keys are already in somebody else's terminal. Until the software commons has a leopard's head, struck by somebody who does not get paid by the package, "supply chain security" is going to keep being a man behind a counter, shrugging.
The Goldsmiths' Company figured this out before England had a central bank. Maybe we send them a résumé.