GitHub Fights Forks — Millions of Them — Huge Software Supply Chain Security FAIL
2024-3-1 00:37:11 Author: securityboulevard.com

A fork, wrapped in delicious pasta

Scrotebots clone thousands of projects, injecting malware millions of times.

GitHub is under attack from malicious forks—clones of established software repositories with injected Trojans. Devs have no clue which repo to trust. And this has been going on for nine months.

Researchers are calling it “repo confusion.” In today’s SB Blogwatch, we stick it to the repo man.

Your humble blogwatcher curated these bloggy bits for your entertainment. Not to mention: Big egg.

Forking Hell

What’s the craic? Dan Goodin reports—“GitHub besieged by millions of malicious repositories”:

Sheer number
GitHub is struggling to contain an ongoing attack that’s flooding the site with … obfuscated malware that steals passwords and cryptocurrency from developer devices. … An unknown party has automated a process that forks legitimate repositories. … The result is millions of forks with names identical to the original one that add a payload that’s wrapped under seven layers of obfuscation.

Supply-chain attacks … have existed since at least 2016, when a college student uploaded custom scripts to RubyGems, PyPi, and NPM. The scripts bore names similar to widely used legitimate packages, but otherwise had no connection to them. … The imposter code was executed more than 45,000 times on more than 17,000 separate domains, and more than half the time his code was given all-powerful administrative rights. … In 2021, a researcher used a similar technique to successfully execute counterfeit code on networks belonging to Apple, Microsoft, Tesla, and dozens of other companies.

The campaign began last May and [is] ongoing. … Given the sheer number of forks and the sustained duration of the campaign, developers would do well to be aware of the risk and ensure downloads come from legitimate sources.
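One hedged way to act on that advice is to record the SHA-256 digest of an artifact you have already vetted and refuse anything that does not match it. The sketch below is illustrative only: `sha256_hex` and `matches_pinned_digest` are hypothetical helper names, not part of any tool mentioned in the reporting, and a matching digest only shows the bytes are the ones you reviewed, not that the upstream repository is trustworthy.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_digest(data: bytes, pinned_hex: str) -> bool:
    """Check downloaded bytes against a digest recorded when the artifact was vetted.

    The pinned value is normalized (whitespace stripped, lowercased) so that
    digests copied from a release page or lock file still compare correctly.
    """
    return hmac.compare_digest(sha256_hex(data), pinned_hex.strip().lower())
```

In practice the pinned digest would live in a lock file or build script that is itself under review, so that a swapped-in fork cannot silently change both the code and the expected hash.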

Horse’s mouth? Matan Giladi and Gil David—“Over 100,000 Infected Repos Found on GitHub”:

Massive and lucrative
Similar to dependency confusion attacks, malicious actors get their target to download their malicious version instead of the real one. But dependency confusion attacks take advantage of how package managers work, while repo confusion attacks simply rely on humans to mistakenly pick the malicious version over the real one, sometimes employing social engineering techniques.

Most of the forked repos are quickly removed by GitHub, which identifies the automation. However, the automation detection seems to miss many repos, and [some] survive. Because the whole attack chain seems to be mostly automated on a large scale, the 1% that survive still amount to thousands of malicious repos. … Because of the operation’s large scope, this campaign has a sort of 2nd-order … network effect when … naive users fork the malicious repos without realizing they are spreading malware.

This campaign, along with dependency confusion campaigns plaguing package registries and generally malicious code being spread through source control managers, demonstrates how fragile software supply chain security is. … The supply chain remains a massive and lucrative attack surface for malicious actors.

What a mess. coofercat emits a humble opinion:

Forks are (IMHO) one of the biggest weaknesses of Git[hub|lab|other]. The ability to fork is obviously cool, but there’s no way to know if the fork is any good.

You may well find a genuinely good, but perhaps abandoned or otherwise out of date project, perhaps via Google or other means. In some sense then the ‘reputation’ of the repo has been established, and so you might have some level of trust for it. However, the forks are a complete unknown – it’s not immediately obvious if all the changes in the fork have been merged into the main project, it’s not very easy to see what’s changed in the fork, or what the purpose of it is, or anything else.

If the source project in question is itself a fork of another, then this does present a problem (as it’ll be hard to know which fork is the ‘good’ one). In most cases though, top of the fork tree is probably the ‘good’ one and everything below not so much. If the forks are actually checkout and re-commit rather than real ‘fork’, then you’ve got problems identifying anything about anything. Then it’s over to GitHub to fix it.
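The "top of the fork tree" heuristic can be checked mechanically for registered forks. The GitHub REST API's repository object carries a `fork` flag and, for forks, `parent` and `source` objects, where `source` is the root of the fork network. The sketch below assumes that response shape; `fetch_repo_metadata` and `fork_root` are hypothetical helper names. As the comment above notes, a checkout-and-recommit copy is not a registered fork, so it reports `fork: false` and this check will not flag it.

```python
import json
import urllib.request
from typing import Optional


def fetch_repo_metadata(full_name: str) -> dict:
    """Fetch repository metadata for "owner/name" from the public GitHub REST API.

    Unauthenticated requests are rate-limited; this is a sketch, not a client.
    """
    req = urllib.request.Request(
        f"https://api.github.com/repos/{full_name}",
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def fork_root(repo: dict) -> Optional[str]:
    """Return the full name of the root of the fork network.

    Returns None when the repository is not a registered fork, which does NOT
    prove it is the original: re-uploaded copies also look like this.
    """
    if not repo.get("fork"):
        return None
    # Prefer "source" (network root); fall back to the immediate "parent".
    root = repo.get("source") or repo.get("parent") or {}
    return root.get("full_name")
```

For example, `fork_root(fetch_repo_metadata("someuser/someproject"))` (a placeholder name) would return the upstream project's `owner/name` if the repository is a registered fork, which is a prompt to go and look at the upstream instead.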

Have you bought the T-shirt yet? dspillett has been there; done that:

Our regular reminder to be careful what you pull from public repositories and other sources. And to verify your dependency trees.

If malware is massively prolific in public repos, how much does this affect LLMs and other automation tools that are trained using the contents of such resources? What are the chances that we’ll see copilot & friends occasionally emit malware in response to coding questions that generate responses long enough for accidentally malicious parts to hide among?

AI as malware vector? Yikes. Dmytry agrees:

This kind of thing also has the potential of getting amplified by copilot, if it ends up training on commits that insert vulnerabilities and backdoors. Given that it costs a lot of money to train a neural network it can be costly to re-train … if it is polluted.

Alternatively, AI might be the solution to the problem. So thinks CrazyCartwheels:

AI to the rescue? This next copilot suggestion might get real.

n00bs! n00bs everywhere! silvestrov remembers the Eternal September:

Github is failing the same way usenet failed: Everybody could post stuff to usenet just like everybody can create a github repository and there is nothing that sets an official repository apart from a spammer’s.

When Amazon has “the everything store” as main strategic goal, they get hit by “90% of everything is junk.” So, they end up being a store of mostly junk. Github should figure out if their product is “a repository for everybody,” or if it is, “I can trust this code.”

Is this surprising? Christarp isn’t surprised:

I’m surprised this attack vector wasn’t being massively exploited years and years ago, tbh. The potential … to get into massive corporate systems has been a low hanging fruit here for a while now.

Meanwhile, a spittle-flecked rstanley wears out their “!” key:

We need a new one, not owned by Mickey$oft! GitHub should NEVER have been sold to MS in the first place! I would never post any code on GitHub!

And Finally:

Storytelling masterclass

Hat tip: simbosan



You have been reading SB Blogwatch by Richi Jennings. Richi curates the best bloggy bits, finest forums, and weirdest websites … so you don’t have to. Hate mail may be directed to @RiCHi, @richij or [email protected]. Ask your doctor before reading. Your mileage may vary. Past performance is no guarantee of future results. Do not stare into laser with remaining eye. E&OE. 30.

Image sauce: Mae Mu (via Unsplash; leveled and cropped)


Richi Jennings

Richi Jennings is a foolish independent industry analyst, editor, and content strategist. A former developer and marketer, he’s also written or edited for Computerworld, Microsoft, Cisco, Micro Focus, HashiCorp, Ferris Research, Osterman Research, Orthogonal Thinking, Native Trust, Elgan Media, Petri, Cyren, Agari, Webroot, HP, HPE, NetApp on Forbes and CIO.com. Bizarrely, his ridiculous work has even won awards from the American Society of Business Publication Editors, ABM/Jesse H. Neal, and B2B Magazine.



Article source: https://securityboulevard.com/2024/02/github-repo-confusion-supply-chain-richixbw/