lconstantin
CSO Senior Writer

Npm ecosystem vulnerable to new manifest confusion attack

News Analysis
29 Jun 2023, 6 mins
DevSecOps, Open Source, Vulnerabilities

Package manifests in the npm registry are not validated against metadata files in the package itself, leaving the door open for attackers.

The npm (Node Package Manager) ecosystem of JavaScript packages has a by-design bug that attackers could potentially exploit to hide malicious dependencies and scripts inside packages. The issue, dubbed manifest confusion, stems from the lack of consistency between manifest files that accompany archived packages and the JSON metadata file included in the package itself.

The issue was publicly disclosed this week by Darcy Clarke, a former staff engineering manager for the npm CLI team. Clarke left GitHub, which owns npm, in December, but he said GitHub has been aware of this issue since November, and he notified them again in March when, after independent research, he came to the conclusion that the impact is greater than originally thought.

According to Clarke, the general assumption in the community is that manifests published alongside a package on the npm registry match the contents of the package.json metadata file that’s included inside the package itself, meaning the tarball archive downloaded from the registry. This is not true, and neither client-side JavaScript package managers such as npm nor security tools that scan packages from the npm registry properly validate these files against each other.

This means packages might have hidden dependencies or installation scripts listed in their package.json files but not in the separate manifest file. These dependencies and scripts will be parsed and executed by client-side package managers such as the npm command line interface (CLI) and others even though they’re not listed inside the package manifest.

“There are several ways this bug actually impacts consumers/end-users: Cache poisoning (i.e., the package that is saved may not match the name+version spec of that package in the registry/URI), installation of unknown/unlisted dependencies (tricking security/audit tools); execution of unknown/unlisted scripts (tricking security/audit tools); potential downgrade attack (where the version specification saved into projects is for an unspecified, vulnerable version of the package),” Clarke said.

Source-of-truth confusion

At its core, this issue is caused by the fact that there is no single, clear “canonical source of truth” for a package’s metadata: things like name, version, dependencies, scripts, license and more. These are specified in the package.json file that is included in the package archive itself, which supports integrity verification through values such as cryptographic hashes. However, some of the same data can be specified in the package manifest when publishing to the npm registry, and this manifest dictates the information the registry will display.

To demonstrate, Clarke created a test package whose package.json file listed another package as a dependency, but when he published it he didn’t include the dependency in the manifest. As a result, the package’s entry on the npmjs.com registry shows 0 dependencies, because the registry uses the manifest as the canonical source of truth. However, the registry itself doesn’t actually validate that the package.json information matches the manifest information. That task is left to the client installing the package. As it turns out, the clients don’t really perform this validation either.
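The mismatch Clarke describes can be illustrated with a minimal sketch that compares the two metadata sources field by field. The package names, the data and the choice of fields below are hypothetical and chosen for illustration only; this is not how the registry or any client actually performs the check.

```python
# Fields where a mismatch matters most; this selection is an assumption.
CHECKED_FIELDS = ("name", "version", "dependencies", "scripts")


def find_mismatches(manifest: dict, package_json: dict) -> dict:
    """Return {field: (manifest_value, package_json_value)} for fields that differ."""
    mismatches = {}
    for field in CHECKED_FIELDS:
        in_manifest = manifest.get(field)
        in_package = package_json.get(field)
        # Treat a missing field and an empty value as equivalent.
        if (in_manifest or None) != (in_package or None):
            mismatches[field] = (in_manifest, in_package)
    return mismatches


# Hypothetical data modelled on the example in the text: the registry
# manifest lists no dependencies, the package.json in the tarball lists one.
manifest = {"name": "example-pkg", "version": "1.0.0", "dependencies": {}}
package_json = {
    "name": "example-pkg",
    "version": "1.0.0",
    "dependencies": {"hidden-dep": "^1.0.0"},
}

print(find_mismatches(manifest, package_json))
# {'dependencies': ({}, {'hidden-dep': '^1.0.0'})}
```

A tool that only reads the manifest would report zero dependencies here, while a client that reads package.json would see the hidden one.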

For example, npm version 6 (npm@6), which shipped with the Node.js runtime version 14 (long-term support), will execute an install script defined in the package.json even if the script is not defined in the manifest. A listed dependency in package.json that is missing from the manifest will not be deployed the first time the package is downloaded and installed. However, if that package is cached locally and later installed again from the local source with the --prefer-offline and the --no-package-lock command line options, the hidden dependencies from package.json will be installed.

Npm version 9 (npm@9), the current stable version of npm, will similarly install dependencies referenced inside a cached package’s package.json when using the --offline config.

The yarn and pnpm package managers that are alternatives to npm are also vulnerable and will execute scripts referenced in the package.json file that are absent from the manifest. Yarn will also prefer the package version defined in package.json over the one in the manifest. Because these two values can be different, it opens the door to a downgrade attack.

Downgrade attacks are dangerous because a package can be replaced with an older version that has a known vulnerability. There’s no shortage of package versions with vulnerabilities, even in actively maintained projects. Last week, researchers from Snyk and Redhunt Labs released the findings of a research project that involved scanning more than 11,000 repositories belonging to the top 1,000 organizations on GitHub. The scan looked for vulnerabilities in the dependencies listed in those projects, which spanned multiple programming languages. For JavaScript (npm and yarn), the team extracted 1.9 million dependencies and identified around 550,000 instances of known vulnerabilities in them.

Clarke thinks this issue falls under different vulnerability categories, but at the very least CWE-602 Client-Side Enforcement of Server-Side Security. He notes that “there is a history of relying heavily on the client (aka the npm CLI) to do work that should be done server-side.”

Aside from the aforementioned client-side package managers, the issue also impacts other third-party tools and package registries, including security-focused ones: Snyk, the Chinese NPM Mirror, the Cloudflare npm CDN mirror, the UNPKG CDN mirror, Skypack, JSPM, and even local repositories created with JFrog’s Artifactory.

No easy fix for manifest confusion vulnerability

Fixing this issue by suddenly enforcing validation is not straightforward, and it might take a while for GitHub to come up with a solution, because there are likely many packages that exhibit manifest confusion for non-malicious reasons. Clarke noted that the npm CLI itself causes such inconsistencies, too. For example, when publishing a package through the npm CLI where a binding.gyp file is located inside the project, the client will add a scripts.install entry with the value “node-gyp rebuild” to the manifest. This entry will not be present in the package.json file.
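The binding.gyp case shows why a strict validator needs exceptions. A rough sketch of how a checker might recognize that one benign mismatch follows. The function name is hypothetical, and it assumes the tarball uses the conventional "package/" top-level directory; real tooling would likely need to handle more cases.

```python
def is_benign_gyp_mismatch(manifest_scripts, package_scripts, tarball_files):
    """True when the only scripts difference is the install entry
    that the npm CLI adds for projects containing binding.gyp."""
    manifest_scripts = dict(manifest_scripts or {})
    package_scripts = dict(package_scripts or {})
    # Assumption: files in the archive sit under a "package/" directory.
    has_gyp = "package/binding.gyp" in tarball_files
    added_by_cli = (
        has_gyp
        and "install" not in package_scripts
        and manifest_scripts.get("install") == "node-gyp rebuild"
    )
    if not added_by_cli:
        return False
    # Ignore the CLI-added entry; everything else must still match.
    manifest_scripts.pop("install")
    return manifest_scripts == package_scripts


files = ["package/package.json", "package/binding.gyp"]
print(is_benign_gyp_mismatch({"install": "node-gyp rebuild"}, {}, files))
print(is_benign_gyp_mismatch({"install": "node unexpected.js"}, {}, files))
```

The first call describes the mismatch the npm CLI itself produces; the second is a manifest install script with no counterpart in package.json and no such explanation, so it is not waved through.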

“GitHub is understandably in a tough spot,” Clarke said. “The fact that npmjs.com has functioned this way for over a decade means that the current state is pretty much codified and likely to break someone in a unique way. As mentioned before, the npm CLI itself relies on this behavior and there are potentially other non-nefarious uses of this in the wild today.”

Users should contact any known authors of tools that rely on npm and ask them to rely on package.json information rather than the manifest, except for the version and name, which could differ for legitimate reasons. Another option would be to use a proxy between the client and the registry that strictly validates the metadata from both sources for consistency.
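As a sketch of what such a consistency check might look like, the following reads package.json directly out of a package tarball and compares its dependencies and scripts with a manifest. It assumes the tarball is gzipped and stores its files under a "package/" directory; the helper names and demo data are hypothetical, and name and version are skipped because they can differ legitimately.

```python
import io
import json
import tarfile


def read_package_json(tarball_bytes: bytes) -> dict:
    """Extract and parse package/package.json from a gzipped tarball."""
    with tarfile.open(fileobj=io.BytesIO(tarball_bytes), mode="r:gz") as archive:
        member = archive.extractfile("package/package.json")
        if member is None:
            raise ValueError("package/package.json is not a regular file")
        return json.load(member)


def inconsistent_fields(manifest: dict, tarball_bytes: bytes) -> list:
    """List the checked fields on which manifest and package.json disagree."""
    package_json = read_package_json(tarball_bytes)
    fields = ("dependencies", "scripts")
    return [
        field
        for field in fields
        if (manifest.get(field) or {}) != (package_json.get(field) or {})
    ]


def make_demo_tarball(package_json: dict) -> bytes:
    """Build an in-memory tarball so the example runs without network access."""
    data = json.dumps(package_json).encode("utf-8")
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo(name="package/package.json")
        info.size = len(data)
        archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


# Hypothetical package whose tarball hides an install script.
tarball = make_demo_tarball(
    {"name": "example-pkg", "version": "1.0.0",
     "scripts": {"install": "node hidden.js"}}
)
manifest = {"name": "example-pkg", "version": "1.0.0"}
print(inconsistent_fields(manifest, tarball))  # ['scripts']
```

A proxy built along these lines could refuse to serve a package whenever the returned list is non-empty, subject to exceptions for known benign mismatches.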