npm – Catching Up with Package Lockfile Changes in v7

Published on February 8, 2021 · 5 min read

An introduction to the changes made in the seventh version of npm to improve performance while keeping builds deterministic and reproducible, focusing on the new package-lock.json format (v2) and the support for Yarn’s lockfile.

The seventh version of npm has been released, bringing several highly requested features, including workspaces, automatic installation of peer dependencies, and package lockfile improvements.

In this article, we’re going to focus on the changes made to the package lockfiles.

The content is available as a video as well:

Here we begin.

Motivation

Reproducible builds are an approach that ensures the same source code, build environment, and build instructions produce exact copies of all specified artifacts, as verified by a bit-by-bit comparison. This means that given source code must produce the same result deterministically, the build tools should be predefined, and the build process should validate that the output matches the original.

In terms of npm (and Yarn), reproducible builds guarantee that all teammates get the precise versions of all dependencies even when working on different machines, and so does the production environment. This is possible because these CLI tools manage "lock" files (which are designed to be committed, obviously) that instruct them how to produce the precise node_modules tree.

The truth is that reproducible builds (and package-lock.json specifically) aren’t new: they have been implemented since npm v5. So the question remains, what actually changed? 🤔

Well, this is what we’re going to explain.

New Lockfile Format

npm v7 arrives with a newer version of the package-lock.json format, which reduces the need to read package.json files and holds enough information to reliably describe the full and precise package tree all by itself. More than that, the package tree resulting from the new lockfile is flattened, which is crucial for boosting performance.

Practically, this means that starting with v7 the file is generated with a new set of semantics:

Comparing lockfile v2 with v1

On the left is a lockfile generated with npm v7 after installing React, whereas on the right is the one generated with v6.

First of all, the lockfileVersion field is an integer indicating which schema version was used to generate the file. In the case of npm v7, the schema version is 2, which corresponds to the new lockfile format. It’s important to note that v2 lockfiles are backward compatible with CLI versions supporting v1 lockfiles (for example, npm v5 and v6).

Secondly, a field called packages was added, which maps each installed package, by its location, to an object containing all the needed information about that specific package. Of course, fields such as resolved, integrity, and link are still needed and included. The main change, though, is that in v2 the information is keyed by the package’s relative location and not just by the package name (as done in v1). Notice that the root project is listed under the key "", and all dependencies are then listed by their paths relative to that root directory.
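To make the location-keyed layout more concrete, here is a minimal Python sketch of how a tool might walk the packages map. The lockfile content is hypothetical: my-app, example-lib, @scope/helper, the URL, and the integrity placeholder are all invented for illustration, and package_name_from_location is a helper name of my own, not something npm provides.

```python
import json

# Hypothetical v2 lockfile fragment; names, URL and integrity are placeholders.
LOCKFILE_TEXT = """
{
  "name": "my-app",
  "version": "1.0.0",
  "lockfileVersion": 2,
  "packages": {
    "": {"name": "my-app", "version": "1.0.0"},
    "node_modules/example-lib": {
      "version": "2.3.0",
      "resolved": "https://registry.example.invalid/example-lib-2.3.0.tgz",
      "integrity": "sha512-PLACEHOLDER"
    },
    "node_modules/example-lib/node_modules/@scope/helper": {
      "version": "0.1.0"
    }
  }
}
"""


def package_name_from_location(location: str) -> str:
    """Derive a package name from its location key ("" is the root project)."""
    if location == "":
        return "(root)"
    # The name is whatever follows the last "node_modules/" segment.
    return location.rsplit("node_modules/", 1)[-1]


lockfile = json.loads(LOCKFILE_TEXT)
names_by_location = {
    location: package_name_from_location(location)
    for location in lockfile["packages"]
}

for location, name in names_by_location.items():
    print(f"{location!r} -> {name}")
```

Because the key is a path, the same package name can appear at several locations (for example nested under another package), which a name-keyed map cannot express as directly.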

Thirdly, as mentioned, the new lockfile is backward compatible, so the legacy dependencies field is still included. This field, which maps package information to the package name, is kept for older CLI versions that don’t recognize the new packages field.

And now we can say how reproducible builds are actually achieved: the lockfile is created in advance and committed to source control. For each package, the file deterministically records the resolved URL of its tarball, the integrity value used to verify it, and the relative location where it is unpacked. Put simply, the v2 lockfile holds enough information to allow deterministic and reproducible builds on its own, without gathering additional information from package.json. 💪🏻
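As an illustration of what the integrity check amounts to, the sketch below computes and compares a Subresource Integrity style string (an algorithm name, a dash, and the base64 digest), which is the form the integrity values in a lockfile take. The function names and the stand-in bytes are assumptions made for this example; this is not npm’s own implementation.

```python
import base64
import hashlib


def compute_integrity(data: bytes, algorithm: str = "sha512") -> str:
    """Return an SRI-style string: "<algorithm>-<base64 digest>"."""
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches_integrity(data: bytes, expected: str) -> bool:
    """Check downloaded bytes against an integrity string from a lockfile."""
    # The algorithm is the part before the first dash, e.g. "sha512".
    algorithm = expected.split("-", 1)[0]
    return compute_integrity(data, algorithm) == expected


# Stand-in for tarball bytes; a real check would hash the downloaded .tgz file.
tarball = b"pretend these are the bytes of a package tarball"
recorded = compute_integrity(tarball)

print(matches_integrity(tarball, recorded))          # True
print(matches_integrity(tarball + b"!", recorded))   # False
```

A single changed byte yields a different digest, which is why the resolved URL and the integrity value together pin down exactly which artifact ends up in the tree.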

Yarn’s Lockfile Support

Until now, yarn.lock files were completely ignored by npm’s CLI, but the good news is that as of v7, if these files are available, they are used as a source of package metadata and resolution guidance.

In practice, the resolved values inside Yarn’s lockfile tell the CLI where to fetch packages from, whereas integrity is still used to verify that the artifact matches. On top of that, the yarn.lock file is also kept up to date when installing or removing packages using npm’s CLI.

Note that when the package-lock.json file exists, it is used as the authoritative definition of the resulting package tree. The yarn.lock file is supported mainly to provide better interoperability between npm and Yarn, filling in missing information if necessary.
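The precedence described here can be pictured with a small sketch. This is only a conceptual model, not npm’s actual merging logic: fill_missing_fields is a hypothetical helper, the entries are invented, and real lockfile entries carry more fields than shown.

```python
from typing import Dict, Optional

Entry = Dict[str, Optional[str]]


def fill_missing_fields(npm_entry: Entry, yarn_entry: Entry) -> Entry:
    """Keep every value from the package-lock.json entry and borrow
    only the fields it lacks from the yarn.lock entry."""
    merged = dict(yarn_entry)
    # Values present in package-lock.json always win.
    merged.update({key: value for key, value in npm_entry.items() if value is not None})
    return merged


# Hypothetical entries for the same package in both lockfiles.
from_package_lock: Entry = {
    "version": "2.3.0",
    "integrity": "sha512-FROM-PACKAGE-LOCK",
    "resolved": None,  # missing in this made-up example
}
from_yarn_lock: Entry = {
    "version": "2.3.0",
    "integrity": "sha512-FROM-YARN-LOCK",
    "resolved": "https://registry.example.invalid/example-lib-2.3.0.tgz",
}

merged = fill_missing_fields(from_package_lock, from_yarn_lock)
print(merged["integrity"])  # sha512-FROM-PACKAGE-LOCK
print(merged["resolved"])   # https://registry.example.invalid/example-lib-2.3.0.tgz
```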

Another question arises: why doesn’t npm just rely on yarn.lock without managing a lockfile of its own? Actually, this is explained in detail on npm’s official blog, but let’s list the reasons in a nutshell:

  • Yarn guarantees resolutions only for a given combination of yarn.lock file and specific CLI version, which means different Yarn versions can produce different node_modules trees. In contrast, npm differentiates between deterministic resolution of dependencies and a deterministic package tree shape.
  • In some cases, Yarn produces a tree with excessive duplication from its lockfile, which doesn’t allow npm to optimize the resulting tree.
  • Locking down the resulting package tree shape inside the lockfile allows npm to support features such as --prefer-dedupe without breaking the ability to produce deterministic reproducible builds.
  • Yarn (like npm v5/v6) relies on package.json to build the package tree, whereas npm v7 needs only a package-lock.json generated with schema v2.
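The difference between deterministic resolutions and a deterministic tree shape can be shown with a toy example. Both trees below are hypothetical (app-a, app-b, and shared are placeholder names) and map a location to a version, mirroring how the packages field is keyed.

```python
from typing import Dict, Set, Tuple

# Same packages and versions, but "shared" is duplicated under each parent.
nested_tree: Dict[str, str] = {
    "node_modules/app-a": "1.0.0",
    "node_modules/app-a/node_modules/shared": "2.0.0",
    "node_modules/app-b": "1.0.0",
    "node_modules/app-b/node_modules/shared": "2.0.0",
}
# Same packages and versions, with "shared" hoisted to the top level once.
deduped_tree: Dict[str, str] = {
    "node_modules/app-a": "1.0.0",
    "node_modules/app-b": "1.0.0",
    "node_modules/shared": "2.0.0",
}


def resolutions(tree: Dict[str, str]) -> Set[Tuple[str, str]]:
    """Which (name, version) pairs were picked, ignoring where they live."""
    return {
        (location.rsplit("node_modules/", 1)[-1], version)
        for location, version in tree.items()
    }


def shape(tree: Dict[str, str]) -> Set[str]:
    """Where each package is placed on disk."""
    return set(tree)


same_resolutions = resolutions(nested_tree) == resolutions(deduped_tree)
same_shape = shape(nested_tree) == shape(deduped_tree)
print(same_resolutions, same_shape)  # True False
```

A lockfile that records only the resolutions cannot tell these two trees apart, whereas one keyed by location pins the shape as well.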

So, to answer the question, the current implementation of Yarn’s lockfile doesn’t have enough information needed for the complete npm functionality.

Hidden Lockfile

As of v7, npm places a hidden lockfile inside node_modules containing all information about the package tree:

A hidden lockfile

The purpose of this file is to avoid reading the entire tree repeatedly. It is only relevant, however, if it was written at the time of the most recent update to the package tree. In other words, if a different CLI modifies the tree, the references it contains might no longer be accurate, and in that case the hidden lockfile is ignored.
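One way to picture this freshness rule is as a comparison of modification times: as I understand npm’s documentation, the hidden lockfile (node_modules/.package-lock.json) is trusted only if it was modified at least as recently as every package folder it references. The sketch below models just that comparison with plain numbers; the function name is my own, and in practice the timestamps would come from something like os.stat(path).st_mtime.

```python
from typing import Iterable


def hidden_lockfile_is_usable(lockfile_mtime: float,
                              package_dir_mtimes: Iterable[float]) -> bool:
    """True when no package folder was modified after the hidden lockfile."""
    return all(lockfile_mtime >= mtime for mtime in package_dir_mtimes)


# Made-up timestamps (seconds): the lockfile was written at t=200.
untouched_tree = [100.0, 150.0, 200.0]
tree_changed_by_other_tool = [100.0, 250.0]

print(hidden_lockfile_is_usable(200.0, untouched_tree))              # True
print(hidden_lockfile_is_usable(200.0, tree_changed_by_other_tool))  # False
```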

Note that the hidden lockfile is ignored by npm v5/v6, since its lockfileVersion field is 3, which indicates that it is not backward compatible (i.e. a breaking change) with older CLI versions. This schema version is expected to be used for all lockfiles in the future, once npm v6 support ends.
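The three lockfileVersion values mentioned in this article can be summarized in a tiny lookup. This is a sketch for illustration only: the descriptions paraphrase the text above, and describe_lockfile_version is a hypothetical helper rather than an npm API.

```python
from typing import Optional

# Summary of the schema versions as described in this article.
LOCKFILE_VERSIONS = {
    1: "v1: written by npm v5/v6",
    2: "v2: written by npm v7, still readable by npm v5/v6",
    3: "v3: used by the hidden lockfile, not readable by npm v5/v6",
}


def describe_lockfile_version(lockfile: dict) -> str:
    """Map the lockfileVersion integer to a human-readable note."""
    version: Optional[int] = lockfile.get("lockfileVersion")
    return LOCKFILE_VERSIONS.get(version, f"unrecognized lockfileVersion: {version!r}")


print(describe_lockfile_version({"lockfileVersion": 2}))
print(describe_lockfile_version({"lockfileVersion": 3}))
print(describe_lockfile_version({}))
```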

Performance

We already mentioned that with the new lockfile the CLI produces a flattened package tree. We also said that the lockfile contains all the necessary information, which makes reads from package.json redundant. On top of that, it helps accelerate the fund command, which retrieves funding information. And, of course, there is the hidden lockfile, which can avoid repeated reading of the tree.

Well, together these lead to significant improvements in performance:

Benchmarks

This benchmark chart is taken directly from the CLI’s recent benchmark tooling. We can clearly see that npm v7’s results are lower (which means faster) than v6’s in most of the tests.

Summary

Today we introduced the lockfile changes made in npm v7, which continue to ensure deterministic, reproducible builds along with performance improvements.

Let’s recap:

  • Reproducible builds ensure producing the exact artifacts verified with a bit-by-bit comparison – from the same source code, build environments and build instructions
  • npm’s CLI v5/v6 (and Yarn) strive to guarantee deterministic reproducible builds using a lockfile that resolves the precise package tree
  • npm’s CLI v7 generates lockfiles using a newer version of the lockfile format (v2)
  • The v2 lockfile has enough information to describe the precise package tree all by itself
  • The v2 lockfile maps the packages to their information by their relative location to the root (instead of their name)
  • The v2 lockfile is backward compatible with CLIs in v5/v6
  • npm’s CLI v7 uses yarn.lock files if available, as a source of package metadata and resolution guidance when there is missing information, knowing that the package-lock.json is the authoritative definition
  • yarn.lock cannot completely replace npm’s lockfile since the current implementation doesn’t have enough information needed for the complete npm functionality
  • npm’s CLI v7 arrives with better performance because of:
    • The new lockfile format helping the CLI avoid reading from package.json
    • The new lockfile format helping to produce a flattened package tree
    • The new lockfile format helping to accelerate the fund command
    • The hidden lockfile placed inside node_modules helping to avoid repeated package tree reading

Here’s the example project:


© 2024, Nitay Neeman. All rights reserved.

Licensed under CC BY 4.0. Sharing and adapting this work is permitted only with proper attribution.