Content-addressable store
How PCPM saves 70-90% of disk on multi-project solutions.
The content-addressable store is the part of PCPM that does the heavy lifting. It’s the reason a 30-project monorepo fits in a few hundred megabytes instead of tens of gigabytes.
What “content-addressable” means
A content-addressable store is a directory layout where the path of a file is derived from its contents. The usual recipe is:
- Compute a hash of the file’s bytes (sha256, in PCPM’s case).
- Store the file at <root>/<first-2-chars>/<remaining-chars>/<filename>.
- To look up a file, hash the bytes you have, derive the path, and read it.
The property you get for free is deduplication: if two files have the same bytes, they have the same hash, and they have the same path. The second one doesn’t need a separate copy on disk.
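The recipe above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not PCPM's implementation: the function names `store_path` and `put` are hypothetical, and the sketch simply follows the <root>/<first-2-chars>/<remaining-chars>/<filename> layout from the list.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Tuple


def store_path(root: Path, data: bytes, filename: str) -> Path:
    """Derive the path from the bytes: <root>/<first-2>/<rest>/<filename>."""
    digest = hashlib.sha256(data).hexdigest()
    return root / digest[:2] / digest[2:] / filename


def put(root: Path, data: bytes, filename: str) -> Tuple[Path, bool]:
    """Store data; return (path, True if bytes were written this call)."""
    path = store_path(root, data, filename)
    if path.exists():
        # Same bytes, same hash, same path: nothing more to write.
        return path, False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path, True


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())  # throwaway store root for the demo
    first, wrote_first = put(root, b"same bytes", "pkg.nupkg")
    second, wrote_second = put(root, b"same bytes", "pkg.nupkg")
    print(first == second, wrote_first, wrote_second)
```

The second `put` call finds the path already occupied and writes nothing, which is the deduplication property in miniature.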
PCPM uses this for .nupkg files (immutable archives) and the
extracted package contents (lib/, runtimes/, etc.). The store path
on Windows is %LOCALAPPDATA%\pcpm\store; on Linux and macOS it’s
~/.local/share/pcpm/store.
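For scripts that need to find the store, the two locations above could be resolved as in the sketch below. The helper name `default_store_root` is hypothetical, and the sketch assumes no override mechanism, since none is described here.

```python
import os
import sys
from pathlib import Path


def default_store_root() -> Path:
    """Return the store location described above for the current platform."""
    if sys.platform == "win32":
        # %LOCALAPPDATA%\pcpm\store
        return Path(os.environ["LOCALAPPDATA"]) / "pcpm" / "store"
    # ~/.local/share/pcpm/store on Linux and macOS
    return Path.home() / ".local" / "share" / "pcpm" / "store"


if __name__ == "__main__":
    print(default_store_root())
```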
The on-disk layout
%LOCALAPPDATA%\pcpm\store\
  v1\
    8f\                          # first 2 chars of the sha256
      8f3a4b5c6d7e…\             # remaining chars of the sha256
        pkg.nupkg                # the immutable archive
        extracted\
          newtonsoft.json\
            13.0.3\
              lib\
                net6.0\…         # the package's lib/net6.0 contents
                netstandard2.0\…
Every distinct .nupkg in your dependency graph is stored exactly
once, regardless of how many projects depend on it.
Why NTFS hardlinks?
Once a package is in the store, pcpm install makes it visible to
dotnet restore by hardlinking it into ~/.nuget/packages. A
hardlink is a directory entry that points to the same on-disk inode
as another. Two paths, one file, zero extra disk.
NTFS (Windows), APFS (macOS), and ext4 (Linux) all support hardlinks, and PCPM uses them on all three platforms. When the source and target are on different volumes, or when the filesystem doesn’t support hardlinks, PCPM falls back to a normal file copy.
The hardlink is invisible to the application: dotnet restore sees
a normal ~/.nuget/packages/<id>/<version>/ tree. There’s nothing
custom about the consumer side.
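The link-then-fall-back behavior can be sketched as follows. This is an illustration, not PCPM's code: `link_or_copy` is a hypothetical helper, and replacing an existing target is an assumption made to keep the example short.

```python
import os
import shutil
import tempfile
from pathlib import Path


def link_or_copy(source: Path, target: Path) -> str:
    """Expose source at target; return "hardlink" or "copy"."""
    target.parent.mkdir(parents=True, exist_ok=True)
    target.unlink(missing_ok=True)  # assumption: replace any stale entry
    try:
        os.link(source, target)  # second directory entry, same file
        return "hardlink"
    except OSError:
        # Typical causes: source and target on different volumes,
        # or a filesystem without hardlink support.
        shutil.copy2(source, target)
        return "copy"


if __name__ == "__main__":
    tmp = Path(tempfile.mkdtemp())
    stored = tmp / "store" / "pkg.nupkg"
    stored.parent.mkdir(parents=True)
    stored.write_bytes(b"archive bytes")
    visible = tmp / "packages" / "pkg.nupkg"
    mode = link_or_copy(stored, visible)
    print(mode, os.path.samefile(stored, visible))
```

Either way the consumer reads an ordinary file at the target path; only the disk cost differs.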
How much disk do you save?
Rough numbers from a 30-project monorepo with 200 unique packages:
| Layout | Disk used |
|---|---|
| Default ~/.nuget/packages (no PCPM) | 8.2 GB |
| PCPM, first run | 4.1 GB |
| PCPM, after a clean restore | 4.1 GB |
| PCPM, after adding a project that uses the same packages | 4.1 GB |
The PCPM number doesn’t grow with the number of projects. Adding a 31st project that uses the same 200 packages adds zero bytes to the store.
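When measuring numbers like these yourself, note that a tool that sums file sizes per path counts a hardlinked file once for every path that points at it. The sketch below, which is not part of PCPM, contrasts that naive total with one that counts each underlying file once. The function names are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def apparent_bytes(root: Path) -> int:
    """Sum the size of every path, so hardlinked files count repeatedly."""
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            total += os.stat(os.path.join(dirpath, name)).st_size
    return total


def unique_bytes(root: Path) -> int:
    """Sum sizes, counting each (device, inode) pair only once."""
    seen = set()
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            st = os.stat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                total += st.st_size
    return total


if __name__ == "__main__":
    tmp = Path(tempfile.mkdtemp())
    stored = tmp / "store" / "pkg.nupkg"
    stored.parent.mkdir()
    stored.write_bytes(b"x" * 1000)
    for project in ("project_a", "project_b"):
        (tmp / project).mkdir()
        os.link(stored, tmp / project / "pkg.nupkg")
    print(apparent_bytes(tmp), unique_bytes(tmp))
```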
Why this matters
Disk is the obvious win, but the deeper benefit is I/O speed.
Restoring a project against a warm store is limited by metadata
operations: dotnet restore reading the lockfile, MSBuild
traversing the project graph, and the runtime opening the assembly
metadata. The actual copy-from-network step is gone. On a clean CI
runner with a warm cache, the wall-clock time for pcpm install plus
dotnet restore is dominated by dotnet restore, not pcpm install.
What’s in the lockfile?
pcpm.lock records the resolved version and the content hash of
each package. This is what makes the store a “store” and not just a
cache: given a lockfile, PCPM can verify that every entry’s bytes
are present in the store, or download exactly the missing ones.
The format is a stable JSON document:
{
  "version": 1,
  "packages": [
    {
      "id": "newtonsoft.json",
      "version": "13.0.3",
      "hash": "sha256:8f3a4b5c6d7e…",
      "dependencies": { … }
    }
  ]
}
pcpm ci validates the hashes against the store and fails if
anything is missing. pcpm install re-resolves the graph and updates
the lockfile; the hashes get rewritten only if a transitive bump
changed a package’s content.
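To make the validation step concrete, here is a rough sketch of checking a lockfile against a store. It is not how pcpm ci is implemented. It assumes the lockfile fields shown above, an archive stored at v1/<first-2-chars>/<remaining-chars>/pkg.nupkg under the store root, and hypothetical helper names `archive_path` and `missing_packages`.

```python
import hashlib
import json
from pathlib import Path
from typing import List


def archive_path(store_root: Path, hash_field: str) -> Path:
    """Map a "sha256:<hex>" lockfile hash to its assumed store location."""
    algorithm, _, digest = hash_field.partition(":")
    if algorithm != "sha256" or not digest:
        raise ValueError(f"unsupported hash field: {hash_field!r}")
    return store_root / "v1" / digest[:2] / digest[2:] / "pkg.nupkg"


def missing_packages(lockfile: Path, store_root: Path) -> List[str]:
    """Return "id@version" for entries that are absent or fail the hash check."""
    document = json.loads(lockfile.read_text(encoding="utf-8"))
    problems = []
    for package in document["packages"]:
        label = f'{package["id"]}@{package["version"]}'
        path = archive_path(store_root, package["hash"])
        if not path.is_file():
            problems.append(label)  # bytes not in the store
            continue
        actual = "sha256:" + hashlib.sha256(path.read_bytes()).hexdigest()
        if actual != package["hash"]:
            problems.append(label)  # stored bytes do not match the lockfile
    return problems
```

A CI-style check would fail when the returned list is non-empty, while an install-style command could treat the same list as the set of packages to download.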
See also
- Dependency resolution — how PCPM picks the right version of every package.
- CLI: pcpm store (subcommands for inspecting the store).