
[Security] Arbitrary host-file disclosure when loading an untrusted checkpoint: tensorstore file kvstore follows symlinked chunk files with no containment #5487

Description

@geo-chen

Reporting here as agreed with Google Bug Hunters, whose response was:


We've reviewed it, and while we appreciate you flagging this, we need to let you know that we're no longer offering rewards for product vulnerabilities like this one in projects that fall into the OT2 or OT3 tiers. The Flax repository, https://github.com/google/flax, is currently categorized in this way for reward eligibility. You're still welcome to open an issue or submit a pull request directly on the GitHub repo if you'd like to help get this fixed!

Summary

A Flax/Orbax checkpoint is a directory the publisher fully controls. Array leaves are stored as tensorstore zarr3 chunk files and read back through the tensorstore file kvstore, which follows filesystem symlinks with no containment to the checkpoint directory. If a chunk file inside a published checkpoint is a symlink to an arbitrary host path, calling restore_checkpoint reads that file's bytes directly into the restored array.

Loading a malicious checkpoint therefore silently discloses arbitrary host files (SSH private keys, cloud-credential files, /etc/passwd, tokens) into the model's restored arrays. No code execution is required, and there is no trust_remote_code-style opt-in standing in the way. This is the checkpoint-directory analog of the tar/zip symlink-traversal class (cf. CVE-2007-4559 and Python's tarfile filter='data').
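The underlying behaviour can be illustrated without Flax, Orbax, or tensorstore: an ordinary file read through a symlink returns the bytes of the link's target. The following is a minimal sketch, assuming a filesystem that supports symlinks. The layout and names (`ckpt`, `c/0`, `secret.txt`, `read_chunk`) are hypothetical stand-ins, and the read uses plain Python file I/O rather than the tensorstore file kvstore.

```python
import os
import tempfile
from pathlib import Path


def read_chunk(ckpt_dir: Path, rel: str) -> bytes:
    """Naive chunk read: opens the path as given, following any symlink."""
    return (ckpt_dir / rel).read_bytes()


def demo() -> bytes:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # A file outside the checkpoint directory, standing in for a host secret.
        secret = root / "secret.txt"
        secret.write_bytes(b"host-secret")
        # Hypothetical checkpoint layout whose only "chunk" is a symlink.
        chunk_dir = root / "ckpt" / "c"
        chunk_dir.mkdir(parents=True)
        os.symlink(secret, chunk_dir / "0")
        # The read is addressed inside ckpt/, but the bytes come from outside it.
        return read_chunk(root / "ckpt", "c/0")


if __name__ == "__main__":
    print(demo())
```

Nothing in `read_chunk` checks where the resolved path ends up, which is the missing containment step described above.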

Details

flax.training.checkpoints.restore_checkpoint dispatches modern checkpoints to Orbax:

restored = orbax_checkpointer.restore(ckpt_path, item=target, **restore_kwargs)
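Until containment is enforced in the read path, a caller could scan an untrusted checkpoint directory before restoring it. The sketch below is one possible pre-restore check, not an existing Flax or Orbax API: `find_escaping_symlinks` and `guarded_restore` are hypothetical helpers, and `restore_fn` is a stand-in for whatever restore call the application uses.

```python
import os
from pathlib import Path
from typing import Any, Callable, List


def find_escaping_symlinks(ckpt_dir: str) -> List[Path]:
    """Return symlinks under ckpt_dir whose resolved target lies outside it."""
    root = Path(ckpt_dir).resolve()
    escaping: List[Path] = []
    # followlinks=False: symlinked directories are listed but never descended.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():
                continue
            try:
                target = path.resolve()
            except (OSError, RuntimeError):
                # Unresolvable links (e.g. loops) are treated as unsafe.
                escaping.append(path)
                continue
            if target != root and root not in target.parents:
                escaping.append(path)
    return escaping


def guarded_restore(restore_fn: Callable[..., Any], ckpt_dir: str,
                    *args: Any, **kwargs: Any) -> Any:
    """Call restore_fn(ckpt_dir, ...) only if no symlink escapes ckpt_dir."""
    bad = find_escaping_symlinks(ckpt_dir)
    if bad:
        raise ValueError(f"checkpoint contains escaping symlinks: {bad}")
    return restore_fn(ckpt_dir, *args, **kwargs)
```

This check is best-effort: it does not cover hard links, and a directory that can still be modified between the scan and the restore remains exposed to a race. A stricter policy would reject any symlink in an untrusted checkpoint outright.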
