Fix CI pipeline + open-source cleanup (symlinks, py3.10, README, badges) #1

Merged: mihailoxyz merged 4 commits into main from fix/ci-remove-private-symlinks on Jun 24, 2026

Conversation

@mihailoxyz

@mihailoxyz mihailoxyz commented Jun 24, 2026

Contributor

Bundles the full release-prep cleanup into one PR.

1. Fix failing CI — broken private symlinks

tests + baselines-drift died in ~11s at actions/setup-python (cache: pip) with:

EACCES: permission denied, stat '.../results/dim_c/geneformer/predicted_edges.csv'

Four results/dim_c/{geneformer,scgpt,transcriptformer,uce}/predicted_edges.csv files were committed as absolute symlinks to /root/hf_cache/... (the original root generation env). The pip-cache glob follows them into /root (mode 700, not searchable by runner) → EACCES before any test runs. Removed the dead symlinks. No test depends on them (referencing tests pytest.skip when missing; CI only runs tests/unit/). Derived grn_eval_*.json results are untouched.
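
A check along the following lines could catch links like these before they reach CI. It is a minimal sketch and not part of this PR: `find_suspect_symlinks` is a hypothetical helper name, and the rule it applies (flag any symlink whose target is an absolute path or does not resolve) is an assumption about what counts as suspect.

```python
import os
from pathlib import Path


def find_suspect_symlinks(root):
    """Return (path, target, reason) for symlinks that are absolute or dangling.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    suspects = []
    # os.walk does not follow symlinked directories by default, so the scan
    # itself never steps into a link target such as /root/hf_cache.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():
                continue
            target = os.readlink(path)
            if os.path.isabs(target):
                # Absolute targets only resolve on the machine that created them.
                suspects.append((str(path), target, "absolute"))
            elif not os.path.exists(path):
                # os.path.exists follows the link and returns False if it fails.
                suspects.append((str(path), target, "dangling"))
    return suspects
```

Run against a checkout, a non-empty result could be used to fail a pre-commit hook or an early CI step.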

2. Fix failing CI — tomllib on Python 3.10

vcbench.models.arc_state imported tomllib unconditionally, but it's stdlib only on Python ≥ 3.11, so unit-test collection crashed on the 3.10 matrix leg. Added the standard try: import tomllib / except: import tomli as tomllib fallback and declared tomli; python_version < '3.11'. (requires-python is >=3.9, so the backport is the correct fix, not dropping 3.10.)
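
For reference, the fallback pattern described above usually looks like the sketch below. The actual contents of arc_state.py are not shown in this thread, so the `parse_toml` helper and the sample TOML string are illustrative assumptions. The sketch also catches `ModuleNotFoundError` explicitly rather than using a bare `except`.

```python
try:
    import tomllib  # standard library on Python >= 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # backport with the same API, for Python 3.9/3.10


def parse_toml(text):
    """Parse a TOML string with whichever implementation was imported."""
    return tomllib.loads(text)


# Illustrative input; the keys are made up for this example.
config = parse_toml('name = "example"\nlayers = 4\n')
print(config)
```

Because `tomli` is bound to the name `tomllib`, the rest of the module can call `tomllib.loads` or `tomllib.load` without caring which interpreter version it runs on.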

3. Open-source cleanup

  • README: removed the stale 'Repository access' note describing the repo as a private, invite-only company mirror under peer review.
  • Repo metadata: description no longer says (private mirror); homepage set to https://appliedscientific.ai; topics added.

4. README badges + Citation

  • Badge row: MIT license, bioRxiv preprint, Hugging Face, CI tests, appliedscientific.ai lab (mirrors CardioSafe-benchmark style).
  • ## License + ## Citation sections with preprint DOI 10.64898/2026.06.18.733146 and BibTeX; CITATION.cff updated with the doi/url.

Result

All CI jobs green: baselines-drift, pytest 3.10/3.11/3.12, docker.

⚠️ Needs author confirmation before merge

  • BibTeX/CITATION author field is the placeholder VCBench contributors — replace with the formal author list.
  • Hugging Face badge points to the appliedscientific org — repoint to a specific repo if preferred.
  • Raw predicted-edge CSVs for the 4 foundation models were never committed publicly (only the private symlink). Derived results are present; add real files / host on HF if the raw edges should ship.

The four results/dim_c/{geneformer,scgpt,transcriptformer,uce}/predicted_edges.csv
files were committed as absolute symlinks pointing to
/root/hf_cache/embeddings/dim_c/<model>/predicted_edges.csv, a path that only
exists in the original (root) generation environment.

On GitHub Actions, actions/setup-python (cache: pip) globs the repo tree to hash
dependency files; that traversal follows these symlinks into /root (mode 700,
not searchable by the runner user) and fails with EACCES, killing the 'tests'
and 'baselines-drift' jobs before any test runs. The 'docker' job runs as root
so it was unaffected.
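
The difference between inspecting a link and following it can be reproduced locally with the sketch below. It is an illustration under stated assumptions, not the setup-python code path: the target path is made up, and because that target simply does not exist, the failure here is `FileNotFoundError` (ENOENT) rather than the `EACCES` seen on the runner, where /root exists but is not searchable.

```python
import os
import tempfile


def stat_failure(path):
    """Return the exception class name raised when following `path`, or None."""
    os.lstat(path)  # inspects the link itself; works even if the target is unreachable
    try:
        os.stat(path)  # follows the link, so it fails when the target cannot be reached
    except OSError as exc:
        return type(exc).__name__
    return None


with tempfile.TemporaryDirectory() as tmp:
    link = os.path.join(tmp, "predicted_edges.csv")
    # Hypothetical absolute target that does not exist on this machine.
    os.symlink("/nonexistent-dir-for-demo/predicted_edges.csv", link)
    result = stat_failure(link)
    print(result)
```

Any traversal that calls the following variant on every entry hits the same kind of error as soon as it reaches such a link.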

No test depends on these files: the referencing tests in tests/test_correctness.py
and tests/test_phase_outputs.py pytest.skip when the CSV is missing, and CI only
runs tests/unit/. The derived grn_eval_*.json results remain in place.

The raw predicted-edge CSVs for these foundation models were never committed to
the public repo (only the private symlink was).
@coderabbitai

coderabbitai Bot commented Jun 24, 2026


Warning

Review limit reached

@mihailoxyz, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 34 minutes and 8 seconds. Learn how PR review limits work.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6bfb5886-73f0-4ba5-8939-066b1a672f32

📥 Commits

Reviewing files that changed from the base of the PR and between 4ea0c3c and 2e60022.

📒 Files selected for processing (2)
  • CITATION.cff
  • README.md
📝 Walkthrough

Adds tomli>=2.0 as a conditional runtime dependency for Python versions before 3.11, and updates arc_state.py to import tomllib from stdlib or fall back to tomli. The README first-time-visitor blurb is replaced with a v1.0.0 notice linking to the public Arc State checkpoint.

Changes

tomllib backport and README update

| Layer / File(s) | Summary |
|---|---|
| tomllib backport for Python < 3.11: pyproject.toml, src/vcbench/models/arc_state.py | Adds tomli>=2.0 as a conditional dependency for python_version < '3.11' and replaces the direct tomllib import with a try/except block that falls back to tomli on Python 3.9/3.10. |
| README first-time-visitor notice: README.md | Removes the peer-review repository-privacy notice and substitutes a v1.0.0 precision reconciliation message with a link to the public Arc State HuggingFace checkpoint and CPU reproduction snippets. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Poem

A rabbit hops through Python's land,
Where tomllib needs a helping hand.
For versions old before eleven,
tomli falls back like manna from heaven.
README updated, the world can see —
v1.0.0 precision, hopping free! 🐇

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title matches the main themes of the change set: CI cleanup, symlink removal, Python 3.10 compatibility, and README updates. |


- arc_state.py imported tomllib unconditionally, but tomllib is stdlib only on
  Python >= 3.11, so unit-test collection crashed on the 3.10 CI matrix leg
  (ModuleNotFoundError: No module named 'tomllib'). Use the standard try/except
  fallback to the 'tomli' backport, and declare 'tomli; python_version < "3.11"'
  as a dependency (the version in the marker must be a quoted string).
  requires-python is >=3.9 and 3.9/3.10 are advertised in the classifiers, so
  the backport (not dropping 3.10) is the correct fix.

- README: remove the stale 'Repository access' note that described the repo as a
  private, invite-only company mirror during peer review. The repository is
  public and open source; that paragraph no longer applies.
- README: add shields.io badge row (MIT license, bioRxiv preprint, Hugging Face
  models & data, CI tests status, appliedscientific.ai lab) mirroring the
  CardioSafe-benchmark style for org consistency.
- README: add ## License and ## Citation sections at the bottom with the
  preprint DOI (10.64898/2026.06.18.733146) and a BibTeX entry.
- CITATION.cff: replace the 'DOI available from the repository' note with the
  actual preprint doi/url fields.
@mihailoxyz mihailoxyz changed the title from "fix(ci): remove dead absolute symlinks to private /root/hf_cache" to "Fix CI pipeline + open-source cleanup (symlinks, py3.10, README, badges)" Jun 24, 2026
- Relabel preprint badge bioRxiv -> DOI (per maintainer preference) and add a
  Hugging Face Leaderboard Space badge (huggingface.co/spaces/appliedscientific/vcbench-leaderboard).
- Citation: use the published title 'VCBench: A Multi-Dimensional Benchmark for
  Single-Cell Foundation Models' and the real author list (Weidener, Brkić,
  Jovanović, Ulgac, Meduri; Applied Scientific Intelligence, Inc.). Drop the
  bioRxiv journal field in favour of the DOI.
- CITATION.cff: same authors/title; corresponding-author email on Weidener.
@mihailoxyz mihailoxyz merged commit cea03a4 into main Jun 24, 2026
6 checks passed
@mihailoxyz mihailoxyz deleted the fix/ci-remove-private-symlinks branch June 24, 2026 16:40