Fix CI pipeline + open-source cleanup (symlinks, py3.10, README, badges) #1
The four results/dim_c/{geneformer,scgpt,transcriptformer,uce}/predicted_edges.csv
files were committed as absolute symlinks pointing to
/root/hf_cache/embeddings/dim_c/<model>/predicted_edges.csv, a path that only
exists in the original (root) generation environment.
On GitHub Actions, actions/setup-python (cache: pip) globs the repo tree to hash
dependency files; that traversal follows these symlinks into /root (mode 700,
not searchable by the runner user) and fails with EACCES, killing the 'tests'
and 'baselines-drift' jobs before any test runs. The 'docker' job runs as root
so it was unaffected.
No test depends on these files: the referencing tests in tests/test_correctness.py
and tests/test_phase_outputs.py pytest.skip when the CSV is missing, and CI only
runs tests/unit/. The derived grn_eval_*.json results remain in place.
The raw predicted-edge CSVs for these foundation models were never committed to
the public repo (only the private symlink was).
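
A check along these lines could catch the same problem before it reaches CI. This is a minimal sketch, not something taken from the PR: the helper name `find_absolute_symlinks` is hypothetical, and it assumes that any symlink with an absolute target is unwanted in the repository tree.

```python
import os
from pathlib import Path


def find_absolute_symlinks(root):
    """Return sorted (link_path, target) pairs for symlinks under `root`
    whose stored target is an absolute path.

    Hypothetical helper for illustration; it only reads link targets and
    never follows them, so dangling or unreadable targets are safe.
    """
    hits = []
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        # Do not descend into VCS metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        # Symlinks to directories are listed in dirnames, all others in filenames.
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                target = os.readlink(path)
                if os.path.isabs(target):
                    hits.append((str(path), target))
    return sorted(hits)
```

Run against the repository root, a non-empty result would flag links such as the four predicted_edges.csv entries described above; a pre-commit hook or an early CI step could fail on that condition.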
- arc_state.py imported tomllib unconditionally, but tomllib is stdlib only on Python >= 3.11, so the unit-test collection crashed on the 3.10 CI matrix leg (ModuleNotFoundError: No module named 'tomllib'). Use the standard try/except fallback to the 'tomli' backport, and declare 'tomli; python_version < 3.11' as a dependency. requires-python is >=3.9 and 3.9/3.10 are advertised in the classifiers, so the backport (not dropping 3.10) is the correct fix.
- README: remove the stale 'Repository access' note that described the repo as a private, invite-only company mirror during peer review. The repository is public and open source; that paragraph no longer applies.
- README: add shields.io badge row (MIT license, bioRxiv preprint, Hugging Face models & data, CI tests status, appliedscientific.ai lab) mirroring the CardioSafe-benchmark style for org consistency.
- README: add ## License and ## Citation sections at the bottom with the preprint DOI (10.64898/2026.06.18.733146) and a BibTeX entry.
- CITATION.cff: replace the 'DOI available from the repository' note with the actual preprint doi/url fields.
- Relabel preprint badge bioRxiv -> DOI (per maintainer preference) and add a Hugging Face Leaderboard Space badge (huggingface.co/spaces/appliedscientific/vcbench-leaderboard).
- Citation: use the published title 'VCBench: A Multi-Dimensional Benchmark for Single-Cell Foundation Models' and the real author list (Weidener, Brkić, Jovanović, Ulgac, Meduri; Applied Scientific Intelligence, Inc.). Drop the bioRxiv journal field in favour of the DOI.
- CITATION.cff: same authors/title; corresponding-author email on Weidener.
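
The tomllib fallback mentioned in the commit notes above follows a common pattern, sketched below. This is an illustration rather than the actual contents of arc_state.py: the helper `parse_toml` is hypothetical, and the sketch catches `ModuleNotFoundError` specifically (an assumption; a bare `except` or `ImportError` would also work). On Python 3.9/3.10 it assumes the `tomli` backport is installed, which is what the `python_version`-conditional dependency is for.

```python
try:
    import tomllib  # standard library on Python >= 3.11
except ModuleNotFoundError:
    # On 3.9/3.10, fall back to the API-compatible backport.
    import tomli as tomllib


def parse_toml(text):
    """Parse a TOML string into a dict (hypothetical helper for illustration)."""
    return tomllib.loads(text)
```

Because `tomli` exposes the same `load`/`loads` interface, the rest of the module can keep referring to `tomllib` regardless of which import succeeded.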
Bundles the full release-prep cleanup into one PR.
1. Fix failing CI: broken private symlinks

`tests` + `baselines-drift` died in ~11s at `actions/setup-python` (`cache: pip`) with EACCES.

Four `results/dim_c/{geneformer,scgpt,transcriptformer,uce}/predicted_edges.csv` files were committed as absolute symlinks to `/root/hf_cache/...` (the original root generation env). The pip-cache glob follows them into `/root` (mode 700, not searchable by `runner`) → EACCES before any test runs. Removed the dead symlinks. No test depends on them (referencing tests `pytest.skip` when missing; CI only runs `tests/unit/`). Derived `grn_eval_*.json` results are untouched.

2. Fix failing CI: `tomllib` on Python 3.10

`vcbench.models.arc_state` imported `tomllib` unconditionally, but it's stdlib only on Python ≥ 3.11, so unit-test collection crashed on the 3.10 matrix leg. Added the standard `try: import tomllib / except: import tomli as tomllib` fallback and declared `tomli; python_version < '3.11'`. (requires-python is `>=3.9`, so the backport is the correct fix, not dropping 3.10.)

3. Open-source cleanup

(private mirror); homepage set to https://appliedscientific.ai; topics added.

4. README badges + Citation

`## License` + `## Citation` sections with preprint DOI `10.64898/2026.06.18.733146` and BibTeX; `CITATION.cff` updated with the doi/url.

Result

All CI jobs green: baselines-drift, pytest 3.10/3.11/3.12, docker.

- `VCBench contributors`: replace with the formal author list.
- `appliedscientific` org: repoint to a specific repo if preferred.