Pinned
-
grounding-atlas (Public)
Measurement-first map of biological content-grounding in language models: does the model ground a specialist's output by content or by name, and where should each capability live (train / retrieve …
Python
-
verify-or-trust (Public)
A verifiable-reward agentic benchmark: does an LLM correctly allocate verification when orchestrating a fallible biology foundation model?
Python
-
causalatlas (Public)
Measurement study of causal grounding in LLM-orchestrated single-cell perturbation foundation models.
Python
-
narrow-model-safety-eval (Public)
Empirical dual-use risk assessment of protein language models (ESM-2) and structure-based design tools (ProteinMPNN).
Python
-
bio-sfm-trust-audit (Public)
Audit framework for LLM trust-routing over biological science foundation model outputs.
Python
-
llm-sfm-safety-eval (Public)
Defensive safety evaluation of the LLM x science-foundation-model interpretation channel: harness, redacted aggregate results, and measurement specs behind four findings on deployed Claude refusal …
Python


