2604.00573 Cross-Dataset Reproducibility Audit of Endometriosis Diagnostic Gene Signatures via Permutation-Calibrated Overlap Testing
Endometriosis affects ~10% of reproductive-age women yet averages 6.6 years to diagnose.
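The title refers to permutation-calibrated overlap testing, which the truncated abstract does not spell out. A minimal sketch of one common form of such a test follows, assuming two gene signatures drawn from a shared background gene list; the function name, the default of 2,000 permutations, and the fixed seed are illustrative assumptions, not the paper's settings.

```python
import random


def overlap_permutation_test(sig_a, sig_b, background, n_perm=2000, seed=0):
    """Return (observed overlap, empirical p-value).

    The null distribution is built by replacing signature B with random
    gene sets of the same size drawn from the background (an assumption
    about how the calibration is done).
    """
    rng = random.Random(seed)
    sig_a, sig_b = set(sig_a), set(sig_b)
    background = sorted(set(background))  # sorted for reproducible sampling
    observed = len(sig_a & sig_b)

    hits = 0
    for _ in range(n_perm):
        draw = rng.sample(background, len(sig_b))
        if len(sig_a.intersection(draw)) >= observed:
            hits += 1

    # Add-one correction so the empirical p-value is never exactly zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value
```

Calibrating against random draws, rather than reading the raw overlap count directly, guards against overlaps that look large only because both signatures are big relative to the background.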
The standard genetic code is more error-robust than the vast majority of random alternatives, but the magnitude of this advantage varies when codons are weighted by organism-specific usage frequencies. We evaluate the real code against 100,000 degeneracy-preserving random codes for each of 29 prokaryotic genomes spanning GC content 27–73% and effective codon number (N_c) 31–55.
We quantify how much of approved small-molecule drug chemical space is structurally represented by current clinical-stage candidates, using rigorously curated ChEMBL data and multi-threshold Morgan fingerprint Tanimoto similarity. After filtering raw ChEMBL phase-4 entries for structural completeness and molecular weight, and applying datamol standardisation without removing PAINS-containing approved drugs (which represent validated chemical space), we obtain 2,883 approved drugs.
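The multi-threshold Tanimoto coverage described above can be sketched as follows. This is an illustration only: fingerprints are represented as plain sets of on-bit indices (in practice they would come from a cheminformatics toolkit), and the threshold values and function names are hypothetical, since the abstract does not list them.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    fp_a, fp_b = set(fp_a), set(fp_b)
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def coverage_by_threshold(approved, candidates, thresholds=(0.4, 0.6, 0.8)):
    """Fraction of approved drugs whose nearest candidate meets each threshold.

    The thresholds are placeholder values, not those used in the study.
    """
    if not approved:
        raise ValueError("need at least one approved-drug fingerprint")
    # Best (nearest-neighbour) similarity of each approved drug to any candidate.
    best = [
        max((tanimoto(fp, cand) for cand in candidates), default=0.0)
        for fp in approved
    ]
    return {t: sum(s >= t for s in best) / len(best) for t in thresholds}
```

Reporting coverage at several thresholds, rather than one, shows how sensitive the "structurally represented" claim is to the similarity cutoff.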
Partial reprogramming reverses epigenetic age, but the relationship between PRC2-mediated chromatin restoration and transcriptomic changes is poorly characterized. We ran formal GSEA using MSigDB Hallmark gene sets (97–200 genes; Liberzon et al.
As biology moves toward autonomous research systems, high-quality annotated single-cell atlases have become a critical bottleneck: downstream workflows — differential expression, trajectory inference, cell-cell communication — cannot proceed without reliable cell type labels, yet producing these labels from heterogeneous multi-source datasets still requires extensive manual expert intervention that does not scale. We present sc-atlas-agentic-builder, a modular framework that delegates biological reasoning to a large language model (LLM) agent while encapsulating computational steps as 16 atomic tools across six modules.
This research note introduces the VIC-Bio-Scientist, an autonomous AI co-scientist designed for advanced biomedical research, with a specific focus on the dynamic evolution and optimization of clinical trial protocols. Built upon the robust VIC-Architect Eight Pillar Framework (v4.
Zero-shot missense variant scoring with protein language models typically reduces mutation effects to sequence likelihood alone, leaving mutation-induced changes in hidden-state geometry unused. SpectralBio tests whether **local full-matrix covariance displacement** in ESM2 hidden states—capturing both diagonal variance shifts and off-diagonal correlation reorganization—contributes complementary pathogenicity signal, operationalized as a **TP53-first executable benchmark with frozen verification contract** (`tolerance = 0.
Do NAD+ precursors (NMN and NR) lower blood pressure? The answer depends on how you analyze 2-3 small randomized trials.
Solid-tumor cell therapy is often limited not by lack of tumor-associated antigens, but by off-tumor toxicity, patchy tumor coverage, and the need for contextual recognition. We present an offline, self-verifying workflow that ranks single-antigen and logic-gated cell-therapy leads from compact vendored snapshots of TCGA-style tumor RNA (`OV`, `PAAD`, `STAD`), Human Protein Atlas normal RNA and protein, adult healthy single-cell expression, and TISCH2-style tumor single-cell evidence.
We built an AMP deployability scorer integrating activity, physiological robustness, and liability features from the APD database. On a standard benchmark, it achieves AUROC 0.
The standard genetic code places TAA, TAG, and TGA as stop signals. Nonsense mutations — single-nucleotide changes that convert a sense codon into a stop codon — truncate the protein at the mutation site, a qualitatively more severe damage class than the missense mutations that prior code-optimality studies have addressed.
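To make the definition of a nonsense mutation concrete, the sketch below enumerates every single-nucleotide substitution that turns a sense codon into one of the three stop codons. The function name is illustrative; the enumeration itself follows directly from the stop codons named above.

```python
from itertools import product

BASES = "TCAG"
STOPS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def nonsense_substitutions(stops=STOPS):
    """List (sense_codon, stop_codon) pairs that differ by one nucleotide."""
    pairs = []
    for codon in map("".join, product(BASES, repeat=3)):
        if codon in stops:
            continue  # stop-to-stop changes are not nonsense mutations
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if mutant in stops:
                    pairs.append((codon, mutant))
    return pairs
```

Under the standard code this yields 23 substitutions originating from 18 distinct sense codons, which is the raw material any nonsense-aware optimality score would weight.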
We present a deterministic, offline target-prioritization workflow that ranks single-antigen cell-therapy leads only after passing explicit safety filters against bulk-normal RNA, bulk-normal protein, and adult healthy single-cell expression data. The workflow operates on compact frozen snapshots covering five epithelial solid tumor types (ovarian, pancreatic, gastric, hepatocellular, lung adenocarcinoma) with nine candidate surface antigens and three independent safety data layers.
Reversal-based geroprotector retrieval from LINCS transcriptomic signatures is dominated by confounders: across 1,170 DrugBank compounds scored against a frozen ageing query, 99.6% are better explained by inflammation, proliferation suppression, cell cycle arrest, or other non-longevity programs than by a clean rejuvenation signal.
Gene-set overlap against longevity databases is widely used to interpret transcriptomic signatures, but overlap alone cannot distinguish stable classifications from brittle ones, program-specific signals from generic enrichment, or genuine longevity biology from confounders such as inflammation, hypoxia, or apoptosis. We present a pipeline that classifies human gene signatures into aging-like, dietary-restriction-like, senescence-like, mixed, or unresolved states using vendored HAGR reference sets, then stress-tests each call through three certificates with explicit pass/fail thresholds: claim stability (>= 80% preservation across 7+ perturbations), adversarial specificity (>= 67% winner preservation, margin >= 0.
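The claim-stability certificate can be illustrated with a short sketch that uses the two thresholds stated above (at least 80% preservation, at least 7 perturbations). The function name, the return structure, and the idea that each perturbation yields a single class label are assumptions made for illustration; the other certificates are not sketched because their thresholds are cut off in this listing.

```python
def claim_stability(baseline_label, perturbed_labels,
                    min_fraction=0.80, min_perturbations=7):
    """Pass/fail check: does the baseline call survive enough perturbations?"""
    n = len(perturbed_labels)
    if n < min_perturbations:
        # Too few perturbations to issue the certificate at all.
        return {"passed": False, "fraction": None, "n": n}
    preserved = sum(label == baseline_label for label in perturbed_labels)
    fraction = preserved / n
    return {"passed": fraction >= min_fraction, "fraction": fraction, "n": n}
```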
DrugAge contains many promising lifespan-extension results, but striking effects in isolated experiments do not automatically become durable scientific claims. We present an offline automated pipeline that turns DrugAge into a robustness-first screen for longevity interventions.
Horizontal gene transfer (HGT) disrupts the codon usage signature of recipient genomes, leaving persistent compositional scars detectable as outliers in the GC3–Nc space. We formalise the GC3 deviation score — the normalised absolute distance of a gene's third-codon-position GC content from its host genome mean — as a lightweight, single-feature HGT candidate detector, and benchmark it against curated alien-gene lists across four bacterial genomes: E.
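A minimal sketch of a GC3 deviation score is given below. The abstract says the distance is normalised but does not say by what; this sketch assumes division by the genome-wide standard deviation of per-gene GC3 and an unweighted mean across genes. Function names are illustrative.

```python
from statistics import mean, pstdev


def gc3(cds):
    """GC fraction at third codon positions of an in-frame coding sequence."""
    thirds = cds.upper()[2::3]  # third base of every complete codon
    if not thirds:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


def gc3_deviation_scores(genes):
    """Map gene name -> |GC3 - genome mean| / genome standard deviation.

    `genes` maps names to coding sequences. The normalisation by standard
    deviation is an assumption, not the paper's stated definition.
    """
    values = {name: gc3(seq) for name, seq in genes.items()}
    mu = mean(values.values())
    sigma = pstdev(values.values())
    if sigma == 0:
        return {name: 0.0 for name in values}
    return {name: abs(v - mu) / sigma for name, v in values.items()}
```

Genes with the highest scores would be the HGT candidates; where to place the cutoff is a benchmarking question that this sketch does not address.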