2604.01356 Codon Pair Bias, Not Individual Codon Bias, Predicts Protein Abundance in Human Tissues with R-Squared 0.61
The Codon Adaptation Index (CAI) remains the dominant metric for predicting gene expression from sequence data in bacterial genomics, yet its dependence on an externally supplied reference set of highly expressed genes introduces an underappreciated source of variability. We computed CAI for all protein-coding genes across 500 complete bacterial genomes using four distinct reference sets: ribosomal protein genes, RNA-seq-validated highly expressed genes, the top 5% of genes ranked by codon usage frequency, and the original Sharp and Li reference set.
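To make the dependence on the reference set concrete, the following is a minimal sketch of the CAI calculation in the style of Sharp and Li: each codon receives a relative adaptiveness equal to its count in the reference set divided by the count of the most frequent synonymous codon, and the CAI of a gene is the geometric mean of those weights over its codons. The function names (`relative_adaptiveness`, `cai`), the toy sequences, and the 0.5 pseudocount for unobserved codons are illustrative assumptions, not the pipeline used in this study.

```python
import math
from collections import Counter

# Standard codon-to-amino-acid assignments (bases ordered T, C, A, G).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons(seq):
    """Split a coding sequence into in-frame codons, dropping any trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_seqs, pseudocount=0.5):
    """Compute per-codon weights w = count / max synonymous count from a reference set.

    The pseudocount for unobserved codons is an assumed convention that keeps
    the geometric mean from collapsing to zero.
    """
    counts = Counter()
    for s in reference_seqs:
        counts.update(c for c in codons(s) if c in CODON_TABLE)

    by_aa = {}
    for codon, aa in CODON_TABLE.items():
        by_aa.setdefault(aa, []).append(codon)

    weights = {}
    for aa, synonyms in by_aa.items():
        # Stop codons and single-codon amino acids (Met, Trp) carry no information.
        if aa == "*" or len(synonyms) == 1:
            continue
        vals = {c: counts[c] or pseudocount for c in synonyms}
        top = max(vals.values())
        for c in synonyms:
            weights[c] = vals[c] / top
    return weights


def cai(seq, weights):
    """Geometric mean of the codon weights over the informative codons of one gene."""
    logs = [math.log(weights[c]) for c in codons(seq) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Toy example: the same gene scored against two hypothetical reference sets.
gene = "GCTGCTGCC"
w_a = relative_adaptiveness(["GCTGCTGCTGCC"])  # reference set favouring GCT
w_b = relative_adaptiveness(["GCCGCCGCCGCT"])  # reference set favouring GCC
cai_a = cai(gene, w_a)
cai_b = cai(gene, w_b)
```

In this toy case the gene is unchanged, yet its CAI differs between `w_a` and `w_b` because only the reference set changed, which is the source of variability examined here.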