GeneCAD at a glance
GeneCAD is a sequence-only annotation pipeline that turns genome sequence into complete, biologically coherent plant gene models in GFF3 format. It combines a DNA foundation model (PlantCAD2), a ModernBERT head, and a chromosome-wide CRF, and it requires no RNA-seq, proteomics, or homology inputs.
The project targets a long-standing challenge in computational biology: annotating large, complex plant genomes efficiently and accurately. By relying purely on sequence data and deep learning models, GeneCAD aims to set a new standard for gene prediction in plants.
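To make the role of the final decoding stage more concrete, here is a minimal toy sketch in Python. It is not GeneCAD's code: the three-label set, the allowed-transition table, the per-base scores, and the function names `viterbi` and `to_gff3` are all assumptions made for illustration, and a plain Viterbi decode with hard transition constraints stands in for the chromosome-wide CRF. The point is only to show how per-base scores that are individually plausible but jointly invalid (an intron directly after intergenic sequence) can be resolved into a coherent structure and written out as GFF3 rows.

```python
# Toy illustration only; labels, transitions, and scores are hypothetical.
STATES = ["intergenic", "exon", "intron"]

# Assumed structural rule: an intron can only be entered from, and left to, an exon.
ALLOWED = {
    ("intergenic", "intergenic"), ("intergenic", "exon"),
    ("exon", "exon"), ("exon", "intron"), ("exon", "intergenic"),
    ("intron", "intron"), ("intron", "exon"),
}
NEG_INF = float("-inf")


def viterbi(emissions):
    """Return the highest-scoring label path that obeys ALLOWED.

    emissions: list of dicts mapping each state to a log-score, one per base.
    The path may not start or end inside an intron.
    """
    best = [{s: (emissions[0][s] if s != "intron" else NEG_INF) for s in STATES}]
    back = []
    for em in emissions[1:]:
        prev, cur, ptr = best[-1], {}, {}
        for s in STATES:
            score, arg = max((prev[p], p) for p in STATES if (p, s) in ALLOWED)
            cur[s] = score + em[s]
            ptr[s] = arg
        best.append(cur)
        back.append(ptr)
    last = max((s for s in STATES if s != "intron"), key=lambda s: best[-1][s])
    path = [last]
    for ptr in reversed(back):  # follow back-pointers from the end to the start
        path.append(ptr[path[-1]])
    return path[::-1]


def to_gff3(path, seqid="chr1", source="sketch"):
    """Emit one GFF3 exon row per run of 'exon' labels (1-based, inclusive)."""
    lines = ["##gff-version 3"]
    start = None
    for i, state in enumerate(path + ["intergenic"]):  # sentinel closes a trailing exon
        if state == "exon" and start is None:
            start = i
        elif state != "exon" and start is not None:
            cols = [seqid, source, "exon", str(start + 1), str(i),
                    ".", "+", ".", f"ID=exon{len(lines)}"]
            lines.append("\t".join(cols))
            start = None
    return lines


# Hypothetical per-base log-scores; the per-base argmax at position 1 is "intron",
# which would be invalid directly after "intergenic".
toy_scores = [
    {"intergenic": -0.1, "exon": -3.0, "intron": -3.0},
    {"intergenic": -2.0, "exon": -1.0, "intron": -0.5},
    {"intergenic": -3.0, "exon": -0.1, "intron": -3.0},
    {"intergenic": -3.0, "exon": -0.1, "intron": -3.0},
    {"intergenic": -0.1, "exon": -3.0, "intron": -3.0},
]
path = viterbi(toy_scores)
print("\n".join(to_gff3(path)))
```

In this sketch the constrained decode relabels position 1 as exon rather than intron, so the output contains a single exon feature. A real pipeline would also handle strand, phase, and gene/mRNA parent features, which are omitted here.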