Neural Genomes
Genomics Machine Learning Foundation Models

GeneCAD: Plant Genome Annotation

A sequence-only annotation pipeline that turns genome sequences into complete, biologically coherent plant gene models (GFF3).

2026

GeneCAD at a glance

GeneCAD is a sequence-only annotation pipeline that turns genome sequence into complete, biologically coherent plant gene models (GFF3). It combines a DNA foundation model (PlantCAD2), a ModernBERT head, and a chromosome-wide conditional random field (CRF), and it needs no RNA-seq, proteomics, or homology inputs.
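To make the role of the CRF stage more concrete, here is a minimal sketch of constrained decoding in plain Python. It is not GeneCAD's implementation. The three-label scheme, the transition scores, the toy emission scores (which stand in for the per-base outputs of the model head), and every function name are assumptions made for illustration. The idea it shows is that a Viterbi pass over per-base scores, with forbidden label transitions set to negative infinity, can only return a structurally valid label path, which is then collapsed into GFF3 feature lines.

```python
import math

LABELS = ["intergenic", "exon", "intron"]
NEG_INF = -math.inf

# Hypothetical transition scores in log space. Any pair missing from this
# table is forbidden, e.g. intergenic -> intron or intron -> intergenic.
TRANSITIONS = {
    ("intergenic", "intergenic"): 0.0,
    ("intergenic", "exon"): -1.0,
    ("exon", "exon"): 0.0,
    ("exon", "intron"): -1.0,
    ("exon", "intergenic"): -1.0,
    ("intron", "intron"): 0.0,
    ("intron", "exon"): -1.0,
}


def toy_emissions(preferred, hit=0.0, miss=-5.0):
    """Build toy per-base scores that favour the given label at each position."""
    return [{lab: (hit if lab == want else miss) for lab in LABELS}
            for want in preferred]


def viterbi(emissions):
    """Return the highest-scoring label path under TRANSITIONS."""
    if not emissions:
        return []
    score = {lab: emissions[0][lab] for lab in LABELS}
    backptrs = []
    for em in emissions[1:]:
        new_score, ptr = {}, {}
        for cur in LABELS:
            best_prev, best_val = None, NEG_INF
            for prev in LABELS:
                val = score[prev] + TRANSITIONS.get((prev, cur), NEG_INF)
                if val > best_val:
                    best_prev, best_val = prev, val
            new_score[cur] = best_val + em[cur]
            ptr[cur] = best_prev
        score = new_score
        backptrs.append(ptr)
    # Trace back from the best final label.
    path = [max(LABELS, key=lambda lab: score[lab])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def segments(path):
    """Collapse a label path into (label, start, end) runs, 1-based inclusive."""
    runs = []
    for i, lab in enumerate(path, start=1):
        if runs and runs[-1][0] == lab:
            runs[-1] = (lab, runs[-1][1], i)
        else:
            runs.append((lab, i, i))
    return runs


def to_gff3(path, seqid="chr1"):
    """Emit one GFF3 exon line per exon run (forward strand assumed)."""
    lines = ["##gff-version 3"]
    n = 0
    for lab, start, end in segments(path):
        if lab == "exon":
            n += 1
            lines.append("\t".join([seqid, "sketch", "exon", str(start),
                                    str(end), ".", "+", ".", f"ID=exon{n}"]))
    return "\n".join(lines)
```

Calling `to_gff3(viterbi(toy_emissions(labels)))` on a list of preferred labels yields a small GFF3 document. When the per-base preferences are themselves inconsistent (for example an intron flanked directly by intergenic sequence), the decoder still returns a path that respects the transition table, which is the sense in which a CRF layer can enforce coherent gene structure.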

This project tackles one of the most challenging problems in computational biology: annotating large, complex plant genomes efficiently and accurately. Because it relies only on sequence data and deep learning models, GeneCAD can be applied to genomes for which no RNA-seq, proteomics, or homology evidence is available.