reading genomes, bit by bit

We want to be able to read a genome like a book. We are at a remarkable time in biology when at last we can see the source code for life, the complete genomic DNA sequences that specify the development, regulation, and function of organisms. Frustratingly, we are still far from understanding how to read this vast trove of encoded information, and far from being able to reconstruct how it evolved. Our laboratory develops computational methods for genome sequence analysis, in particular for identifying remote evolutionary relationships between protein and RNA sequences.

recent publications

how to reach us

HHMI Janelia Research Campus
19700 Helix Drive
Ashburn, VA 20147, USA
Phone: 571.209.4000 [this is the general Janelia operator number, not an office]


Hidden Markov models for sequence profile analysis.


RNA structure analysis using covariance models.


Database of protein family alignments and hidden Markov models.


The Rfam database of RNA alignments, consensus secondary structures, and profile SCFGs.


The Dfam database of repetitive DNA sequence elements.