The Algorithmic Biology conference is at UCSD today, with the RECOMB Satellites tomorrow. Here is a summary of the keynote talks from Manolis Kellis, Ron Shamir, David Haussler and Serafim Batzoglou.
Manolis Kellis, MIT "Interpreting the human genome"
They use sequence signatures from highly accurate synteny alignments between human, mouse, worm, rat and fly to discover and refine genes, and to refute annotation errors, in each species. This can also be used to find miRNA elements.
They also look at conservation of regulatory motifs across 4 species. The trick for deciding whether a motif is significant (using a Motif Conservation Score, MCS) is to compare its conserved occurrences across the 4 genomes against its occurrences throughout the genome.
They developed a method called SPIDIR to build phylogenies from gene sequences. It scores a tree by the likelihood of its observed branch lengths given two mutation rates: a gene-family-specific rate and a species-specific rate. This method reportedly outperformed traditional methods dramatically.
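A rough way to picture that comparison is as a z-score: count how many of a motif's occurrences are conserved across the species, then ask how far that is from what a background conservation rate would predict. The sketch below is only an illustration under stated assumptions, not necessarily the exact definition used in the talk. The function name, the binomial model and the `background_rate` parameter are all hypothetical.

```python
import math


def motif_conservation_score(conserved, total, background_rate):
    """Binomial z-score for how often a motif is conserved.

    conserved       -- occurrences conserved across all compared genomes
    total           -- all occurrences of the motif in the reference genome
    background_rate -- assumed conservation rate of comparable control motifs
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0.0 < background_rate < 1.0:
        raise ValueError("background_rate must be strictly between 0 and 1")
    expected = total * background_rate
    std_dev = math.sqrt(total * background_rate * (1.0 - background_rate))
    return (conserved - expected) / std_dev


# Illustrative numbers only: 50 of 100 occurrences conserved, 20% background.
score = motif_conservation_score(conserved=50, total=100, background_rate=0.2)
print(f"conservation z-score: {score:.2f}")
```

A motif that is conserved no more often than the background scores near zero, so only motifs with a large positive score would be flagged as significant.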
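To make the two-rate idea concrete, here is a toy sketch in Python. It assumes each branch length is expected to be the gene-family rate times the species rate for that branch, with Gaussian noise around it. The Gaussian noise model, the `rel_sd` parameter and all names are assumptions made for illustration, not SPIDIR's actual model.

```python
import math


def branch_log_likelihood(observed, gene_rate, species_rates, rel_sd=0.2):
    """Log-likelihood of observed branch lengths under a toy two-rate model.

    observed      -- dict mapping branch name to observed branch length
    gene_rate     -- rate specific to this gene family
    species_rates -- dict mapping branch name to its species-specific rate
    rel_sd        -- assumed noise level, as a fraction of the expected length
    """
    total = 0.0
    for branch, length in observed.items():
        mean = gene_rate * species_rates[branch]
        sd = rel_sd * mean
        total += -0.5 * math.log(2.0 * math.pi * sd * sd)
        total -= (length - mean) ** 2 / (2.0 * sd * sd)
    return total


# Made-up rates: the mouse branch evolves three times faster than human.
species_rates = {"human": 0.1, "mouse": 0.3}
consistent = {"human": 0.2, "mouse": 0.6}    # matches a gene rate of 2.0
inconsistent = {"human": 0.6, "mouse": 0.2}  # lengths swapped between branches

print(branch_log_likelihood(consistent, 2.0, species_rates))
print(branch_log_likelihood(inconsistent, 2.0, species_rates))
```

A candidate tree whose branch lengths fit the expected rates gets a higher likelihood than one whose lengths do not, which is the signal such a method can use to choose between topologies.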
Ron Shamir, Tel Aviv University "Some Current Computational Challenges in Biology and Medicine"
He uses biclustering of genes and conditions to find modules that are tightly regulated under specific conditions. They then use de novo motif finding to find motifs across all yeasts using ortholog projection. The patterns of occurrence of these motifs follow known evolutionary traces, and also show the emergence of coordination/evolution of motifs.
They also work on high-accuracy p-value calculation for association scores in case-control studies, with a method called RAT (rapid association test). Unfortunately he didn’t go into how it works and went directly to the results, but it seems to use a reduced search space for sampling random genotypes to estimate a p-value. It is clearly much faster than regular methods, but he only gave a few anecdotal examples of convergence.
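For context, the "regular method" that a faster test would be competing with is typically a plain permutation test: shuffle the case/control labels many times and see how often a shuffled dataset scores at least as high as the real one. The sketch below shows that baseline only, not RAT itself. The statistic (difference in mean minor-allele count), the function names and the toy data are all assumptions.

```python
import random


def association_stat(genotypes, labels):
    """Absolute difference in mean allele count between cases (1) and controls (0)."""
    cases = [g for g, lab in zip(genotypes, labels) if lab == 1]
    controls = [g for g, lab in zip(genotypes, labels) if lab == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def permutation_p_value(genotypes, labels, n_perm=1000, seed=0):
    """Estimate a p-value by shuffling case/control labels n_perm times."""
    rng = random.Random(seed)
    observed = association_stat(genotypes, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if association_stat(genotypes, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: allele counts (0, 1 or 2) at one SNP for 10 cases and 10 controls.
genotypes = [2] * 10 + [0] * 10
labels = [1] * 10 + [0] * 10
print(permutation_p_value(genotypes, labels))
```

The cost is one pass over the data per permutation, and resolving very small p-values needs a very large number of permutations, which is why a smarter sampling scheme is attractive.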
David Haussler, HHMI and UCSC "Reconstructing 100 Million Years of Human Evolutionary History"
He started with a great example, the FOXP2 gene, which separates human speech from others. There’s one amino acid that’s different.
They found evidence for positive selection for a gene expressed during brain development. It has 18 changes in the region between humans and chimps etc. It turns out this region is a structural RNA sequence, not a protein. They did in situ hybridization showing that it’s expressed in the same cortical layer in human and macaque embryos.
There is also work on reconstructing the Boreoeutherian genome.
They use weak lacZ promoters to test enhancer elements, which were discovered based on long stretches of conserved DNA (Nature, Rubin 2006). These ultra-conserved elements are conserved between species, but show some mutations in the population. This means they aren’t just mutational cold spots; changes in them are selected against in humans for some reason. There are hundreds of these in the genome that are conserved only in vertebrates, up to and including the “missing link” fish which was the bridge between water and land animals.
Serafim Batzoglou, Stanford "Models and Algorithms for Genomic Sequences, Proteins, and Networks of Protein Interactions"
They use conditional random fields for protein alignment. These do not depend on BLOSUM, which is itself intricately related to the test sets. They model insertions, deletions and matches as three states and use sequence features. Gaps tend to be found near hydrophilic amino acids. http://contra.stanford.edu/contralign
They made integrated networks for 305 microbes, with 2.81 million interactions. They also made a method for aligning these networks, called Graemlin. The results are reported to be better than MaWish and NetworkBLAST, and to recapitulate MIPS complexes.
Altogether, this was a pretty good day at the Algorithmic Biology conference. This is the first of three days of conferences at UCSD, with RECOMB Systems Biology and Algorithmic Mass Spec on Friday and Saturday. Stay tuned for reports from each day!