Steven L. Salzberg, Ph.D.
Professor
Department of Computer Science
University of Maryland
February 20 at 3:00pm
106 Woodward
Abstract:
Genome sequencing methods that produce sequences shorter than 50 nucleotides present significant challenges to genome assembly and sequence alignment algorithms. When attempting to assemble these short reads, most assemblers will produce highly fragmented assemblies, with breaks occurring at the location of every repetitive sequence longer than a read. We have developed a new assembly algorithm that overcomes most of the major difficulties of short-read assembly. One of the key innovations is the use of predicted genes to span gaps, which we call gene-boosted assembly [1]. This method is particularly effective for gene-dense species including bacteria and viruses.
Using our new algorithm in conjunction with several other techniques, we assembled over 8.6 million reads, each of them 33 nt in length, from a bacterial genome sequenced with an Illumina Genome Analyzer. We were able to assemble the genome into fewer than 100 large contigs. The consensus sequence accuracy is >99.97%, and over 97% of the genes are contained within contigs.
In the second part of my talk, I will address the problem of rapid alignment of short reads to the human genome. We have developed a new program, Bowtie, based on the Burrows-Wheeler Transform, that aligns short reads at very high speed with very modest memory requirements. Bowtie can align reads to the human genome using only a standard desktop workstation, and in benchmarks it runs dozens to hundreds of times faster than competing systems. If time permits, I will also discuss a second program, TopHat, which aligns short reads from RNA-Seq experiments to a genome, allowing for alignments that are interrupted by introns.
This talk describes joint work with Dan Sommer, Daniela Puiu, Ben Langmead, Mihai Pop, Cole Trapnell, and Lior Pachter.
1. S.L. Salzberg, D.D. Sommer, D. Puiu, and V.T. Lee. Gene-Boosted Assembly of a Novel Bacterial Genome from Very Short Reads. PLoS Computational Biology 4:9 (2008): e1000186.
Bio:
Steven Salzberg is the Director of the Center for Bioinformatics and Computational Biology (CBCB) and the Horvitz Professor of Computer Science at the University of Maryland, College Park. From 1997 to 2005 he was at The Institute for Genomic Research (TIGR) in Rockville, Maryland, where he was the Senior Director of Bioinformatics. During that time he was also a Research Professor of Computer Science and Biology at Johns Hopkins University. Dr. Salzberg received his B.A., M.S., and M.Phil. degrees from Yale University, and his Ph.D. in Computer Science from Harvard University. He joined the Computer Science Department at Johns Hopkins as an Assistant Professor in 1989.
As part of his gene-finding research, which began in the 1990s, Salzberg and his colleagues built the Glimmer system for bacterial gene finding, which has become one of the world's most successful and widely used gene finders. Glimmer has been used in hundreds of bacterial, archaeal, and viral genome projects, including Bacillus anthracis, Borrelia burgdorferi, Treponema pallidum, Vibrio cholerae, and Mycobacterium tuberculosis. Eukaryotic gene finders developed by Salzberg's group have been applied to the genomes of animals (including human), plants, and eukaryotic parasites including Plasmodium falciparum (malaria), Brugia malayi, and Trypanosoma brucei. Salzberg's lab has also developed algorithms for large-scale genome sequence alignment and genome assembly, including the AMOS genome assembler and the MUMmer alignment system. All of their software is free and open-source, and has been downloaded by thousands of users around the globe.
Dr. Salzberg has authored or co-authored over 150 scientific journal publications and two books. He is a Fellow of the American Association for the Advancement of Science (AAAS) and a past member of the Board of Scientific Counselors of the National Center for Biotechnology Information at NIH. He currently serves on the Editorial Boards of the journals BMC Biology, Journal of Computational Biology, PLoS ONE, BMC Genomics, BMC Bioinformatics, Biology Direct, and Evolutionary Bioinformatics Online, and is a member of the Faculty of 1000.
Copyright © 2003-2008 College of Computing and Informatics