Aug 1, 2012

I led a workshop on 7/31/2012 about NGS for Drug Development at this Hanson Wade conference in Boston:

NGS Bioinformatics for Drug Developers

I made a huge PPT slide deck to fill 2 hours, but in the workshop we ended up in a multi-way discussion for most of the time, so I never got to show half the slides. I am posting them on BOX.com so I don't feel like the effort was wasted.

NGS drug dev PPT slides

Jun 8, 2012

MiSeq replaces 454 in Microbiome projects

One of my Bioinformatics students, Laura Cox, is working on a Human Microbiome Project study with Martin Blaser. Yesterday, she presented a lab report to a standing-room-only crowd about her sequencing work on bacterial populations using the Illumina MiSeq machine. Until now, HMP work was about the only sequencing that we consistently ran on the 454 machine. Laura showed that with the MiSeq paired-end 150 bp sequencing protocol, it was possible to sequence 16S amplicons (in the V4 region) from both ends and stitch them together using the ea-utils FASTQ-join [Erik Aronesty (2011). ea-utils : "Command-line tools for processing biological sequencing data"; http://code.google.com/p/ea-utils] to get about 260 bp reads on each amplicon. Laura used a custom multiplex scheme to get 192 different samples into one MiSeq run, which, after demultiplexing, gave about 2,000 paired-end reads per sample.
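To make the stitching step concrete: two 150 bp reads that join into a roughly 260 bp amplicon must overlap by about 150 + 150 - 260 = 40 bp in the middle. Here is a minimal Python sketch of that idea. It is not the ea-utils algorithm, just a simplified illustration; the function names, the minimum overlap, and the mismatch tolerance are all my own assumptions, and real tools also use the base quality scores.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def join_pair(read1, read2, min_overlap=10, max_mismatch_frac=0.1):
    """Join a forward read and a reverse read into one sequence.

    read2 is given as it comes off the sequencer (reverse strand).
    Returns the joined sequence, or None if no acceptable overlap exists.
    min_overlap and max_mismatch_frac are illustrative defaults only.
    """
    r2 = reverse_complement(read2)
    # Try the longest possible overlap first, then shrink.
    for overlap in range(min(len(read1), len(r2)), min_overlap - 1, -1):
        tail = read1[-overlap:]
        head = r2[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * overlap:
            return read1 + r2[overlap:]
    return None


def random_amplicon(length, seed=0):
    """Build a reproducible pseudo-random test amplicon (toy data only)."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


def simulate_pair(amplicon, read_len=150):
    """Simulate error-free paired-end reads from both ends of an amplicon."""
    read1 = amplicon[:read_len]
    read2 = reverse_complement(amplicon[-read_len:])
    return read1, read2
```

Calling `join_pair(*simulate_pair(random_amplicon(260)))` should hand back the original 260 bp sequence, because the error-free reads agree perfectly across the 40 bp overlap.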


She also demonstrated that the resulting sequence data could be processed with QIIME [http://www.qiime.org] to get reasonable taxonomy information, build phylogenetic trees, and apply all the cute tools to calculate diversity and compare groups of samples by PCA and UniFrac [http://bmf2.colorado.edu/unifrac].


The economics of MiSeq are persuasive. It delivers amplicon data at roughly one-fortieth the cost of 454. As our HMP protocols shift over to MiSeq, this will be the last year that we keep the 454 machine in the Genomics Core Lab.



Mar 14, 2012

Collaborative work on Exome SNPs

I have noticed that a fair number of people who actually work with Next-Gen Sequence data read this blog, so perhaps we can use it for a collaborative project.

I want to write a paper about uneven coverage in exome sequencing leading to incorrect SNP calls. Our data is from tumor-normal pairs, and we see a lot of false negatives: failures to detect a SNP in a sample due to low coverage at that spot. Exome capture methods seem to have more than their fair share of low-coverage spots (even with an average coverage over 100x), and these low-coverage spots do differ somewhat from sample to sample. I'd like some other people to share data with us and/or do some similar analysis on other data sets so that we can make a stronger paper.
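For anyone who wants to try a comparable analysis, here is a rough Python sketch of the kind of comparison I have in mind: find the low-coverage positions in each sample, then ask which ones are shared and which are specific to one sample. The 10x cutoff, the function names, and the input layout (a dict of position to depth per sample) are assumptions for illustration, not our actual pipeline.

```python
def low_coverage_positions(depths, min_depth=10):
    """Return the set of positions whose depth is below min_depth.

    depths maps a position (e.g. a genomic coordinate) to read depth.
    The default cutoff of 10x is an arbitrary illustrative choice.
    """
    return {pos for pos, depth in depths.items() if depth < min_depth}


def compare_low_coverage(sample_depths, min_depth=10):
    """Compare low-coverage positions across samples.

    sample_depths maps a sample name to its position -> depth dict.
    Returns (shared, unique): positions low in every sample, and a dict
    of positions that are low in exactly that one sample.
    """
    if not sample_depths:
        return set(), {}
    low = {
        name: low_coverage_positions(depths, min_depth)
        for name, depths in sample_depths.items()
    }
    shared = set.intersection(*low.values())
    unique = {}
    for name, positions in low.items():
        others = [s for other, s in low.items() if other != name]
        unique[name] = positions - set().union(*others)
    return shared, unique


# Toy depths for a tumor-normal pair (made-up numbers).
toy_pair = {
    "tumor": {100: 120, 101: 4, 102: 95, 103: 3},
    "normal": {100: 80, 101: 6, 102: 2, 103: 50},
}
shared_low, unique_low = compare_low_coverage(toy_pair)
```

In the toy pair, position 101 is low in both samples, while 103 is low only in the tumor and 102 only in the normal. A SNP at 102 that was called in the tumor would look "somatic" only because the normal could not be called there.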

Feb 20, 2012

$18,000 Cancer Genome

Illumina has announced an updated and improved cancer genome sequencing service for $18,000. This will provide genome sequencing of the tumor at 80x and of normal tissue from the same patient at 40x. This is similar to the coverage offered by Complete Genomics for a similar service (at $12K). Illumina also offers a novel sample prep method (in partnership with the Broad Institute) for very small samples and FFPE.

Perhaps the most interesting thing about the Illumina service is the bioinformatics support, which will include a new variant detection algorithm that looks at both the tumor and normal together, in order to reduce the false positives. The standard approach, available from other software, does variant detection separately for each sample, then tries to subtract the variants found in the normal from those found in the tumor. This method works very poorly, since many variants cannot be called accurately in tumor samples, which contain varying amounts of normal tissue mixed in as well as tumor genomic heterogeneity. Many other existing variants are simply not called in the normal sample (i.e., false negatives) due to poor coverage, poor quality, nearby insertions/deletions, or any other feature that fails stringent variant detection software. We have been working on this same approach to the problem, but Illumina brings a much bigger team with access to a LOT more data. Illumina will also provide custom annotation of discovered variants by a team of human bioinformaticians (rather than just running the data through a static annotation software pipeline).
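To show why subtraction goes wrong, here is a toy Python sketch contrasting the two strategies. This is not Illumina's algorithm (I have not seen it) and not our production code; the function names and both thresholds are assumptions chosen only to illustrate the logic of looking at the raw evidence in the normal sample instead of its final call list.

```python
def naive_somatic(tumor_calls, normal_calls):
    """Subtraction approach: tumor variants minus variants called in normal."""
    return set(tumor_calls) - set(normal_calls)


def joint_somatic(tumor_calls, normal_counts,
                  min_normal_depth=10, max_normal_alt_frac=0.02):
    """Classify each tumor variant using raw read counts from the normal.

    normal_counts maps position -> (ref_reads, alt_reads) in the normal.
    Both thresholds are illustrative assumptions, not validated values.
    Returns a dict of position -> "somatic", "germline", or "unknown".
    """
    result = {}
    for pos in sorted(tumor_calls):
        ref_reads, alt_reads = normal_counts.get(pos, (0, 0))
        depth = ref_reads + alt_reads
        if depth < min_normal_depth:
            # Too little normal coverage to say anything either way.
            result[pos] = "unknown"
        elif alt_reads / depth > max_normal_alt_frac:
            # The variant allele is visible in the normal reads.
            result[pos] = "germline"
        else:
            result[pos] = "somatic"
    return result


# Made-up example: three variants called in the tumor.
tumor_calls = {10, 20, 30}
normal_calls = {10}  # the normal caller only reported position 10
normal_counts = {10: (20, 18), 20: (3, 2), 30: (60, 0)}

naive_result = naive_somatic(tumor_calls, normal_calls)
joint_result = joint_somatic(tumor_calls, normal_counts)
```

Subtraction reports positions 20 and 30 as somatic. The joint view marks position 20 as unknown, because the normal has only 5 reads there (2 of them carrying the variant), so the normal caller's silence at that spot is a false negative rather than evidence of a somatic change.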

I think this is a more realistic milestone for clinical sequencing than the mythical $1000 genome. Cancer patients are one of the few (common) clinical scenarios where whole genome sequencing could really pay off with actionable discoveries, allowing genetic information to be used to choose targeted drugs and other interventions. A simple genome sequence (at whatever coverage) of a healthy person does not provide much medically actionable data today. Furthermore, the informatics that can currently be applied to a single, cheaply acquired genome sequence ranges from relatively inexpensive but simplistic one-size-fits-all software pipelines to the equally mythical $100,000 interpretation (presumably provided by a dedicated project team of expert informaticians and medical geneticists).

Dec 15, 2011

Job Opening: Sequencing Informatics Scientist

One of my very good informatics people is leaving at the end of the year, so I have a job vacancy to fill in our Sequencing Informatics unit (funded as an Institutional core to support our Next-Gen Sequencing Lab and our investigators, not from any one grant). I want someone with either a Masters and some experience with Next-Gen sequencing informatics, or a PhD (bioinformatics, computer science, or something similar) who is looking for a more stable, service-oriented position rather than the usual highly competitive postdoc. There will be opportunity for both collaborative and independent work on various projects, and publications are expected. UNIX/Perl/Java skills are necessary.

The job previously involved informatics support for 454 sequencing, but that turned out to be less than 30% of the actual work. Looking forward, our Microbiome work will be done mostly on Illumina, bacterial genomes on Illumina... you get the idea. Send CVs to stuart.brown@gmail.com