Together with Ike Wirgin in the NYUMC Dept. of Environmental Medicine, we just published a paper in Genome Biology and Evolution on a gene expression study of a fish
population in the Hudson River: "A Dramatic Difference in Global Gene Expression between TCDD-Treated Atlantic Tomcod Larvae from the Resistant Hudson River and a Nearby Sensitive Population"
Atlantic tomcod (Microgadus tomcod) is a fish that Ike has been studying for many years as an indicator of biological responses to toxic pollution of the Hudson River estuary, which contains two of the largest Superfund sites in the nation because of PCB and dioxin (TCDD) contamination. It was previously shown that tomcod from the Hudson River had an extraordinarily high prevalence of liver tumors in the 1970s, exceeding 90% in older fish. But in 2006 they found that the Hudson River population of this fish had many fewer tumors and a 100-fold reduction in induction of the Cytochrome P450 pathway in response to dioxin and PCB exposure (Wirgin and Chambers 2006). In a 2011 Science paper, they reported a two-amino-acid deletion in the AHR2 gene in the Hudson River population that was absent, or nearly so, in all other tomcod populations. The aryl hydrocarbon receptor is responsible for induction of CYP genes in all vertebrates and for activation of most toxicities from these contaminants (Wirgin et al. 2011).
Our goal for this project was to build a de novo sequence of the genome of the
tomcod, annotate all the genes, and do a global analysis of gene expression
(with RNAseq) to look at the genome-wide effects of the AHR2 mutation in Hudson
River larvae as compared to wild type fish (collected from a clean location at
Shinnecock Bay, on the South Shore of Long Island, in the Hamptons). All DNA
and RNA sequencing was done at the NYUMC Genome Technology Center under the
direction of Adriana Heguy.
From a bioinformatics point of view, this project was interesting because we
decided to integrate both a genome assembly and multiple transcriptome
assemblies to get the most complete set of full-length protein coding genes.
For the transcriptome, we did de novo assembly with rnaSPAdes on many different
RNAseq samples including embryo, juvenile, and adult liver. We made the genome
assembly with SOAPdenovo2, and then predicted gene coding regions with
GlimmerHMM. We combined all of these different sets of transcript coding
sequences with the EvidentialGene pipeline created by Don Gilbert. With the
final merged set of transcripts, we used Salmon to (very quickly) quasi-map the
reads in each RNAseq sample onto the transcriptome and quantify gene
expression. Differential gene expression was computed with edgeR.
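To make the quantification step a little more concrete, here is a minimal Python sketch of what happens to Salmon output before any statistics are applied. It reads the NumReads column from Salmon's quant.sf table, scales each sample to counts per million, and computes a log2 fold change between a treated and a control sample. This is not the edgeR analysis from the paper: edgeR fits negative binomial models across replicates, while this only compares two samples directly. The transcript names, the read counts, and the pseudocount of 1 are all made up for illustration.

```python
import csv
import io
import math


def read_quant_sf(text):
    """Parse the text of a Salmon quant.sf file into {transcript: NumReads}."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["Name"]: float(row["NumReads"]) for row in reader}


def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: 1e6 * n / total for name, n in counts.items()}


def log2_fold_change(treated, control, pseudocount=1.0):
    """Per-transcript log2(treated CPM / control CPM).

    The pseudocount (an assumption of this sketch) avoids division by zero
    for transcripts with no reads in one of the samples.
    """
    cpm_t = counts_per_million(treated)
    cpm_c = counts_per_million(control)
    names = sorted(set(cpm_t) | set(cpm_c))
    return {
        name: math.log2(
            (cpm_t.get(name, 0.0) + pseudocount)
            / (cpm_c.get(name, 0.0) + pseudocount)
        )
        for name in names
    }


def make_quant_sf(rows):
    """Build quant.sf-style text from (name, num_reads) pairs (toy helper)."""
    header = ["Name", "Length", "EffectiveLength", "TPM", "NumReads"]
    lines = ["\t".join(header)]
    for name, num_reads in rows:
        # Length, EffectiveLength and TPM are placeholders; only NumReads is used.
        lines.append("\t".join([name, "1000", "800.0", "0.0", str(num_reads)]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical toy samples, not data from the study.
    control = read_quant_sf(make_quant_sf([("tx_A", 100), ("tx_B", 900)]))
    treated = read_quant_sf(make_quant_sf([("tx_A", 800), ("tx_B", 200)]))
    for name, lfc in log2_fold_change(treated, control).items():
        print(f"{name}\tlog2FC={lfc:.2f}")
```

In a real analysis the counts from all replicates would go into edgeR (or a similar tool) so that fold changes come with dispersion estimates and p-values rather than a bare ratio.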
The results were extremely dramatic. At low doses of dioxin, the wild type larvae show a huge gene expression response, with about a thousand genes having large fold-changes (some key genes were validated by qRT-PCR). The mutant Hudson River larvae basically ignore these low doses, with almost no gene expression changes. At the highest dose (1 ppb), the Hudson River fish show some gene expression changes, but mostly not in the same genes as in the wild type fish. Even the negative control larvae (not treated with dioxin) show a large difference in gene expression between the two populations.