Aug 24, 2017

Gene expression analysis shows that an AHR2 mutant fish population from the Hudson River has a dramatically reduced response to dioxin



Together with Ike Wirgin in the NYUMC Dept. of Environmental Medicine, we just published a paper in Genome Biology and Evolution on a gene expression study of a fish population in the Hudson River: "A Dramatic Difference in Global Gene Expression between TCDD-Treated Atlantic Tomcod Larvae from the Resistant Hudson River and a Nearby Sensitive Population".



Atlantic tomcod (Microgadus tomcod) is a fish that Ike has been studying for many years as an indicator of biological responses to toxic pollution of the Hudson River estuary, which contains two of the largest Superfund sites in the nation because of PCB and dioxin (TCDD) contamination. It was previously shown that tomcod from the Hudson River had an extraordinarily high prevalence of liver tumors in the 1970s, exceeding 90% in older fish. But in 2006 they found that the Hudson River population of this fish had many fewer tumors and a 100-fold reduction in induction of the Cytochrome P450 pathway in response to dioxin and PCB exposure (Wirgin and Chambers 2006). In a 2011 Science paper, they reported a two-amino-acid deletion in the AHR2 gene in the Hudson River population that was absent, or nearly so, in all other tomcod populations. The aryl hydrocarbon receptor is responsible for induction of CYP genes in all vertebrates and for activating most of the toxicities from these contaminants (Wirgin et al. 2011).


Our goal for this project was to build a de novo sequence of the genome of the tomcod, annotate all the genes, and do a global analysis of gene expression (with RNAseq) to look at the genome-wide effects of the AHR2 mutation in Hudson River larvae as compared to wild-type fish (collected from a clean location at Shinnecock Bay, on the South Shore of Long Island, in the Hamptons). All DNA and RNA sequencing was done at the NYUMC Genome Technology Center under the direction of Adriana Heguy.

From a bioinformatics point of view, this project was interesting because we decided to integrate both a genome assembly and multiple transcriptome assemblies to get the most complete set of full-length protein-coding genes. For the transcriptome, we did de novo assembly with rnaSPAdes on many different RNAseq samples, including embryo, juvenile, and adult liver. We made the genome assembly with SOAPdenovo2, and then predicted gene coding regions with GlimmerHMM. We combined all of these different sets of transcript coding sequences with the EvidentialGene pipeline created by Don Gilbert. With the final merged set of transcripts, we used Salmon to (very quickly) quasi-map the reads in each RNAseq sample onto the transcriptome and quantify gene expression. Differential gene expression was computed with edgeR.
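For readers curious about the step between Salmon and edgeR, here is a minimal Python sketch of how per-sample Salmon output can be collected into one count matrix. It is an illustration, not the code we actually ran. It relies only on the standard quant.sf layout (a tab-separated table with Name and NumReads columns); the sample names, transcript IDs, and numbers are made up for the example.

```python
import csv
import io


def read_quant_sf(handle):
    """Parse one Salmon quant.sf table into {transcript_name: NumReads}."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Name"]: float(row["NumReads"]) for row in reader}


def build_count_matrix(samples):
    """Combine per-sample counts into a transcripts x samples matrix.

    samples: {sample_name: {transcript_name: count}}
    Returns (transcripts, sample_names, rows); a transcript missing
    from a sample gets a count of 0.0.
    """
    sample_names = sorted(samples)
    transcripts = sorted(set().union(*(samples[s] for s in sample_names)))
    rows = [[samples[s].get(t, 0.0) for s in sample_names] for t in transcripts]
    return transcripts, sample_names, rows


# Toy quant.sf contents (invented values, hypothetical sample names).
HEADER = "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
toy_files = {
    "hudson_ctrl": HEADER + "tx1\t1000\t850.0\t10.0\t120.0\ntx2\t500\t350.0\t5.0\t30.0\n",
    "shinnecock_ctrl": HEADER + "tx1\t1000\t850.0\t2.0\t25.0\ntx3\t800\t650.0\t7.0\t60.0\n",
}

samples = {name: read_quant_sf(io.StringIO(text)) for name, text in toy_files.items()}
transcripts, sample_names, rows = build_count_matrix(samples)

for tx, row in zip(transcripts, rows):
    print(tx, row)
```

In a real analysis the matrix would be built from quant.sf files on disk (opened with open(path, newline="")) and then handed to edgeR in R for normalization and testing.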

The results were dramatic. At low doses of dioxin, the wild-type larvae show a huge gene expression response, with about a thousand genes having large fold-changes (some key genes were validated by qRT-PCR). The mutant Hudson River larvae basically ignore these low doses, with almost no gene expression changes. At the highest dose (1 ppb), the Hudson River fish show some gene expression changes, but mostly not in the same genes as in the wild-type fish. Even the negative control larvae (not treated with dioxin) show a large difference in gene expression between the two populations.
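To make "large fold-changes" concrete, here is a rough Python sketch of a log2 fold-change between a treated and a control sample. It is deliberately simplified: counts are scaled to counts per million and a pseudocount is added before taking the ratio. This is not what edgeR does (edgeR fits a negative binomial model across replicates), and the gene names, counts, and the cutoff of 1 are all invented for illustration.

```python
import math


def counts_per_million(counts):
    """Scale raw counts so that each sample sums to one million."""
    total = sum(counts.values())
    return {gene: c * 1e6 / total for gene, c in counts.items()}


def log2_fold_changes(treated, control, pseudocount=1.0):
    """log2(treated / control) per gene on CPM values.

    The pseudocount avoids division by zero for unexpressed genes.
    Genes absent from the treated sample are treated as 0 counts.
    """
    cpm_t = counts_per_million(treated)
    cpm_c = counts_per_million(control)
    return {
        gene: math.log2((cpm_t.get(gene, 0.0) + pseudocount) / (cpm_c[gene] + pseudocount))
        for gene in cpm_c
    }


# Invented toy counts for two hypothetical genes.
control = {"geneA": 100, "geneB": 900}
treated = {"geneA": 800, "geneB": 200}

lfc = log2_fold_changes(treated, control)
# Genes whose expression changes at least 2-fold in either direction.
responsive = sorted(g for g, v in lfc.items() if abs(v) >= 1.0)
print(lfc, responsive)
```

A log2 fold-change of 3 means roughly an 8-fold increase; a negative value means the gene went down after treatment.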