These boil down to four basic functions:
- peak calling - taking sequence reads aligned to a reference genome and counting the number of hits per genome interval, subtracting background or a control lane, smoothing, cutting off shoulders, splitting double peaks, and computing a statistic that indicates whether the peaks are real or false positives
- annotation - finding the location of peaks on the genome as compared to known features, especially the transcription start sites of known genes
- visualization - looking at peaks in one of the genome browsers
- motif detection - finding patterns of common bases within the peaks, comparing these patterns with known transcription factor binding sites
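The first two functions can be illustrated with a minimal sketch. It assumes reads are given as 5' start coordinates on a single chromosome and that peaks are simply fixed-width bins whose control-subtracted count passes a threshold; smoothing, shoulder trimming, and peak splitting are left out. The function names, the `bin_size` and `min_net` parameters, and the sample coordinates are all hypothetical and are not taken from any of the packages listed here.

```python
import bisect
from collections import Counter


def bin_counts(read_starts, bin_size):
    """Count aligned reads per fixed-width genome interval (bin index -> count)."""
    return Counter(pos // bin_size for pos in read_starts)


def call_peaks(chip_starts, control_starts, bin_size=100, min_net=5):
    """Return (start, end, net_count) for bins whose control-subtracted count is >= min_net."""
    chip = bin_counts(chip_starts, bin_size)
    control = bin_counts(control_starts, bin_size)
    peaks = []
    for b in sorted(chip):
        net = chip[b] - control.get(b, 0)  # subtract the control lane
        if net >= min_net:
            peaks.append((b * bin_size, (b + 1) * bin_size, net))
    return peaks


def nearest_tss(peak_start, peak_end, tss_sorted):
    """Return the transcription start site closest to the peak midpoint.

    tss_sorted must be a sorted, non-empty list of TSS coordinates.
    """
    mid = (peak_start + peak_end) // 2
    i = bisect.bisect_left(tss_sorted, mid)
    candidates = tss_sorted[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda tss: abs(tss - mid))


# Made-up coordinates, for illustration only.
chip_reads = [105, 110, 120, 130, 150, 160, 199, 450]
control_reads = [115]
for start, end, net in call_peaks(chip_reads, control_reads):
    print(start, end, net, nearest_tss(start, end, [50, 400, 900]))
```

Real peak callers replace the fixed threshold with a statistical model, which is where the packages below differ most.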
We have evaluated quite a few software packages that provide one or more of these functions:
"An integrated software system for analyzing ChIP-chip and ChIP-seq data"
Ji H, Jiang H, Ma W, Johnson DS, Myers RM, Wong WH.
BC Cancer Agency: FindPeaks
This is a good peak finder, easy to use, with a reasonable statistical model (based on comparing your genome-mapped data against a Monte Carlo random distribution of tags).
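The general idea of comparing observed data against a Monte Carlo random distribution can be sketched as follows. This is not FindPeaks code, and its actual model may differ. The sketch assumes tags fall uniformly at random along a genome of known length, and it estimates a false discovery rate as the expected number of random bins at or above a height threshold divided by the number of observed bins at or above it. All names and parameters are hypothetical.

```python
import random
from collections import Counter


def simulate_bin_heights(n_tags, genome_length, bin_size, rng):
    """Place n_tags uniformly at random and return the tag count of each occupied bin."""
    counts = Counter(rng.randrange(genome_length) // bin_size for _ in range(n_tags))
    return list(counts.values())


def empirical_fdr(observed_heights, n_tags, genome_length, bin_size,
                  threshold, n_sim=20, seed=0):
    """Estimate FDR at a height threshold from n_sim random simulations.

    Returns None when no observed bin reaches the threshold.
    """
    rng = random.Random(seed)
    observed = sum(1 for h in observed_heights if h >= threshold)
    if observed == 0:
        return None
    null_total = 0
    for _ in range(n_sim):
        heights = simulate_bin_heights(n_tags, genome_length, bin_size, rng)
        null_total += sum(1 for h in heights if h >= threshold)
    expected_null = null_total / n_sim  # average random bins passing the threshold
    return min(1.0, expected_null / observed)


# Made-up example: 1,000 tags on a 100 kb genome, 100 bp bins.
observed = [50, 40, 3]
print(empirical_fdr(observed, 1000, 100_000, 100, threshold=20))
```

A uniform background is a strong simplification; mappability and local biases make real backgrounds uneven, which is one reason to use a control lane where available.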
SISSRS (Site Identification from Short Sequence Reads)
Makes use of +/- strand information in ChIP-Seq reads to precisely identify transcription factor binding sites within a few tens of base pairs.
Jothi R, Cuddapah S, Barski A, Cui K, Zhao K. Genome-wide identification of in vivo protein-DNA binding sites from ChIP-Seq data. Nucleic Acids Res. 2008 Sep;36(16):5221-31.
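To show why strand information helps, here is a deliberately simplified stand-in; it is not the SISSRS algorithm. It assumes that, around a single binding site, forward-strand reads start upstream of the site and reverse-strand reads have their 5' ends downstream of it, so the site lies roughly midway between the two groups. The function name and the sample coordinates are hypothetical.

```python
from statistics import median


def estimate_binding_site(forward_5p, reverse_5p):
    """Estimate a binding site from stranded read 5' positions around one peak.

    forward_5p: 5' coordinates of reads on the + strand (expected upstream of the site)
    reverse_5p: 5' coordinates of reads on the - strand (expected downstream of the site)
    Returns (site, strand_offset): the midpoint between the two medians and the
    distance between them, a rough proxy for the average fragment length.
    """
    if not forward_5p or not reverse_5p:
        raise ValueError("need reads on both strands")
    fwd = median(forward_5p)
    rev = median(reverse_5p)
    return (fwd + rev) / 2, rev - fwd


# Made-up coordinates, for illustration only.
site, offset = estimate_binding_site([100, 110, 120], [280, 290, 300])
print(site, offset)
```

Medians are used rather than means so that a few stray reads far from the peak do not pull the estimate.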
Written by Yong Zhang and Tao Liu from the lab of Shirley Liu at Harvard.
C++ program (requires a C++ compiler) - author Anton Valouev in the Sidow lab at Stanford
Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data, Valouev, et al. Nature Methods 5, 829 - 834 (2008)
Wold Lab software suite (@ Caltech)
Peak finder and visualization via the UCSC Genome Browser
MIT/Broad Institute Integrative Genomics Viewer (IGV)
Note the alignment processor, which creates tag counts from next-generation sequencing aligned reads (such as Eland output files).
Web-based peak calling at the Swiss Institute of Bioinformatics
ChIPDiff - identification of differential histone modification sites by comparison of two ChIP-Seq libraries prepared from different tissues (various cell types, stages, or environmental responses). Uses a Hidden Markov Model to identify differences in ChIP tag counts.
http://bioinformatics.oxfordjournals.org/cgi/content/full/24/20/2344
Available from the Genome Institute of Singapore