Oct 20, 2008

Public ChIP-Seq Data

Here are some published ChIP-Seq data sets that are publicly available.



NHLBI

Valouev et al., Sidow lab @ Stanford

Robertson et al., 2007, Nature Methods 4(8): 651-7.
Eland-processed sequence reads and FindPeaks output for the Stat1 and FoxA2 transcription factors






2 comments:

Anonymous said...

Another published ChIP-seq data set (from Chen et al. Cell 2008 http://dx.doi.org/10.1016/j.cell.2008.04.043)

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE11431

This record contains genomic coordinate files for ChIP peaks for about 15 (mostly stemness-related) transcription factors in mouse embryonic stem cells.

velikkakam said...

I would like to visualise the raw data (all possible binding sites) from a few papers, but since I don't know much about the data formats, I got confused. E.g. GSM162004 (the experiment entry GSE7065 from the .tar file is in binary .CEL format, so I downloaded the individual txt files):
x1086_y1242_r2_Tag_2 89.25
x1087_y1242_r2_Tag_2 149.25
x1086_y1241_r2_Tag_10 141.25
and
GSM506211_090313_s_3_seq_WKK13.soa:
1 36 - chr3 2635473
1 36 + chrC 26323
1 36 + chr3 4070645

How can I convert them to GFF/WIG files? I made a guess like this:
GSM162004.gff
r2 . tarray 2 26 89.25 . . Name=r2_Tag_2
r2 . tarray 2 26 149.25 . . Name=r2_Tag_2
r2 . tarray 10 34 141.25 . . Name=r2_Tag_10

GSM506211:
chr3 2635473 2635438 1
chrC 26323 26358 1
chr3 4070645 4070680 1

Is that correct? In both cases I see some unusual patterns. In GSM162004, the reference sequence column has values like 25S, 5S, Actin, chloroplast, chr1, chr2, gcBin03, gcBin04, gcBin05, ..., and for GSM506211 the score is always 1. Could you please guide me?
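
For the GSM506211-style lines, a minimal Python sketch of one possible conversion is given here. It rests on assumptions that the comment does not confirm: that the five columns are hit count, read length, strand, chromosome, and the 1-based leftmost position of the read on the forward strand. The function names and the bin size are hypothetical. Two points it illustrates: an interval should always have start <= end, whatever the strand, and a per-read score of 1 is expected, so a viewable signal comes from counting reads per bin. The output rows follow the bedGraph convention of 0-based, half-open intervals.

```python
from collections import Counter


def parse_soa_line(line):
    """Parse one line into (chrom, start, end, strand), 1-based inclusive.

    Assumed column order: hits, read length, strand, chromosome, position.
    Assumed meaning of position: leftmost base on the forward strand.
    """
    _hits, length, strand, chrom, pos = line.split()
    start = int(pos)
    end = start + int(length) - 1  # start <= end for both strands
    return chrom, start, end, strand


def binned_counts(lines, bin_size=100):
    """Count reads per fixed-size bin, keyed by (chrom, bin index)."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        chrom, start, _end, _strand = parse_soa_line(line)
        counts[(chrom, (start - 1) // bin_size)] += 1
    return counts


def to_bedgraph(counts, bin_size=100):
    """Format the counts as bedGraph rows (0-based, half-open intervals)."""
    rows = []
    for (chrom, index), n in sorted(counts.items()):
        rows.append(f"{chrom}\t{index * bin_size}\t{(index + 1) * bin_size}\t{n}")
    return rows


if __name__ == "__main__":
    sample = [
        "1 36 - chr3 2635473",
        "1 36 + chrC 26323",
        "1 36 + chr3 4070645",
    ]
    for row in to_bedgraph(binned_counts(sample)):
        print(row)
```

With only three reads every bin still has a count of 1; the counts only become informative once the whole file is read.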