The Innate Immune Database (IIDB)
- Martin Korb†1,
- Alistair G Rust†1,
- Vesteinn Thorsson1,
- Christophe Battail2,
- Bin Li1,
- Daehee Hwang3,
- Kathleen A Kennedy1,
- Jared C Roach1,
- Carrie M Rosenberger1,
- Mark Gilchrist1,
- Daniel Zak1,
- Carrie Johnson1,
- Bruz Marzolf1,
- Alan Aderem1,
- Ilya Shmulevich1 and
- Hamid Bolouri1
© Korb et al; licensee BioMed Central Ltd. 2008
Received: 11 June 2007
Accepted: 05 March 2008
Published: 05 March 2008
As part of a National Institute of Allergy and Infectious Diseases funded collaborative project, we have performed over 150 microarray experiments measuring the response of C57/BL6 mouse bone marrow macrophages to toll-like receptor stimuli. These microarray expression profiles are available freely from our project web site http://www.innateImmunity-systemsbiology.org. Here, we report the development of a database of computationally predicted transcription factor binding sites and related genomic features for a set of over 2000 murine immune genes of interest. Our database, which includes microarray co-expression clusters and a host of web-based query, analysis and visualization facilities, is available freely via the internet. It provides a broad resource to the research community, and a stepping stone towards the delineation of the network of transcriptional regulatory interactions underlying the integrated response of macrophages to pathogens.
We constructed a database indexed on genes and annotations of the immediate surrounding genomic regions. To facilitate both gene-specific and systems biology oriented research, our database provides the means to analyze individual genes or an entire genomic locus. Although our focus to date has been on mammalian toll-like receptor signaling pathways, our database structure is not limited to this subject, and is intended to be broadly applicable to immunology. By focusing on selected immune-active genes, we were able to perform computationally intensive expression and sequence analyses that would currently be prohibitive if applied to the entire genome. Using six complementary computational algorithms and methodologies, we identified transcription factor binding sites based on the Position Weight Matrices available in TRANSFAC. For one example transcription factor (ATF3) for which experimental data are available, over 50% of our predicted binding sites coincide with genome-wide chromatin immunoprecipitation (ChIP-chip) results. Our database can be interrogated via a web interface. Genomic annotations and binding site predictions can be automatically viewed with a customized version of the Argo genome browser.
We present the Innate Immune Database (IIDB) as a community resource for immunologists interested in gene regulatory systems underlying innate responses to pathogens. The database website can be freely accessed at http://db.systemsbiology.net/IIDB.
Extensive transcriptional regulation underlies macrophage responses to toll-like receptor (TLR) signaling [1]. Differential transcriptional activity in response to TLR signaling tailors macrophage responses to different pathogens [2–8]. See [9] for a review. In spite of recent achievements [10–12], the cost and difficulty of comprehensive experimental identification of transcription factor binding sites (e.g. using ChIP-chip technology [13]) continue to be high. Computational methods can aid these efforts by predicting potential transcription factor-DNA interactions in response to pathogens. The information necessary for comprehensive prediction of transcription factor binding sites (TFBSs) on large numbers of genes is currently dispersed in publications [14–18] and across various databases such as ENSEMBL [19], GenBank [20], TRANSFAC [21], JASPAR [22], cisRED [23], and EPD [24].
While there are other examples of mouse genome annotation databases and websites [25, 26], in this paper, we present a database that is specifically focused on mouse innate immunity genes and their predicted regulatory interactions. The Innate Immune Database (IIDB) contains annotations for over 2000 genes, including over 1600 TLR-responsive genes, and additional genes considered of importance to innate immune responses. We have annotated these genes with data from over 150 microarray experiments, the ENSEMBL database, the database resources of the National Center for Biotechnology Information (NCBI), computationally predicted TFBSs, predicted cis-regulatory modules, evolutionarily conserved regulatory sequences, DNase hypersensitive sites, co-expression clusters, and genome-wide chromatin immunoprecipitation (ChIP-chip) data. For each gene, we have analyzed the sequence from at least 20 kb upstream to at least 10 kb downstream of the predicted transcription end site, including all exons and introns. In addition to regulatory predictions, annotations for exon/intron boundaries, CpG islands, repeats, and Affymetrix Microarray Expression probes are included. Users can interactively interrogate IIDB using a web interface. Search results are mapped to the genome sequence. Standard text-based files are created ('.gff2' and '.gff3') and can be explored visually using the web-enabled Argo genome browser [27].
Construction and Content
IIDB uses the MySQL relational database system to store, retrieve and manage the data. The web interface between the user and IIDB is coded in Perl/CGI. Initially, the database was populated with a set of 1670 genes differentially regulated in response to LPS. We subsequently added a selection of approximately 400 other genes suspected or recognized to be of importance to the innate immune system. Approved (data curator) users can easily upload additional genes for analysis by providing the Entrez GeneID via the web interface. The upload, validation and annotation process is shown in Additional File 1. Uploaded genes are automatically annotated and made available via the web interface, usually within hours. At this time, we limit web-based user submissions to 300 kb, although longer loci can be handled by special request.
Sequence Coordinates and Gene Mapping
To accommodate the needs of our multiple ongoing experimental projects and users, IIDB provides annotations based on both the UCSC mouse genome version mm5 [28] and the ENSEMBL mouse 29e build [19]. The two gene maps differ in the number of predicted genes and other details. We built two distinct gene coordinate maps for IIDB, one based on UCSC data, and the other based on ENSEMBL. Gene loci are labeled by the gene name and corresponding Entrez GeneID. A chromosome locus may contain more than one gene. Therefore IIDB allows multiple labels to be associated with a single locus. Locus coordinates were used to search the ENSEMBL database for repeats, CpG islands, and Affymetrix Microarray Expression ProbeSets (from the Mouse Genome 430 2.0 Array).
Predicted Individual Transcription Factor Binding Sites
We annotated the genomic sequence of all the genes in our database with each of the 360 mouse-specific individual TRANSFAC (Professional edition 8.3, [21]) Position Weight Matrices (PWM) using the MotifLocator algorithm. Scanning was performed on both the positive and negative strands. To assess the statistical significance of MotifLocator scores and to set selection thresholds, we evaluated MotifLocator scores on 200 kb of shuffled sequences from upstream regions of approximately 100 immune related genes. The randomization procedure was repeated six times for a total of 1.2 × 10⁶ random scores per matrix. For each PWM, MotifLocator scores for the true sequence were converted to p-values by comparison to the score distribution for the same PWM on the randomized sequence. Based on this analysis we generated matrix scan datasets with p-values less than 1 × 10⁻³, 5 × 10⁻⁴ and 1 × 10⁻⁴.
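The conversion of scores to p-values described above can be illustrated with a minimal sketch. It assumes an empirical p-value defined as the fraction of background (shuffled-sequence) scores that are at least as high as the observed score; the function names, the simple mononucleotide shuffle, and the tie handling are illustrative assumptions and are not taken from MotifLocator or from the IIDB code.

```python
import bisect
import random


def shuffle_sequence(seq, rng):
    """Return a shuffled copy of seq (assumed mononucleotide shuffle)."""
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)


def empirical_p_values(true_scores, background_scores):
    """For each true score, return the fraction of background scores >= it."""
    background = sorted(background_scores)
    n = len(background)
    p_values = []
    for score in true_scores:
        # bisect_left counts the background scores strictly below `score`
        n_at_least = n - bisect.bisect_left(background, score)
        p_values.append(n_at_least / n)
    return p_values


def filter_hits(scores, background_scores, threshold=1e-4):
    """Keep (index, p-value) pairs for hits with p-value below the threshold."""
    p_values = empirical_p_values(scores, background_scores)
    return [(i, p) for i, p in enumerate(p_values) if p < threshold]
```

With 1.2 × 10⁶ background scores per matrix, the smallest non-zero p-value such a scheme can resolve is about 8 × 10⁻⁷, which is comfortably below the most stringent threshold of 1 × 10⁻⁴.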
Summary Presentation of Binding Sites for Similar Factors
Display of all individual putative TFBSs can produce cluttered visualizations. To provide an alternative representation, avoiding presentation of overlapping hits of identical or similar matrices, we grouped the 360 individual TRANSFAC matrices into 67 matrix families and 76 individual matrices. First, we computationally grouped TRANSFAC matrices whose identifiers correspond to the same transcription factor into a single matrix group. For example, AHR_Q5 and AHR_01 both identify the Ahr transcription factor. Next, we computationally combined our matrix groups with TRANSFAC matrices whose identifiers indicated that they belong to the same transcription factor family. For example, matrices for MYC_Q2, EBOX_Q6, and MYCMAX_02 were combined. Finally, we hand-curated the groups and remaining single matrices to guard against computational false positives or false negatives. Any matrices not assigned to a group by the above procedure were retained as individual matrices.
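The two computational grouping steps can be sketched as follows. The sketch assumes that the factor name is the part of a TRANSFAC matrix identifier before the last underscore, and that family membership is supplied as a lookup table; the `family_of` mapping and its "MYC-family" label are hypothetical, and the manual curation step is not represented.

```python
from collections import defaultdict


def group_by_factor(matrix_ids):
    """Group matrix identifiers such as 'AHR_Q5' by their factor prefix ('AHR')."""
    groups = defaultdict(list)
    for matrix_id in matrix_ids:
        factor = matrix_id.rsplit("_", 1)[0]
        groups[factor].append(matrix_id)
    return dict(groups)


def merge_families(groups, family_of):
    """Merge factor groups that map to the same family label.

    Factors without an entry in family_of keep their own group.
    """
    merged = defaultdict(list)
    for factor, ids in groups.items():
        merged[family_of.get(factor, factor)].extend(ids)
    return dict(merged)


# Hypothetical family table covering the example given in the text.
family_of = {"MYC": "MYC-family", "EBOX": "MYC-family", "MYCMAX": "MYC-family"}
ids = ["AHR_Q5", "AHR_01", "MYC_Q2", "EBOX_Q6", "MYCMAX_02"]
families = merge_families(group_by_factor(ids), family_of)
```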
To remove redundant predictions, overlapping hits from the same PWM group were collapsed into a single predictive hit if the two predicted TFBS overlapped by at least half of the length of the 5' matrix (|start_matrix1 − start_matrix2| ≤ length_matrix1 / 2). If the neighboring matrix did not satisfy this condition, it was marked as a distinct TFBS. The same algorithm was used to combine identical overlapping single matrix hits (see Additional File 2). This methodology significantly reduces the number of predicted TFBS and avoids visualization clutter. For example, combining individual matrices into matrix families reduced the number of predicted transcription factor binding sites from 2500 to 1795 at a p-value < 1 × 10⁻⁴ over the 54 Kbp sequence analyzed for the Interleukin 12b (IL12b) locus.
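A minimal sketch of this collapsing rule is given below. It assumes that hits are processed in 5' to 3' order within each matrix group and that each hit is compared with the most recent hit that was kept for that group; the text does not state which hit serves as the reference once several hits have been merged, so that choice is an assumption, as are the `Hit` record and function name.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    group: str   # matrix family or single-matrix name
    start: int   # start coordinate of the hit
    length: int  # length of the matrix that produced the hit


def collapse_hits(hits):
    """Drop hits that start within half a matrix length of the kept 5' hit."""
    kept = []
    last_kept = {}  # most recent retained hit per group
    for hit in sorted(hits, key=lambda h: (h.group, h.start)):
        anchor = last_kept.get(hit.group)
        if anchor is not None and abs(anchor.start - hit.start) <= anchor.length / 2:
            continue  # redundant with the 5' hit of the same group
        last_kept[hit.group] = hit
        kept.append(hit)
    return kept
```

For the IL12b example in the text, a reduction from 2500 to 1795 hits corresponds to removing about 28% of the predictions.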
Spatial Clusters of TFBS
Transcription factors often bind in close proximity to each other within a cis-regulatory module [25, 26, 30]. We used the COBALT Clustering algorithm CCA (Battail C, Hwang D, Rust A, et al, manuscript in preparation) to identify statistically significant spatial clusters of matrix hits (cluster p-value ≤ 10⁻²). Briefly, the algorithm detects clusters of TFBS hits on a DNA regulatory region previously scanned by a library of matrices (e.g., from TRANSFAC or JASPAR). A "maximum cluster size" parameter limits the sequence length over which a cluster can extend. We chose 500 bp for this parameter based on typical lengths of known cis-regulatory modules in animal genes [30]. A score is assigned to each cluster based on the motif scores that comprise the cluster and the spatial density of motifs. A "maximum motif overlap" parameter sets the maximum percentage of overlap for two motifs to be considered individually. The list of cluster scores is compared to a list of cluster scores generated from a background DNA sequence. This comparison is then used to assign a p-value to each motif cluster according to its significance.
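The general idea of window-limited clustering can be illustrated with a short sketch. This is not the CCA algorithm, whose details are not given here; it simply assumes a window anchored at each hit, a cluster score equal to the summed motif scores multiplied by the hit density, and it omits both the "maximum motif overlap" parameter and the background comparison used to assign p-values. All names and the scoring formula are illustrative assumptions.

```python
def window_clusters(hits, max_size=500, min_hits=2):
    """Find candidate clusters of hits that fit within max_size bp.

    hits: list of (start, end, score) tuples with end > start.
    Returns a list of (cluster_start, cluster_end, cluster_score).
    """
    hits = sorted(hits)
    clusters = []
    for i, (start, _end, _score) in enumerate(hits):
        # collect hits that lie entirely inside the window opened at `start`
        members = [h for h in hits[i:] if h[1] <= start + max_size]
        if len(members) < min_hits:
            continue
        end = max(h[1] for h in members)
        span = end - start
        # assumed score: summed motif scores weighted by hits per bp
        score = sum(h[2] for h in members) * len(members) / span
        clusters.append((start, end, score))
    return clusters
```

Scores computed in the same way on a background sequence could then be turned into cluster p-values by the same empirical comparison used for individual matrix hits.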
Conserved Human/Mouse/Rat/Dog Promoter Sequences
IIDB includes a catalog of over 26000 conserved human/rat/mouse and dog promoter sequences identified by Xie et al. [14]. To map the reported conserved sequences to the promoter region of the genes in our database, we used a simple nucleotide search algorithm accepting only exact matches to the published sequences. Since Xie et al. only investigated human promoters up to 2 Kb upstream of the transcription start site, we disregarded all mouse sequence matches located further than 3 kb upstream and 500 bp downstream of the transcription start site. In addition, we disregarded matches that were less than 100 nucleotides in length. The original list of conserved genomic sequences can be accessed at [31]. As the state of the art and the data evolve, we anticipate periodic updates to annotations based on phylogenetic conservation.
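The exact-match search and positional filter can be sketched as below. The sketch assumes forward-strand sequence with a known transcription start site index, so that negative offsets are upstream; the function names and the way thresholds are passed as parameters are illustrative assumptions rather than the IIDB implementation.

```python
def find_exact_matches(sequence, motif):
    """Yield 0-based start positions of every exact (possibly overlapping) match."""
    pos = sequence.find(motif)
    while pos != -1:
        yield pos
        pos = sequence.find(motif, pos + 1)


def promoter_matches(sequence, tss_index, motifs,
                     upstream=3000, downstream=500, min_length=100):
    """Return (motif, offset from TSS) for matches inside the promoter window.

    Defaults mirror the thresholds quoted in the text; motifs shorter than
    min_length are skipped.
    """
    kept = []
    for motif in motifs:
        if len(motif) < min_length:
            continue
        for pos in find_exact_matches(sequence, motif):
            offset = pos - tss_index  # negative values lie upstream of the TSS
            if -upstream <= offset <= downstream:
                kept.append((motif, offset))
    return kept
```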
Mouse Homologs of Human DNase Hypersensitive Sites
DNase hypersensitive (HS) sites can help identify the location of cis-regulatory regions on DNA [32, 33]. We used the cross-species DNase HS site mapping information as reported by Crawford et al. [33] to create the chromosome coordinates for over 16500 possible mouse DNase HS sites. Briefly, we used the human DNase coordinate information, flanked by an additional 10 bp 5' and 3', to define a human HS sequence. This sequence was used as the input to the ENSEMBL Compara36 database to identify the coordinates of the matching mouse sequences. Only the top 12 mouse hits per human DNase HS sequence slice are used to populate our database. The original map of human DNase HS can be found at [34].
To demonstrate the ease of integrating additional data into IIDB, we have included data from a genome-wide chromatin immunoprecipitation assay, employing a custom Affymetrix oligonucleotide array [35]. This array contains densely-tiled 25-mer oligonucleotide sequences designed to interrogate almost all of the C57/BL6 mouse macrophage genes in IIDB. The raw data were processed with quantile normalization [36, 37], then filtered with a sliding window median filter to identify putative binding sites . IIDB includes both the tiling probe locations on all genes, and also the locations of statistically significant binding hits.
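The two processing steps named above, quantile normalization and a sliding window median filter, can be illustrated with a small pure-Python sketch. It assumes equal-length arrays of probe intensities and a window measured in neighboring probes rather than base pairs; the window size, tie handling and function names are assumptions and do not reproduce the exact settings used for the ATF3 data.

```python
from statistics import median


def quantile_normalize(arrays):
    """Give every array the same distribution: the mean of the sorted arrays."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [sum(s[r] for s in sorted_arrays) / len(arrays) for r in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


def sliding_median(values, half_window=1):
    """Replace each value by the median of its neighborhood along the tiling."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed
```

The median filter suppresses isolated single-probe spikes while preserving runs of consecutive enriched probes, which is the pattern expected at a genuine binding site on a densely tiled array.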
Utility and Discussion
Users can filter all TFBS predictions at any of three p-values: 1 × 10⁻³, 5 × 10⁻⁴ and 1 × 10⁻⁴. The more stringent p-values greatly reduce the occurrence of background (non-significant) matrix hits but may miss some true binding sites. Taking the IL12b gene as an example, we analyzed 54 kbp of sequence. The above methodology reduced the number of predicted TFBS from 15810 hits at a p-value threshold of 10⁻³ to 1795 at a p-value threshold of 10⁻⁴. We selected 10⁻⁴ as the most stringent p-value threshold because filtering at this level still allows exact matches to 67 out of 80 (84%) of known (TRANSFAC validated) TFBSs in our data set.
The results of the user's selections are temporarily stored in '.gff2' and '.gff3' file formats on our server and can be downloaded, saved to the user's computer and exported to other applications. Alternatively, the user can click on a link on the IIDB web page to invoke the Java Web Start enabled genome browser Argo (version 1.0.21, [27]). This facility relieves the user of the need to download or install any software.
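For readers who wish to post-process such exports, the sketch below writes TFBS predictions as GFF3 records, which consist of nine tab-separated columns (seqid, source, type, start, end, score, strand, phase, attributes). The values chosen for the source and type columns and the use of a `Name` attribute are assumptions for illustration; the actual content of IIDB's exported files may differ.

```python
def gff3_line(seqid, start, end, score, strand, name,
              source="IIDB", feature_type="TF_binding_site"):
    """Format one predicted binding site as a nine-column GFF3 record."""
    fields = [
        seqid,
        source,          # assumed label for the producing resource
        feature_type,    # assumed feature type for predicted sites
        str(start),      # GFF3 coordinates are 1-based and inclusive
        str(end),
        str(score),
        strand,          # '+', '-' or '.'
        ".",             # phase applies only to CDS features
        f"Name={name}",
    ]
    return "\t".join(fields)


def write_gff3(path, records):
    """Write records (tuples of gff3_line arguments) with the GFF3 header."""
    with open(path, "w") as handle:
        handle.write("##gff-version 3\n")
        for record in records:
            handle.write(gff3_line(*record) + "\n")
```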
For greater stringency of results, IIDB allows the user to filter transcription factor binding site predictions in several different ways as listed below:
Single Gene Analysis
All DNA sequences in our database have been annotated for exon/intron boundaries, repeats, CpG islands and Affymetrix probes. All sequences were scanned using the 360 TRANSFAC mouse-specific PWMs (Professional edition 8.3, [21]). Each gene in IIDB is marked with a set of character symbols indicating the availability of different kinds of data for that gene.
Macrophage TLR stimuli used in the experiments underlying IIDB
Unmethylated CpG motif (cytosine and guanine separated by a phosphate) bacterial DNA (TLR9-specific stimulant)
Lipopolysaccharide (component of the cell membrane of Gram-negative bacteria), TLR4-specific stimulant
Synthetic diacylated lipoprotein, TLR2/6 stimulant
Synthetic triacylated lipopeptide, TLR 2/1 stimulant
Polyriboinosinic polyribocytidylic acid (TLR3-specific stimulant)
Synthetic imidazoquinoline resiquimod, TLR 7, 8 stimulant
Search for TFBS
Search a Set of Genes for Shared TFBS
Members of a regulatory complex will often have tight spatial constraints on the relative locations of their TFBSs. This expectation can be exploited to impose a stringent statistical filter on predicted TFBSs. IIDB users can search a group of potentially co-regulated genes for common transcription factor binding sites within a given distance from each other. The user can choose the genes, set the window size, and specify the length of the regulatory region to be analyzed.
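One way to picture this kind of search is the sketch below, which reports pairs of factors whose predicted sites lie within a chosen window of each other in every gene of a set. It is an illustration under stated assumptions, not the IIDB query logic: hits are given as (factor, position) pairs, only pairs of distinct factors are considered, and the gene and factor names in any example input are hypothetical.

```python
from itertools import combinations


def nearby_pairs(hits, window):
    """Return the set of factor pairs with sites at most `window` bp apart.

    hits: list of (factor, position) for one gene's regulatory region.
    """
    pairs = set()
    for (f1, p1), (f2, p2) in combinations(hits, 2):
        if f1 != f2 and abs(p1 - p2) <= window:
            pairs.add(frozenset((f1, f2)))
    return pairs


def shared_pairs(hits_by_gene, window):
    """Return the factor pairs found close together in every gene."""
    pair_sets = [nearby_pairs(hits, window) for hits in hits_by_gene.values()]
    return set.intersection(*pair_sets) if pair_sets else set()
```

Requiring the same spatial arrangement in several co-expressed genes is what makes this filter stringent: a pair of chance matrix hits is unlikely to recur within the same window across the whole gene set.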
Find Spatially Clustered TFBS on a Sequence
Mammalian cis-regulatory modules are thought to be typically around 500 bp in length and to contain on the order of a dozen or more TFBSs [18, 30, 38]. This option (available as a tick box at the bottom of each search page) lets the user identify potential cis-regulatory regions on gene sequences by searching for statistically unlikely spatial clusters of TFBSs (as compared to TFBS patterns on shuffled sequences) for a range of window sizes.
Select Genes by GO Annotation
The user can select a set of IIDB annotated genes based on their common GO annotation [39, 40]. The resulting set can then be searched for predicted TFBS or to find a subset of genes with shared TFBS hits.
Explore ChIP-chip Data
As an example of additional data integration in IIDB, and to allow evaluation of the accuracy of our TFBS predictions, IIDB includes ChIP-chip data for the ATF3 transcription factor [35]. We plan to include additional ChIP-chip data from other transcription factors as they become available.
Argo Genome Browser Display
An extensive help menu is available, as indicated by the '?' symbol next to each link in the navigation bar at the top of each web page. We also provide step-by-step examples of how to perform single and multi-gene analyses using IIDB via on-line help web pages (under the link 'How To Use IIDB') and through a downloadable tutorial (IIDB Tutorial link).
Comparison of IIDB TFBS predictions with ChIP-chip data. Data are presented based on the genome annotations available from both NCBI and ENSEMBL. Note that the annotations differ in the number of predicted genes.
Columns: Using NCBI coordinates; Using ENSEMBL coordinates.
Rows: Promoter region mapped; Number of genes; Unique ATF3 ChIP-chip hits; Conserved promoter regions containing ATF3 TFBS; Percentile threshold; ATF3-group matrices hits; ATF3-group matrices within a ChIP-chip segment; % overlap between ChIP-chip data and predictions.
Unlike other TFBS data and prediction repositories, IIDB is deliberately specific: it contains data for a single cell type (macrophages) in a single strain (C57/BL6) of a single species (Mus musculus). We hope that this specificity will prove useful for the immunology community. IIDB is also structured so that the data it contains will be generally useful to the broader immunology community.
We plan to regularly update the transcription factor binding site data and related statistics within IIDB. We are in the process of building an extended version of IIDB which will include TRANSFAC matrix scans for the entire mouse genome. We also plan to add further ChIP-chip data for various transcription factors we are currently analyzing. We are committed to regularly updating our gene coordinate system with new mouse genome builds. An email-based feedback and help link is provided on the IIDB homepage, and users are encouraged to provide suggestions to continue to refine the utility of IIDB. In this way, we hope IIDB will continue to grow, both in content, and also in its usefulness to the immunology community.
The current consensus view is that transcription factor binding site prediction based on PWM sequence scans alone is not sufficiently predictive for most systems biology projects. PWM scans generate very high numbers of false positives and numerous overlapping hits. We have used several TFBS prediction algorithms, multi-species conservation information, data on DNase HS sites, and searches based on TFBS meta patterns to reduce the number of hits and increase the predictive power of TFBS predictions. On the basis of available ChIP-chip data, TFBS predictions available via IIDB appear to have a good chance of being confirmed experimentally. We therefore believe IIDB will make a useful contribution to the immunology research community.
Our database currently includes predicted binding sites on the promoters of over 2000 mouse macrophage and immune-specific genes. Results from IIDB analyses of new genes, or new analyses of existing IIDB genes, can be automatically integrated into IIDB following curation. IIDB will grow with time and usage. We have customized a web-based genome browser to simultaneously display multiple genes with multiple annotations and TFBS predictions. Thus, IIDB can be used by researchers without specific computational expertise to develop novel gene regulatory hypotheses.
Availability and requirements
Project name: Innate Immune Database (IIDB).
Project home page: http://db.systemsbiology.net/IIDB
Operating system(s): Platform independent.
Programming languages: Perl/CGI, Java, MySQL.
Licence: free open-access to database via web-interface.
Restrictions to use by non-academics: none.
- bp :
DNA sequence base pairs (similarly, Kbp stands for kilo base pairs, and Kb for kilo bases),
- ChIP-chip :
Chromatin Immunoprecipitation followed by microarray-based (chip) global identification of ChIP fragments,
- CCA :
Cobalt Clustering Algorithm (developed by C. Battail, A. Rust and H. Bolouri) to identify statistically significant spatial clusters of TFBS,
- IIDB :
Innate Immune Database,
- HS :
DNase1 Hypersensitive Site,
- PWM :
Position Weight Matrix for a transcription factor,
- TF :
Transcription Factor,
- TFBS :
Transcription Factor Binding Site on DNA,
- TLR :
Toll-like receptor.
We thank R. Engels at the Broad Institute for his assistance with Argo customization and E. Deutsch at the Institute for Systems Biology for technical advice. This research was supported in part by NIAID grants U54AI057160, 5K08AI056092, and NIGMS grant PM50 GMO76757.
- Aderem A, Ulevitch RJ: Toll-like receptors in the induction of the innate immune response. Nature. 2000, 406: 782-787. 10.1038/35021228.
- Oda K, Kitano H: A comprehensive map of the toll-like receptor signaling network. Mol Syst Biol. 2006, 2: 2006.0015. 10.1038/msb4100057.
- Nau GJ: Human macrophage activation programs induced by bacterial pathogens. Proc Natl Acad Sci USA. 2002, 99: 1503-1508. 10.1073/pnas.022649799.
- Nau GJ, Schlesinger A, Richmond JF, Young RA: Cumulative Toll-like receptor activation in human macrophages treated with whole bacteria. J Immunol. 2003, 170 (10): 5203-5209.
- Rodriguez NE, Chang HK, Wilson ME: Novel program of macrophage gene expression induced by phagocytosis of Leishmania chagasi. Infect Immun. 2004, 72: 2111-2122. 10.1128/IAI.72.4.2111-2122.2004.
- Detweiler CS, Cunanan DB, Falkow S: Host microarray analysis reveals a role for the Salmonella response regulator phoP in human macrophage cell death. Proc Natl Acad Sci USA. 2001, 98: 5850-5855. 10.1073/pnas.091110098.
- Boldrick JC: Stereotyped and specific gene expression programs in human innate immune responses to bacteria. Proc Natl Acad Sci USA. 2002, 99: 972-977. 10.1073/pnas.231625398.
- Buer J, Balling R: Mice, microbes and models of infection. Nat Rev Genet. 2003, 4: 195-205. 10.1038/nrg1019.
- Jenner RG, Young RA: Insights into Host responses against pathogens from transcriptional profiling. Nat Rev Microbiol. 2005, 3 (4): 281-94. 10.1038/nrmicro1126.
- Natarajan M, Sternweis PC, Lin KM, Hsueh RC, The Alliance for Cellular Signaling Laboratories, Ranganathan R: A global analysis of cross-talk in a mammalian cellular signalling network. Nature Cell Biology. 2006, 8: 571-580. 10.1038/ncb1418.
- Lee TI, Johnstone S, Young RA: Chromatin Immunoprecipitation and Microarray-Based Analysis of Protein Location. Nature Protocols. 2006, 1: 729-748. 10.1038/nprot.2006.98.
- Nilsson R, Bajic VB, Suzuki H, di Bernardo D, Bjorkegren J, Katayama S, Reid JF, Sweet MJ, Gariboldi M, Carninci P, Hayashizaki Y, Hume DA, Tegner J, Ravasi T: Transcriptional network dynamics in macrophage activation. Genomics. 2006, 88 (2): 133-42. 10.1016/j.ygeno.2006.03.022.
- Horak CE, Snyder M: ChIP-chip: a genomic approach for identifying transcription factor binding sites. Methods Enzymol. 2002, 350: 469-83.
- Xie X, Lu J, Kulbokas EJ, Golub TR, Mootha V, Lindblad-Toh K, Lander ES, Kellis M: Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals. Nature. 2005, 434 (7031): 338-45. 10.1038/nature03441.
- Lenhard B, Sandelin A, Mendoza L, Engstrom P, Jareborg N, Wasserman WW: Identification of conserved regulatory elements by comparative genome analysis. J Biol. 2003, 2 (2): 13. 10.1186/1475-4924-2-13.
- Blanchette M, Tompa M: Discovery of regulatory elements by a computational method for phylogenetic footprinting. Genome Res. 2002, 12 (5): 739-48. 10.1101/gr.6902.
- Solovyev VV, Shahmuradov IA: PromH: Promoters identification using orthologous genomic sequences. Nucleic Acids Res. 2003, 31 (13): 3540-5. 10.1093/nar/gkg525.
- Wagner A: Genes regulated cooperatively by one or more transcription factors and their identification in whole eukaryotic genomes. Bioinformatics. 1999, 15 (10): 776-84. 10.1093/bioinformatics/15.10.776.
- Curwen V, Eyras E, Andrews DT, Clarke L, Mongin E, Searle S, Clamp M: The ENSEMBL automatic gene annotation system. Genome Res. 2004, 14: 934-941. 10.1101/gr.1858004.
- Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL: GenBank. Nucleic Acids Res. 2006, 34 (Database issue): D16-20. 10.1093/nar/gkj157.
- Wingender E, Dietze P, Karas H, Knuppel R: TRANSFAC: a database on transcription factors and their DNA binding sites. Nucleic Acids Res. 1996, 24 (1): 238-241. 10.1093/nar/24.1.238.
- Sandelin A, Alkema W, Engstrom P, Wasserman WW, Lenhard B: JASPAR: an open-access database for eukaryotic transcription factor binding profiles. Nucleic Acids Res. 2004, 32 (Database issue): D91-4. 10.1093/nar/gkh012. [http://jaspar.genereg.net]
- Robertson G, Bilenky M, Lin K, He A, Yuen W, Dagpinar M, Varhol R, Teague K, Griffith OL, Zhang X, Pan Y, Hassel M, Sleumer MC, Pan W, Pleasance ED, Chuang M, Hao H, Li YY, Robertson N, Fjell C, Li B, Montgomery SB, Astakhova T, Zhou J, Sander J, Siddiqui AS, Jones SJ: cisRED: a database system for genome-scale computational discovery of regulatory elements. Nucleic Acids Res. 2006, 34 (Database issue): D68-73. 10.1093/nar/gkj075.
- Cavin PR, Junier T, Bucher P: The Eukaryotic Promoter Database EPD. Nucleic Acids Res. 1998, 26 (1): 353-7. 10.1093/nar/26.1.353.
- Blanchette M, Bataille AR, Chen X, Poitras C, Laganiere J, Lefebvre C, Deblois G, Giguere V, Ferretti V, Bergeron D, Coulombe B, Robert F: Genome-wide computational prediction of transcriptional regulatory modules reveals new insights into human gene expression. Genome Res. 2006, 16 (5): 656-68. 10.1101/gr.4866006.
- Sharov AA, Dudekula DB, Ko MS: CisView: a browser and database of cis-regulatory modules predicted in the mouse genome. DNA Res. 2006, 13 (3): 123-34. 10.1093/dnares/dsl005.
- Argo Genome Browser. [https://www.broad.harvard.edu/annotation/argo/]
- Kent WJ, Sugnet CW, Furey TS, Roskin K, Pringle TH, Zahler AM, Haussler D: The human genome browser at UCSC. Genome Res. 2002, 12: 996-1006. 10.1101/gr.229102. Article published online before print in May 2002.
- Motif Locator software. [http://www.esat.kuleuven.be/~thijs/download.html]
- Davidson EH: The Regulatory Genome: Gene Regulatory Networks In Development And Evolution. 2006, Academic Press, San Diego
- Human/mouse/rat/dog evolutionary conserved regulatory motifs from Reference 14. [http://www.broad.mit.edu/seq/HumanMotifs/]
- Lee MS, Garrard WT: Transcription-induced nucleosome 'splitting': an underlying structure for DNase I sensitive chromatin. EMBO J. 1991, 10 (3): 607-15.
- Crawford GE, Holt IE, Mullikin JC, Tai D, Blakesley R, Bouffard G, Young A, Masiello C, Green ED, Wolfsberg TG, Collins FS, National Institutes Of Health Intramural Sequencing Center: Identifying gene regulatory elements by genome-wide recovery of DNase hypersensitive sites. Proc Natl Acad Sci USA. 2004, 101 (4): 992-7. 10.1073/pnas.0307540100.
- DNase hypersensitive sites database. [http://research.nhgri.nih.gov/DNaseHS/May2005/]
- Gilchrist M, Thorsson V, Li B, Rust AG, Korb M, Kennedy K, Hai T, Bolouri H, Aderem A: Systems biology approaches identify ATF3 as a negative regulator of Toll-like receptor 4. Nature. 2006, 441 (7090): 173-8. 10.1038/nature04768.
- Cawley S, Bekiranov S, Ng HH, Kapranov P, Sekinger EA, Kampa D, Piccolboni A, Sementchenko V, Cheng J, Williams AJ, Wheeler R, Wong B, Drenkow J, Yamanaka M, Patel S, Brubaker S, Tammana H, Helt G, Struhl K, Gingeras TR: Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs. Cell. 2004, 116 (4): 499-509. 10.1016/S0092-8674(04)00127-8.
- Kim TH, Barrera LO, Zheng M, Qu C, Singer MA, Richmond TA, Wu Y, Green RD, Ren B: A high-resolution map of active promoters in the human genome. Nature. 2005, 436 (7052): 876-80. 10.1038/nature03877.
- Bailey TL, Noble WS: Searching for statistically significant regulatory modules. Bioinformatics. 2003, 19 (Suppl 2): II16-II25.
- Gene Ontology Consortium: The Gene Ontology (GO) project in 2006. Nucleic Acids Res. 2006, 34 (Database issue): D322-6. 10.1093/nar/gkj021.
- The Gene Ontology Consortium, Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G: Gene ontology: tool for the unification of biology. Nat Genet. 2000, 25 (1): 25-9. 10.1038/75556.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.