Discovery of novel transcription factor binding sites by statistical overrepresentation

Bibliographic Details
Published in Nucleic acids research Vol. 30; no. 24; pp. 5549-5560
Main Authors Sinha, Saurabh; Tompa, Martin
Format Journal Article
Language English
Published England Oxford University Press 15.12.2002
Summary: Understanding the complex and varied mechanisms that regulate gene expression is an important and challenging problem. A fundamental sub‐problem is to identify DNA binding sites for unknown regulatory factors, given a collection of genes believed to be co‐regulated. We discuss a computational method that identifies good candidates for such binding sites. Unlike local search techniques such as expectation maximization and Gibbs samplers, which may not reach a global optimum, the method discussed enumerates all motifs in the search space and is guaranteed to produce the motifs with the greatest z‐scores. We discuss the results of validation experiments in which this algorithm was used to identify candidate binding sites in several well‐studied regulons of Saccharomyces cerevisiae, where the most prominent transcription factor binding sites are largely known. We then discuss the results on gene families in the functional and mutant phenotype catalogs of S. cerevisiae, where the algorithm suggests many promising novel transcription factor binding sites. The program is available at http://bio.cs.washington.edu/software.html.
Bibliography:local:gkf669
Received July 26, 2002; Accepted September 17, 2002
istex:87CBA895CA513A5C8732C454895D7D816701A227
ark:/67375/HXZ-X71F9H28-G
To whom correspondence should be addressed. Tel: +1 206 543 9263; Fax: +1 206 543 8331; Email: tompa@cs.washington.edu
ISSN:0305-1048
1362-4962
DOI:10.1093/nar/gkf669
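
Illustrative sketch: the Summary describes enumerating every motif in the search space and ranking motifs by z-score, rather than relying on a local search. The Python below is a minimal sketch of that general idea only, not the authors' program. It makes several assumptions the record does not state: motifs are exact k-mers over ACGT, the background treats bases as independent with given frequencies, and the standard deviation uses a binomial approximation that ignores correlations between overlapping occurrences. The names count_occurrences, zscores, and top_motifs are hypothetical.

```python
from itertools import product
from math import sqrt


def count_occurrences(motif, sequences):
    """Count occurrences of motif (overlaps allowed) across all sequences."""
    k = len(motif)
    return sum(
        1
        for seq in sequences
        for i in range(len(seq) - k + 1)
        if seq[i:i + k] == motif
    )


def zscores(sequences, k, base_freqs):
    """Return {motif: z-score} for every k-mer over ACGT.

    Assumption: independent-base background and binomial variance,
    a simplification chosen for illustration only.
    """
    # Number of positions at which a k-mer could start.
    positions = sum(max(len(seq) - k + 1, 0) for seq in sequences)
    scores = {}
    for letters in product("ACGT", repeat=k):  # exhaustive enumeration
        motif = "".join(letters)
        p = 1.0
        for base in motif:
            p *= base_freqs[base]
        expected = positions * p
        sd = sqrt(positions * p * (1.0 - p))
        if sd == 0.0:
            continue  # motif impossible (or certain) under the background
        observed = count_occurrences(motif, sequences)
        scores[motif] = (observed - expected) / sd
    return scores


def top_motifs(sequences, k, base_freqs, n=5):
    """Return the n motifs with the greatest z-scores, best first."""
    scores = zscores(sequences, k, base_freqs)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Toy input (made up for illustration, not data from the article).
toy_sequences = ["ACGTACGTAC", "TTACGTTT", "GGACGTGG"]
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
best = top_motifs(toy_sequences, 4, uniform, n=3)
```

Because every k-mer is scored, the motif with the greatest z-score under this simplified model cannot be missed, which is the contrast with local search that the Summary draws. The cost grows as 4^k, so this sketch is only practical for short motifs.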