A Computational Framework for Identifying Promoter Sequences in Nonmodel Organisms Using RNA-seq Data Sets

Bibliographic Details
Published in ACS Synthetic Biology Vol. 10; no. 6; pp. 1394-1405
Main Authors Wilson, Erin H, Groom, Joseph D, Sarfatis, M. Claire, Ford, Stephanie M, Lidstrom, Mary E, Beck, David A. C
Format Journal Article
Language English
Published United States American Chemical Society 18.06.2021
Summary: Engineering microorganisms into biological factories that convert renewable feedstocks into valuable materials is a major goal of synthetic biology; however, for many nonmodel organisms, we do not yet have the genetic tools, such as suites of strong promoters, necessary to effectively engineer them. In this work, we developed a computational framework that can leverage standard RNA-seq data sets to identify sets of constitutive, strongly expressed genes and predict strong promoter signals within their upstream regions. The framework was applied to a diverse collection of RNA-seq data measured for the methanotroph Methylotuvimicrobium buryatense 5GB1 and identified 25 genes that were constitutively, strongly expressed across 12 experimental conditions. For each gene, the framework predicted short (27–30 nucleotide) sequences as candidate promoters and derived −35 and −10 consensus promoter motifs (TTGACA and TATAAT, respectively) for strong expression in M. buryatense. This consensus closely matches the canonical E. coli sigma-70 motif and was found to be enriched in promoter regions of the genome. A subset of promoter predictions was experimentally validated in a XylE reporter assay, including the consensus promoter, which showed high expression. The pmoC, pqqA, and ssrA promoter predictions were additionally screened in an experiment that scrambled the −35 and −10 signal sequences, confirming that transcription initiation was disrupted when these specific regions of the predicted sequence were altered. These results indicate that the computational framework can make biologically meaningful promoter predictions and identify key pieces of regulatory systems that can serve as foundational tools for engineering diverse microorganisms for biomolecule production.
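Illustrative sketch (not the authors' implementation): the two steps named in the summary, selecting constitutive, strongly expressed genes and locating a −35/−10 signal in an upstream region, could be approximated as below. Everything specific here is an assumption made for illustration: the function names, the TPM and coefficient-of-variation thresholds, the 15 to 18 nucleotide spacer range (chosen so that hexamer + spacer + hexamer spans 27 to 30 nucleotides), and the use of simple consensus matching in place of whatever motif-finding method the paper actually uses.

```python
from statistics import mean, pstdev

# Consensus hexamers reported in the summary.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def select_constitutive_strong(tpm_by_gene, min_mean_tpm=100.0, max_cv=0.5):
    """Return genes with high mean expression and low variation across conditions.

    tpm_by_gene maps a gene name to a list of expression values, one per
    condition. Both thresholds are hypothetical defaults, not values from
    the paper.
    """
    picked = []
    for gene, values in tpm_by_gene.items():
        mu = mean(values)
        if mu <= 0:
            continue
        cv = pstdev(values) / mu  # coefficient of variation
        if mu >= min_mean_tpm and cv <= max_cv:
            picked.append(gene)
    # Strongest genes first.
    return sorted(picked, key=lambda g: -mean(tpm_by_gene[g]))


def count_matches(a, b):
    """Number of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))


def best_promoter_window(upstream, spacers=range(15, 19)):
    """Find the window that best matches -35 hexamer + spacer + -10 hexamer.

    Returns (score, window), where score is the number of consensus
    positions matched (maximum 12), or None if the sequence is too short.
    The spacer range is an assumption.
    """
    best = None
    for start in range(len(upstream)):
        for spacer in spacers:
            m10_start = start + 6 + spacer
            end = m10_start + 6
            if end > len(upstream):
                continue
            score = (count_matches(upstream[start:start + 6], MINUS_35)
                     + count_matches(upstream[m10_start:end], MINUS_10))
            if best is None or score > best[0]:
                best = (score, upstream[start:end])
    return best
```

In use, the gene filter would run over a table of per-condition expression values, and the window search would then run over the upstream sequence of each selected gene. A real analysis would normally score candidates against a position weight matrix learned from the data rather than a fixed consensus.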
ISSN: 2161-5063
DOI: 10.1021/acssynbio.1c00017