Predicting substrate specificity of adenylation domains of nonribosomal peptide synthetases and other protein properties by latent semantic indexing

Bibliographic Details
Published in	Journal of industrial microbiology & biotechnology Vol. 41; no. 2; pp. 461 - 467
Main Authors	Baranašić, Damir; Zucko, Jurica; Diminic, Janko; Gacesa, Ranko; Long, Paul F; Cullum, John; Hranueli, Daslav; Starcevic, Antonio
Format	Journal Article
Language	English
Published	Berlin/Heidelberg: Springer-Verlag; Oxford University Press; Springer Berlin Heidelberg, 01.02.2014
Subjects	amino acid sequences; Amino Acids; Amino Acids - chemistry; Analysis; Biochemistry; Bioinformatics; Biomedical and Life Sciences; Biosynthesis; Biotechnology; Catalytic Domain; chemistry; classification; Decomposition; Documents; Genetic Engineering; Genomes; Genomics; Inorganic Chemistry; Kinases; Life Sciences; ligases; Linear algebra; metabolism; Metabolites; methods; Microbiology; Original Article; Peptide Synthases; Peptide Synthases - chemistry; Peptide Synthases - classification; Peptide Synthases - metabolism; Peptides; prediction; Proteins; Semantics; Sequence Alignment; Sequence Analysis, Protein; Sequence Analysis, Protein - methods; Studies; Substrate Specificity; Substrates; Adenylation domains; Functional subtype; LSI; NRPS; Protein tokenization
Abstract	Successful genome mining is dependent on accurate prediction of protein function from sequence. This often involves dividing protein families into functional subtypes (e.g., with different substrates). In many cases, there are only a small number of known functional subtypes, but in the case of the adenylation domains of nonribosomal peptide synthetases (NRPS), there are >500 known substrates. Latent semantic indexing (LSI) was originally developed for text processing but has also been used to assign proteins to families. Proteins are treated as "documents", and properties of the amino acid sequence must be encoded as "terms" in order to construct a term-document matrix, which counts the terms in each document. This matrix is then processed to produce a document-concept matrix, where each protein is represented as a row vector. A standard measure of the closeness of vectors to each other (the cosine of the angle between them) provides a measure of protein similarity. Previous work encoded proteins as oligopeptide terms, i.e., counted oligopeptides, but used no information regarding the location of oligopeptides in the proteins. A novel tokenization method was developed to analyze information from multiple alignments. LSI successfully distinguished between two functional subtypes in five well-characterized families. Visualization of different "concept" dimensions allows exploration of the structure of protein families. LSI was also used to predict the amino acid substrate of adenylation domains of NRPS. Better results were obtained when selected residues from multiple alignments were used rather than the total sequence of the adenylation domains. Using ten residues from the substrate binding pocket performed better than using 34 residues within 8 Å of the active site. Prediction efficiency was somewhat better than that of the best published method using a support vector machine.
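The pipeline outlined in the abstract (term-document matrix, reduction to a document-concept matrix, cosine similarity between protein vectors) can be illustrated with a minimal sketch. It is not the authors' implementation. The tokenization used here (alignment column index joined to the residue letter) is an assumption chosen only to show how positional information from an alignment could enter the terms; the record does not describe the actual scheme. The sequences, the labels "X" and "Y", the rank k=2, the nearest-neighbor assignment, and the use of NumPy's singular value decomposition are likewise illustrative assumptions.

```python
import numpy as np

# Hypothetical toy data: made-up 10-residue strings standing in for selected
# alignment columns, with made-up substrate labels.
train = ["ACDEFGHIKL", "ACDEFGHIKM", "WYVTSRQPNM", "WYVTSRQPNL"]
labels = ["X", "X", "Y", "Y"]
query = "ACDEFGHIRL"


def tokenize(aligned):
    """Assumed tokenization: alignment column index + residue, gaps skipped."""
    return [f"{i}{aa}" for i, aa in enumerate(aligned) if aa != "-"]


def term_document_matrix(seqs):
    """Count matrix with one row per term and one column per protein."""
    docs = [tokenize(s) for s in seqs]
    vocab = sorted({t for d in docs for t in d})
    row = {t: r for r, t in enumerate(vocab)}
    tdm = np.zeros((len(vocab), len(docs)))
    for col, doc in enumerate(docs):
        for term in doc:
            tdm[row[term], col] += 1
    return tdm


def document_concept_matrix(tdm, k):
    """Truncated SVD; row i is the k-dimensional vector for protein i."""
    _, s, vt = np.linalg.svd(tdm, full_matrices=False)
    return vt[:k].T * s[:k]


def cosine(a, b):
    """Cosine of the angle between two concept-space vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# The query is appended as the last document so that it shares the same
# concept space as the labeled proteins.
tdm = term_document_matrix(train + [query])
vecs = document_concept_matrix(tdm, k=2)
sims = [cosine(v, vecs[-1]) for v in vecs[:-1]]
prediction = labels[int(np.argmax(sims))]
print(prediction)
```

In this sketch the query is assigned the label of its most similar labeled protein. A real analysis would need a held-out evaluation and a principled choice of k, neither of which is specified in this record.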
BackLink	https://www.ncbi.nlm.nih.gov/pubmed/24104398 (MEDLINE/PubMed record)
Cites_doi	10.1371/journal.pcbi.1000069 10.1186/1471-2105-10-421 10.1016/S1074-5521(99)80082-9 10.1016/S1074-5521(00)00091-0 10.1093/nar/gki885 10.1016/j.sbi.2010.01.009 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9 10.1006/jmbi.2000.4036 10.1093/nar/gkr323 10.1007/s10295-013-1252-z 10.1186/1471-2105-10-335 10.1093/bioinformatics/btm404 10.1093/nar/gkn685
ContentType	Journal Article
Copyright	Society for Industrial Microbiology 2014; Society for Industrial Microbiology and Biotechnology 2013; Society for Industrial Microbiology and Biotechnology 2014
DOI	10.1007/s10295-013-1322-2
Discipline	Engineering Biology Chemistry
DocumentTitleAlternate	Special Issue: Microbial Genome Mining
EISSN	1476-5535
EndPage	467
Genre	Research Support, Non-U.S. Gov't; Journal Article
ISSN	1367-5435; 1476-5535
IsDoiOpenAccess	false
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	2
Keywords	Adenylation domains; Functional subtype; LSI; NRPS; Protein tokenization
Language	English
License	This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
OpenAccessLink	https://academic.oup.com/jimb/article-pdf/41/2/461/36813158/jimb0461.pdf
PMID	24104398
PageCount	7
PublicationCentury	2000
PublicationDate	2014-02-01
PublicationDecade	2010
PublicationPlace	Berlin/Heidelberg; Germany; Fairfax
PublicationSubtitle	Official Journal of the Society for Industrial Microbiology and Biotechnology
PublicationTitle	Journal of industrial microbiology & biotechnology
PublicationTitleAbbrev	J Ind Microbiol Biotechnol
PublicationYear	2014
Publisher	Springer-Verlag; Oxford University Press; Springer Berlin Heidelberg
Snippet	Issue Title: Special Issue: Microbial Genome Mining. Successful genome mining is dependent on accurate prediction of protein function from sequence. This often involves dividing protein families into functional...
StartPage	461
SubjectTerms	amino acid sequences; Amino Acids; Amino Acids - chemistry; Analysis; Biochemistry; Bioinformatics; Biomedical and Life Sciences; Biosynthesis; Biotechnology; Catalytic Domain; chemistry; classification; Decomposition; Documents; Genetic Engineering; Genomes; Genomics; Inorganic Chemistry; Kinases; Life Sciences; ligases; Linear algebra; metabolism; Metabolites; methods; Microbiology; Original Article; Peptide Synthases; Peptide Synthases - chemistry; Peptide Synthases - classification; Peptide Synthases - metabolism; Peptides; prediction; Proteins; Semantics; Sequence Alignment; Sequence Analysis, Protein; Sequence Analysis, Protein - methods; Studies; Substrate Specificity; Substrates
Title	Predicting substrate specificity of adenylation domains of nonribosomal peptide synthetases and other protein properties by latent semantic indexing
URI	https://link.springer.com/article/10.1007/s10295-013-1322-2 https://www.ncbi.nlm.nih.gov/pubmed/24104398 https://www.proquest.com/docview/1490632319 https://www.proquest.com/docview/1491061447 https://www.proquest.com/docview/1496898360 https://www.proquest.com/docview/1524151854
Volume	41