Mining the Structural Genomics Pipeline: Identification of Protein Properties that Affect High-throughput Experimental Analysis

Bibliographic Details
Published in Journal of Molecular Biology Vol. 336; no. 1; pp. 115-130
Main Authors Goh, Chern-Sing; Lan, Ning; Douglas, Shawn M; Wu, Baolin; Echols, Nathaniel; Smith, Andrew; Milburn, Duncan; Montelione, Gaetano T; Zhao, Hongyu; Gerstein, Mark
Format Journal Article
Language English
Published England Elsevier Ltd 06.02.2004
Abstract Structural genomics projects represent major undertakings that will change our understanding of proteins. They generate unique datasets that, for the first time, present a standardized view of proteins in terms of their physical and chemical properties. By analyzing these datasets here, we are able to discover correlations between a protein's characteristics and its progress through each stage of the structural genomics pipeline, from cloning, expression, purification, and ultimately to structural determination. First, we use tree-based analyses (decision trees and random forest algorithms) to discover the most significant protein features that influence a protein's amenability to high-throughput experimentation. Based on this, we identify potential bottlenecks in various stages of the structural genomics process through specialized “pipeline schematics”. We find that the properties of a protein that are most significant are: (i) whether it is conserved across many organisms; (ii) the percentage composition of charged residues; (iii) the occurrence of hydrophobic patches; (iv) the number of binding partners it has; and (v) its length. Conversely, a number of other properties that might have been thought to be important, such as nuclear localization signals, are not significant. Thus, using our tree-based analyses, we are able to identify combinations of features that best differentiate the small group of proteins for which a structure has been determined from all the currently selected targets. This information may prove useful in optimizing high-throughput experimentation. Further information is available from http://mining.nesg.org/.
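The tree-based ranking described in the abstract can be illustrated with a minimal sketch. The Python below is not the authors' code or data: it scores each feature by the largest Gini impurity reduction that a single threshold split achieves (a one-level decision tree, not a random forest). The toy feature values, the labels, the function names, and the choice of D, E, K and R as the charged residues are all assumptions made for illustration.

```python
from collections import Counter


def charged_fraction(sequence):
    """Fraction of residues that are charged.

    Assumption: D, E, K and R count as charged; histidine is excluded.
    """
    charged = set("DEKR")
    return sum(1 for aa in sequence if aa in charged) / len(sequence)


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split_gain(values, labels):
    """Largest impurity reduction from one threshold on one feature.

    Returns (gain, threshold); threshold is None if no split helps.
    """
    parent = gini(labels)
    n = len(labels)
    best_gain, best_threshold = 0.0, None
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between neighbouring values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        gain = (parent
                - (len(left) / n) * gini(left)
                - (len(right) / n) * gini(right))
        if gain > best_gain:
            best_gain, best_threshold = gain, threshold
    return best_gain, best_threshold


def rank_features(rows, labels):
    """Sort feature names by their best single-split gain, highest first."""
    scored = []
    for name in rows[0].keys():
        gain, threshold = best_split_gain([r[name] for r in rows], labels)
        scored.append((name, gain, threshold))
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical targets: label 1 = structure determined, 0 = not.
    toy_rows = [
        {"length": 120, "charged_fraction": 0.30},
        {"length": 450, "charged_fraction": 0.28},
        {"length": 200, "charged_fraction": 0.10},
        {"length": 380, "charged_fraction": 0.12},
        {"length": 150, "charged_fraction": 0.27},
        {"length": 300, "charged_fraction": 0.15},
    ]
    toy_labels = [1, 1, 0, 0, 1, 0]
    for name, gain, threshold in rank_features(toy_rows, toy_labels):
        print(name, round(gain, 3), threshold)
```

A random forest, as used in the paper, averages importance estimates of this kind over many trees grown on resampled data, which makes the ranking more stable than a single split on a small table.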
Author_xml – sequence: 1
  givenname: Chern-Sing
  surname: Goh
  fullname: Goh, Chern-Sing
  organization: Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Ave, New Haven, CT 06520, USA
– sequence: 2
  givenname: Ning
  surname: Lan
  fullname: Lan, Ning
  organization: Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Ave, New Haven, CT 06520, USA
– sequence: 3
  givenname: Shawn M
  surname: Douglas
  fullname: Douglas, Shawn M
  organization: Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Ave, New Haven, CT 06520, USA
– sequence: 4
  givenname: Baolin
  surname: Wu
  fullname: Wu, Baolin
  organization: Department of Epidemiology and Public Health, Yale University, 266 Whitney Ave, New Haven, CT 06520, USA
– sequence: 5
  givenname: Nathaniel
  surname: Echols
  fullname: Echols, Nathaniel
  organization: Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Ave, New Haven, CT 06520, USA
– sequence: 6
  givenname: Andrew
  surname: Smith
  fullname: Smith, Andrew
  organization: Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Ave, New Haven, CT 06520, USA
– sequence: 7
  givenname: Duncan
  surname: Milburn
  fullname: Milburn, Duncan
  organization: Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Ave, New Haven, CT 06520, USA
– sequence: 8
  givenname: Gaetano T
  surname: Montelione
  fullname: Montelione, Gaetano T
  organization: Northeast Structural Genomics Consortium, Robert Wood Johnson Medical School, UMDNJ, Piscataway, NJ 08854, USA
– sequence: 9
  givenname: Hongyu
  surname: Zhao
  fullname: Zhao, Hongyu
  organization: Department of Epidemiology and Public Health, Yale University, 266 Whitney Ave, New Haven, CT 06520, USA
– sequence: 10
  givenname: Mark
  surname: Gerstein
  fullname: Gerstein, Mark
  email: mark.gerstein@yale.edu
  organization: Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Ave, New Haven, CT 06520, USA
BackLink https://www.ncbi.nlm.nih.gov/pubmed/14741208$$D View this record in MEDLINE/PubMed
CitedBy_id crossref_primary_10_1016_j_jsb_2010_07_011
crossref_primary_10_1146_annurev_biophys_050708_133740
crossref_primary_10_1016_j_molbiopara_2006_03_011
crossref_primary_10_1007_s10969_008_9039_6
crossref_primary_10_1093_bib_bbs034
crossref_primary_10_1016_j_jmb_2008_03_020
crossref_primary_10_1093_bioinformatics_btn055
crossref_primary_10_1080_0889311X_2014_973868
crossref_primary_10_1093_bioinformatics_btaa791
crossref_primary_10_1093_bioinformatics_btz294
crossref_primary_10_1002_prot_21605
crossref_primary_10_1016_j_molimm_2018_04_006
crossref_primary_10_1371_journal_pone_0105902
crossref_primary_10_1016_j_compchemeng_2019_106533
crossref_primary_10_3934_bdia_2016_1_31
crossref_primary_10_1039_b805710a
crossref_primary_10_1110_ps_073037907
crossref_primary_10_1038_nsmb0404_296
crossref_primary_10_1039_D2ME00150K
crossref_primary_10_1016_j_jmb_2008_11_021
crossref_primary_10_1039_c3mb70033j
crossref_primary_10_1186_1471_2105_15_134
crossref_primary_10_1007_s10969_008_9042_y
crossref_primary_10_3390_ijms23074024
crossref_primary_10_1016_j_str_2004_12_010
crossref_primary_10_1093_bioinformatics_btp386
crossref_primary_10_1016_j_jmb_2005_03_037
crossref_primary_10_1016_j_jmb_2006_10_004
crossref_primary_10_4161_bioe_23003
crossref_primary_10_1038_nrmicro978
crossref_primary_10_1096_fj_09_139527
crossref_primary_10_1093_bioinformatics_btm477
crossref_primary_10_1016_j_copbio_2006_07_004
crossref_primary_10_1107_S1399004713032070
crossref_primary_10_1107_S1399004714019427
crossref_primary_10_4155_pbp_14_23
crossref_primary_10_1371_journal_pone_0072368
crossref_primary_10_1016_j_febslet_2006_06_015
crossref_primary_10_1002_ejlt_201000360
crossref_primary_10_1093_bioinformatics_btl623
crossref_primary_10_1007_s12257_018_0143_6
crossref_primary_10_1016_j_ab_2022_115020
crossref_primary_10_1016_j_biochi_2004_09_013
crossref_primary_10_1042_BST0380908
crossref_primary_10_1016_j_tibtech_2006_02_007
crossref_primary_10_1016_j_bej_2023_109188
crossref_primary_10_1007_s10930_022_10074_5
crossref_primary_10_1038_nbt1044
crossref_primary_10_1021_mp500759p
crossref_primary_10_1186_s12859_017_1995_z
crossref_primary_10_3389_fbioe_2021_630551
crossref_primary_10_1074_jbc_M112_366351
crossref_primary_10_1002_prot_22914
crossref_primary_10_1007_s10969_007_9032_5
crossref_primary_10_1016_j_compbiomed_2015_09_015
crossref_primary_10_1038_nbt_1514
crossref_primary_10_1016_j_ymeth_2011_08_014
crossref_primary_10_1016_j_jmb_2004_09_076
crossref_primary_10_1038_nmeth0208_129
crossref_primary_10_1016_j_bcp_2005_12_024
crossref_primary_10_1186_1471_2105_11_S1_S21
crossref_primary_10_1016_j_eng_2024_01_028
crossref_primary_10_1529_biophysj_106_094045
crossref_primary_10_1002_prot_21191
crossref_primary_10_1016_j_ces_2014_09_044
crossref_primary_10_3389_fmicb_2014_00295
crossref_primary_10_1110_ps_041009005
crossref_primary_10_1007_s10969_012_9124_8
crossref_primary_10_1002_cfg_354
crossref_primary_10_1016_j_ejphar_2020_172912
crossref_primary_10_1002_cam4_62
crossref_primary_10_1016_j_sbi_2006_05_003
crossref_primary_10_1093_bioinformatics_btx207
crossref_primary_10_1111_j_1742_4658_2012_08603_x
crossref_primary_10_1016_j_rse_2005_10_014
crossref_primary_10_1007_s10969_005_9003_7
crossref_primary_10_1016_j_pep_2006_06_024
crossref_primary_10_1016_j_jmb_2017_12_010
crossref_primary_10_1002_prot_20789
crossref_primary_10_1097_01_md_0000189818_63141_8c
crossref_primary_10_1371_journal_pone_0064893
crossref_primary_10_1002_0471140864_ps0524s61
crossref_primary_10_1002_cbic_200900144
crossref_primary_10_1097_01_md_0000217525_82287_eb
crossref_primary_10_1093_jb_mvr042
crossref_primary_10_1021_bi300653m
crossref_primary_10_1186_2042_5783_1_6
crossref_primary_10_2174_0929866529666220715101357
crossref_primary_10_1002_jmr_2980
crossref_primary_10_1186_1472_6807_8_2
crossref_primary_10_1016_j_sbi_2004_04_005
crossref_primary_10_3389_fgene_2021_709514
crossref_primary_10_1007_s10989_019_09901_8
crossref_primary_10_1093_bioinformatics_btr229
crossref_primary_10_1007_s10969_005_2897_2
crossref_primary_10_1093_bioinformatics_bti810
Cites_doi 10.1038/nbt732
10.1016/S1088-467X(97)00008-5
10.1038/88529
10.1093/nar/28.1.289
10.1126/science.298.5595.948
10.1016/0955-0674(90)90100-S
10.1093/nar/27.1.44
10.1016/S1367-5931(02)00019-4
10.1126/science.287.5450.116
10.1016/S0020-7373(87)80053-6
10.1006/jmbi.1999.3110
10.1093/bioinformatics/16.5.465
10.1038/80747
10.1093/nar/gkg056
10.1021/bi00429a001
10.1038/nbt0901-805
10.1126/science.287.5460.1954
10.1038/35001009
10.1073/pnas.061034498
10.1023/A:1010933404324
10.1016/S0959-440X(02)00289-0
10.1016/S0076-6879(96)66035-2
10.1126/science.278.5338.631
10.1093/nar/29.1.242
10.1110/ps.9.1.197
10.1093/bib/3.3.265
10.1146/annurev.bb.15.060186.001541
10.1038/35093574
10.1038/80700
10.1093/nar/28.1.37
10.1093/nar/30.1.303
10.1101/gr.1774904
10.1038/80776
10.1093/nar/25.1.28
10.1038/35048107
10.1126/science.1064987
10.1093/nar/30.1.31
10.1093/nar/gkg397
10.1110/ps.4570102
10.1093/nar/26.1.33
10.1126/science.1332192
10.1093/nar/29.13.2884
10.1002/1615-9861(200210)2:10<1435::AID-PROT1435>3.0.CO;2-9
10.1093/nar/29.1.239
10.1002/prot.10282
10.1038/415141a
ContentType Journal Article
Copyright 2003 Elsevier Ltd
Copyright_xml – notice: 2003 Elsevier Ltd
DOI 10.1016/j.jmb.2003.11.053
Discipline Chemistry
Biology
EISSN 1089-8638
EndPage 130
ExternalDocumentID 10_1016_j_jmb_2003_11_053
14741208
S0022283603014748
Genre Research Support, U.S. Gov't, Non-P.H.S
Research Support, U.S. Gov't, P.H.S
Journal Article
GrantInformation_xml – fundername: NIGMS NIH HHS
  grantid: 5P50GM062413-03
ISSN 0022-2836
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords structural genomics
hydrophobicity
oob, out-of-bag
COGs
charged residues
decision trees
NLS, nuclear localization signal
COG, clusters of orthologous groups
Language English
PMID 14741208
PageCount 16
PublicationCentury 2000
PublicationDate 2004-02-06
PublicationDateYYYYMMDD 2004-02-06
PublicationDate_xml – month: 02
  year: 2004
  text: 2004-02-06
  day: 06
PublicationDecade 2000
PublicationPlace England
PublicationPlace_xml – name: England
PublicationTitle Journal of molecular biology
PublicationTitleAlternate J Mol Biol
PublicationYear 2004
Publisher Elsevier Ltd
Publisher_xml – name: Elsevier Ltd
Sali (10.1016/j.jmb.2003.11.053_BIB27) 2001; 8
Gattiker (10.1016/j.jmb.2003.11.053_BIB30) 2002; 2
Mewes (10.1016/j.jmb.2003.11.053_BIB38) 1998; 26
Pedelacq (10.1016/j.jmb.2003.11.053_BIB7) 2002; 20
Wootton (10.1016/j.jmb.2003.11.053_BIB29) 1996; 266
Ito (10.1016/j.jmb.2003.11.053_BIB42) 2001; 98
Tong (10.1016/j.jmb.2003.11.053_BIB40) 2002; 295
Bertone (10.1016/j.jmb.2003.11.053_BIB10) 2001; 29
Terwilliger (10.1016/j.jmb.2003.11.053_BIB8) 2000; 7
Rapoport (10.1016/j.jmb.2003.11.053_BIB26) 1992; 258
Goh (10.1016/j.jmb.2003.11.053_BIB11) 2003; 31
Ihaka (10.1016/j.jmb.2003.11.053_BIB28) 1996; 5
Brenner (10.1016/j.jmb.2003.11.053_BIB4) 2000; 7
Dunker (10.1016/j.jmb.2003.11.053_BIB23) 2001; 19
Tatusov (10.1016/j.jmb.2003.11.053_BIB17) 1997; 278
Bader (10.1016/j.jmb.2003.11.053_BIB49) 2001; 29
Xenarios (10.1016/j.jmb.2003.11.053_BIB44) 2000; 28
Sigrist (10.1016/j.jmb.2003.11.053_BIB31) 2002; 3
Breiman (10.1016/j.jmb.2003.11.053_BIB12) 2001; 45
Engelman (10.1016/j.jmb.2003.11.053_BIB19) 1986; 15
Mewes (10.1016/j.jmb.2003.11.053_BIB39) 1997; 25
Brenner (10.1016/j.jmb.2003.11.053_BIB2) 2000; 9
Mewes (10.1016/j.jmb.2003.11.053_BIB36) 2000; 28
Chance (10.1016/j.jmb.2003.11.053_BIB9) 2002; 11
Dyson (10.1016/j.jmb.2003.11.053_BIB21) 2002; 12
Savchenko (10.1016/j.jmb.2003.11.053_BIB18) 2003; 50
Dash (10.1016/j.jmb.2003.11.053_BIB16) 1997; 1
Uetz (10.1016/j.jmb.2003.11.053_BIB41) 2000; 403
Breiman (10.1016/j.jmb.2003.11.053_BIB13) 2002
Walhout (10.1016/j.jmb.2003.11.053_BIB32) 2000; 287
Walhout (10.1016/j.jmb.2003.11.053_BIB33) 2001; 2
Bader (10.1016/j.jmb.2003.11.053_BIB48) 2000; 16
Wright (10.1016/j.jmb.2003.11.053_BIB20) 1999; 293
Bader (10.1016/j.jmb.2003.11.053_BIB47) 2003; 31
Xenarios (10.1016/j.jmb.2003.11.053_BIB43) 2002; 30
Mewes (10.1016/j.jmb.2003.11.053_BIB35) 2002; 30
Gavin (10.1016/j.jmb.2003.11.053_BIB46) 2002; 415
von Heijne (10.1016/j.jmb.2003.11.053_BIB25) 1990; 2
Service (10.1016/j.jmb.2003.11.053_BIB1) 2000; 287
Sanchez (10.1016/j.jmb.2003.11.053_BIB3) 2000; 7
Brenner (10.1016/j.jmb.2003.11.053_BIB5) 2001; 2
Mewes (10.1016/j.jmb.2003.11.053_BIB37) 1999; 27
Quinlan (10.1016/j.jmb.2003.11.053_BIB15) 1993
Gierasch (10.1016/j.jmb.2003.11.053_BIB24) 1989; 28
Yokoyama (10.1016/j.jmb.2003.11.053_BIB22) 2003; 7
Xenarios (10.1016/j.jmb.2003.11.053_BIB45) 2001; 29
Quinlan (10.1016/j.jmb.2003.11.053_BIB14) 1987; 27
Service (10.1016/j.jmb.2003.11.053_BIB6) 2002; 298
Yu (10.1016/j.jmb.2003.11.053_BIB34) 2004
References_xml – volume: 403
  start-page: 623
  year: 2000
  end-page: 627
  ident: BIB41
  article-title: A comprehensive analysis of protein-protein interactions in
  publication-title: Nature
  contributor:
    fullname: Knight
– volume: 415
  start-page: 141
  year: 2002
  end-page: 147
  ident: BIB46
  article-title: Functional organization of the yeast proteome by systematic analysis of protein complexes
  publication-title: Nature
  contributor:
    fullname: Bauer
– volume: 50
  start-page: 392
  year: 2003
  end-page: 399
  ident: BIB18
  article-title: Strategies for structural proteomics of prokaryotes: quantifying the advantages of studying orthologous proteins and of using both NMR and X-ray crystallography approaches
  publication-title: Proteins: Struct. Funct. Genet.
  contributor:
    fullname: Pavlova
– year: 1993
  ident: BIB15
  publication-title: C4.5: Programs for Machine Learning
  contributor:
    fullname: Quinlan
– volume: 8
  start-page: 482
  year: 2001
  end-page: 484
  ident: BIB27
  article-title: Target practice
  publication-title: Nature Struct. Biol.
  contributor:
    fullname: Sali
– volume: 19
  start-page: 805
  year: 2001
  end-page: 806
  ident: BIB23
  article-title: The protein trinity—linking function and disorder
  publication-title: Nature Biotechnol.
  contributor:
    fullname: Obradovic
– volume: 28
  start-page: 289
  year: 2000
  end-page: 291
  ident: BIB44
  article-title: DIP: the database of interacting proteins
  publication-title: Nucl. Acids Res.
  contributor:
    fullname: Eisenberg
– volume: 29
  start-page: 239
  year: 2001
  end-page: 241
  ident: BIB45
  article-title: DIP: The Database of Interacting Proteins: update
  publication-title: Nucl. Acids Res.
  contributor:
    fullname: Eisenberg
– volume: 7
  start-page: 935
  year: 2000
  end-page: 939
  ident: BIB8
  article-title: Structural genomics in North America
  publication-title: Nature Struct. Biol.
  contributor:
    fullname: Terwilliger
– volume: 45
  start-page: 5
  year: 2001
  end-page: 32
  ident: BIB12
  article-title: Random forests
  publication-title: Machine Learn.
  contributor:
    fullname: Breiman
– volume: 30
  start-page: 31
  year: 2002
  end-page: 34
  ident: BIB35
  article-title: MIPS: a database for genomes and protein sequences
  publication-title: Nucl. Acids Res.
  contributor:
    fullname: Mokrejs
– volume: 16
  start-page: 465
  year: 2000
  end-page: 477
  ident: BIB48
  article-title: BIND—a data specification for storing and describing biomolecular interactions, molecular complexes and pathways
  publication-title: Bioinformatics
  contributor:
    fullname: Hogue
– volume: 258
  start-page: 931
  year: 1992
  end-page: 936
  ident: BIB26
  article-title: Transport of proteins across the endoplasmic reticulum membrane
  publication-title: Science
  contributor:
    fullname: Rapoport
– volume: 15
  start-page: 321
  year: 1986
  end-page: 353
  ident: BIB19
  article-title: Identifying nonpolar transbilayer helices in amino acid sequences of membrane proteins
  publication-title: Annu. Rev. Biophys. Biophys. Chem.
  contributor:
    fullname: Goldman
– volume: 7
  start-page: 39
  year: 2003
  end-page: 43
  ident: BIB22
  article-title: Protein expression systems for structural genomics and proteomics
  publication-title: Curr. Opin. Chem. Biol.
  contributor:
    fullname: Yokoyama
– volume: 11
  start-page: 723
  year: 2002
  end-page: 738
  ident: BIB9
  article-title: Structural genomics: a pipeline for providing structures for the biologist
  publication-title: Protein Sci.
  contributor:
    fullname: Sali
– volume: 3
  start-page: 265
  year: 2002
  end-page: 274
  ident: BIB31
  article-title: PROSITE: a documented database using patterns and profiles as motif descriptors
  publication-title: Brief Bioinform.
  contributor:
    fullname: Pagni
– volume: 29
  start-page: 2884
  year: 2001
  end-page: 2898
  ident: BIB10
  article-title: SPINE: an integrated tracking database and data mining approach for identifying feasible targets in high-throughput structural proteomics
  publication-title: Nucl. Acids Res.
  contributor:
    fullname: Yee
– volume: 29
  start-page: 242
  year: 2001
  end-page: 245
  ident: BIB49
  article-title: BIND—the biomolecular interaction network database
  publication-title: Nucl. Acids Res.
  contributor:
    fullname: Hogue
– volume: 2
  start-page: 55
  year: 2001
  end-page: 62
  ident: BIB33
  article-title: Protein interaction maps for model organisms
  publication-title: Nature Rev. Mol. Cell Biol.
  contributor:
    fullname: Vidal
– year: 2004
  ident: BIB34
  article-title: Annotation transfer for genomics: assessing the transferability of protein–protein and protein–DNA interactions between organisms
  publication-title: Genome Res.
  contributor:
    fullname: Gerstein
– volume: 298
  start-page: 948
  year: 2002
  end-page: 950
  ident: BIB6
  article-title: Structural genomics. Tapping DNA for structures produces a trickle
  publication-title: Science
  contributor:
    fullname: Service
– volume: 295
  start-page: 321
  year: 2002
  end-page: 324
  ident: BIB40
  article-title: A combined experimental and computational strategy to define protein interaction networks for peptide recognition modules
  publication-title: Science
  contributor:
    fullname: Castagnoli
– volume: 98
  start-page: 4569
  year: 2001
  end-page: 4574
  ident: BIB42
  article-title: A comprehensive two-hybrid analysis to explore the yeast protein interactome
  publication-title: Proc. Natl Acad. Sci. USA
  contributor:
    fullname: Sakaki
– volume: 28
  start-page: 37
  year: 2000
  end-page: 40
  ident: BIB36
  article-title: MIPS: a database for genomes and protein sequences
  publication-title: Nucl. Acids Res.
  contributor:
    fullname: Kaps
– volume: 30
  start-page: 303
  year: 2002
  end-page: 305
  ident: BIB43
  article-title: DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions
  publication-title: Nucl. Acids Res.
  contributor:
    fullname: Eisenberg
– volume: 287
  start-page: 1954
  year: 2000
  end-page: 1956
  ident: BIB1
  article-title: Structural genomics offers high-speed look at proteins
  publication-title: Science
  contributor:
    fullname: Service
– volume: 7
  start-page: 967
  year: 2000
  end-page: 969
  ident: BIB4
  article-title: Target selection for structural genomics
  publication-title: Nature Struct. Biol.
  contributor:
    fullname: Brenner
– volume: 2
  start-page: 801
  year: 2001
  end-page: 809
  ident: BIB5
  article-title: A tour of structural genomics
  publication-title: Nature Rev. Genet.
  contributor:
    fullname: Brenner
– volume: 20
  start-page: 927
  year: 2002
  end-page: 932
  ident: BIB7
  article-title: Engineering soluble proteins for structural genomics
  publication-title: Nature Biotechnol.
  contributor:
    fullname: Rho
– volume: 25
  start-page: 28
  year: 1997
  end-page: 30
  ident: BIB39
  article-title: MIPS: a database for protein sequences, homology data and yeast genome information
  publication-title: Nucl. Acids Res.
  contributor:
    fullname: Pfeiffer
– volume: 278
  start-page: 631
  year: 1997
  end-page: 637
  ident: BIB17
  article-title: A genomic perspective on protein families
  publication-title: Science
  contributor:
    fullname: Lipman
– volume: 28
  start-page: 923
  year: 1989
  end-page: 930
  ident: BIB24
  article-title: Signal sequences
  publication-title: Biochemistry
  contributor:
    fullname: Gierasch
– volume: 1
  start-page: 131
  year: 1997
  end-page: 156
  ident: BIB16
  article-title: Feature selection for classification
  publication-title: Intelligent Data Anal.
  contributor:
    fullname: Liu
– volume: 2
  start-page: 604
  year: 1990
  end-page: 608
  ident: BIB25
  article-title: Protein targeting signals
  publication-title: Curr. Opin. Cell Biol.
  contributor:
    fullname: von Heijne
– volume: 2
  start-page: 1435
  year: 2002
  end-page: 1444
  ident: BIB30
  article-title: FindPept, a tool to identify unmatched masses in peptide mass fingerprinting protein identification
  publication-title: Proteomics
  contributor:
    fullname: Gasteiger
– volume: 26
  start-page: 33
  year: 1998
  end-page: 37
  ident: BIB38
  article-title: MIPS: a database for protein sequences and complete genomes
  publication-title: Nucl. Acids Res.
  contributor:
    fullname: Frishman
– volume: 27
  start-page: 221
  year: 1987
  end-page: 234
  ident: BIB14
  article-title: Simplifying decision trees
  publication-title: Int. J. Man-Machine Stud.
  contributor:
    fullname: Quinlan
– volume: 287
  start-page: 116
  year: 2000
  end-page: 122
  ident: BIB32
  article-title: Protein interaction mapping in C. elegans using proteins involved in vulval development
  publication-title: Science
  contributor:
    fullname: Brasch
– volume: 9
  start-page: 197
  year: 2000
  end-page: 200
  ident: BIB2
  article-title: Expectations from structural genomics
  publication-title: Protein Sci.
  contributor:
    fullname: Levitt
– year: 2002
  ident: BIB13
  publication-title: IMS Wald Lecture 2
  contributor:
    fullname: Breiman
– volume: 293
  start-page: 321
  year: 1999
  end-page: 331
  ident: BIB20
  article-title: Intrinsically unstructured proteins: re-assessing the protein structure–function paradigm
  publication-title: J. Mol. Biol.
  contributor:
    fullname: Dyson
– volume: 27
  start-page: 44
  year: 1999
  end-page: 48
  ident: BIB37
  article-title: MIPS: a database for genomes and protein sequences
  publication-title: Nucl. Acids Res.
  contributor:
    fullname: Frishman
– volume: 31
  start-page: 248
  year: 2003
  end-page: 250
  ident: BIB47
  article-title: BIND: the Biomolecular Interaction Network Database
  publication-title: Nucl. Acids Res.
  contributor:
    fullname: Hogue
– volume: 7
  start-page: 986
  year: 2000
  end-page: 990
  ident: BIB3
  article-title: Protein structure modeling for structural genomics
  publication-title: Nature Struct. Biol.
  contributor:
    fullname: Madhusudhan
– volume: 5
  start-page: 299
  year: 1996
  end-page: 314
  ident: BIB28
  article-title: R: a language for data analysis and graphics
  publication-title: J. Comput. Graph. Stat.
  contributor:
    fullname: Gentleman
– volume: 266
  start-page: 554
  year: 1996
  end-page: 571
  ident: BIB29
  article-title: Analysis of compositionally biased regions in sequence databases
  publication-title: Methods Enzymol.
  contributor:
    fullname: Federhen
– volume: 31
  start-page: 2833
  year: 2003
  end-page: 2838
  ident: BIB11
  article-title: SPINE 2: a system for collaborative structural proteomics within a federated database framework
  publication-title: Nucl. Acids Res.
  contributor:
    fullname: Bertone
– volume: 12
  start-page: 54
  year: 2002
  end-page: 60
  ident: BIB21
  article-title: Coupling of folding and binding for unstructured proteins
  publication-title: Curr. Opin. Struct. Biol.
  contributor:
    fullname: Wright
– volume: 20
  start-page: 927
  year: 2002
  ident: 10.1016/j.jmb.2003.11.053_BIB7
  article-title: Engineering soluble proteins for structural genomics
  publication-title: Nature Biotechnol.
  doi: 10.1038/nbt732
  contributor:
    fullname: Pedelacq
– volume: 1
  start-page: 131
  year: 1997
  ident: 10.1016/j.jmb.2003.11.053_BIB16
  article-title: Feature selection for classification
  publication-title: Intelligent Data Anal.
  doi: 10.1016/S1088-467X(97)00008-5
  contributor:
    fullname: Dash
– volume: 8
  start-page: 482
  year: 2001
  ident: 10.1016/j.jmb.2003.11.053_BIB27
  article-title: Target practice
  publication-title: Nature Struct. Biol.
  doi: 10.1038/88529
  contributor:
    fullname: Sali
– volume: 28
  start-page: 289
  year: 2000
  ident: 10.1016/j.jmb.2003.11.053_BIB44
  article-title: DIP: the database of interacting proteins
  publication-title: Nucl. Acids Res.
  doi: 10.1093/nar/28.1.289
  contributor:
    fullname: Xenarios
– volume: 298
  start-page: 948
  year: 2002
  ident: 10.1016/j.jmb.2003.11.053_BIB6
  article-title: Structural genomics. Tapping DNA for structures produces a trickle
  publication-title: Science
  doi: 10.1126/science.298.5595.948
  contributor:
    fullname: Service
– volume: 2
  start-page: 604
  year: 1990
  ident: 10.1016/j.jmb.2003.11.053_BIB25
  article-title: Protein targeting signals
  publication-title: Curr. Opin. Cell Biol.
  doi: 10.1016/0955-0674(90)90100-S
  contributor:
    fullname: von Heijne
– volume: 27
  start-page: 44
  year: 1999
  ident: 10.1016/j.jmb.2003.11.053_BIB37
  article-title: MIPS: a database for genomes and protein sequences
  publication-title: Nucl. Acids Res.
  doi: 10.1093/nar/27.1.44
  contributor:
    fullname: Mewes
– volume: 7
  start-page: 39
  year: 2003
  ident: 10.1016/j.jmb.2003.11.053_BIB22
  article-title: Protein expression systems for structural genomics and proteomics
  publication-title: Curr. Opin. Chem. Biol.
  doi: 10.1016/S1367-5931(02)00019-4
  contributor:
    fullname: Yokoyama
– volume: 287
  start-page: 116
  year: 2000
  ident: 10.1016/j.jmb.2003.11.053_BIB32
  article-title: Protein interaction mapping in C. elegans using proteins involved in vulval development
  publication-title: Science
  doi: 10.1126/science.287.5450.116
  contributor:
    fullname: Walhout
– volume: 27
  start-page: 221
  year: 1987
  ident: 10.1016/j.jmb.2003.11.053_BIB14
  article-title: Simplifying decision trees
  publication-title: Int. J. Man-Machine Stud.
  doi: 10.1016/S0020-7373(87)80053-6
  contributor:
    fullname: Quinlan
– volume: 5
  start-page: 299
  year: 1996
  ident: 10.1016/j.jmb.2003.11.053_BIB28
  article-title: R: a language for data analysis and graphics
  publication-title: J. Comput. Graph. Stat.
  contributor:
    fullname: Ihaka
– volume: 293
  start-page: 321
  year: 1999
  ident: 10.1016/j.jmb.2003.11.053_BIB20
  article-title: Intrinsically unstructured proteins: re-assessing the protein structure–function paradigm
  publication-title: J. Mol. Biol.
  doi: 10.1006/jmbi.1999.3110
  contributor:
    fullname: Wright
– volume: 16
  start-page: 465
  year: 2000
  ident: 10.1016/j.jmb.2003.11.053_BIB48
  article-title: BIND—a data specification for storing and describing biomolecular interactions, molecular complexes and pathways
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/16.5.465
  contributor:
    fullname: Bader
– volume: 7
  start-page: 967
  year: 2000
  ident: 10.1016/j.jmb.2003.11.053_BIB4
  article-title: Target selection for structural genomics
  publication-title: Nature Struct. Biol.
  doi: 10.1038/80747
  contributor:
    fullname: Brenner
– volume: 31
  start-page: 248
  year: 2003
  ident: 10.1016/j.jmb.2003.11.053_BIB47
  article-title: BIND: the Biomolecular Interaction Network Database
  publication-title: Nucl. Acids Res.
  doi: 10.1093/nar/gkg056
  contributor:
    fullname: Bader
– volume: 28
  start-page: 923
  year: 1989
  ident: 10.1016/j.jmb.2003.11.053_BIB24
  article-title: Signal sequences
  publication-title: Biochemistry
  doi: 10.1021/bi00429a001
  contributor:
    fullname: Gierasch
– volume: 19
  start-page: 805
  year: 2001
  ident: 10.1016/j.jmb.2003.11.053_BIB23
  article-title: The protein trinity—linking function and disorder
  publication-title: Nature Biotechnol.
  doi: 10.1038/nbt0901-805
  contributor:
    fullname: Dunker
– volume: 287
  start-page: 1954
  year: 2000
  ident: 10.1016/j.jmb.2003.11.053_BIB1
  article-title: Structural genomics offers high-speed look at proteins
  publication-title: Science
  doi: 10.1126/science.287.5460.1954
  contributor:
    fullname: Service
– volume: 403
  start-page: 623
  year: 2000
  ident: 10.1016/j.jmb.2003.11.053_BIB41
  article-title: A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae
  publication-title: Nature
  doi: 10.1038/35001009
  contributor:
    fullname: Uetz
– volume: 98
  start-page: 4569
  year: 2001
  ident: 10.1016/j.jmb.2003.11.053_BIB42
  article-title: A comprehensive two-hybrid analysis to explore the yeast protein interactome
  publication-title: Proc. Natl Acad. Sci. USA
  doi: 10.1073/pnas.061034498
  contributor:
    fullname: Ito
– volume: 45
  start-page: 5
  year: 2001
  ident: 10.1016/j.jmb.2003.11.053_BIB12
  article-title: Random forests
  publication-title: Machine Learn.
  doi: 10.1023/A:1010933404324
  contributor:
    fullname: Breiman
– volume: 12
  start-page: 54
  year: 2002
  ident: 10.1016/j.jmb.2003.11.053_BIB21
  article-title: Coupling of folding and binding for unstructured proteins
  publication-title: Curr. Opin. Struct. Biol.
  doi: 10.1016/S0959-440X(02)00289-0
  contributor:
    fullname: Dyson
– volume: 266
  start-page: 554
  year: 1996
  ident: 10.1016/j.jmb.2003.11.053_BIB29
  article-title: Analysis of compositionally biased regions in sequence databases
  publication-title: Methods Enzymol.
  doi: 10.1016/S0076-6879(96)66035-2
  contributor:
    fullname: Wootton
– volume: 278
  start-page: 631
  year: 1997
  ident: 10.1016/j.jmb.2003.11.053_BIB17
  article-title: A genomic perspective on protein families
  publication-title: Science
  doi: 10.1126/science.278.5338.631
  contributor:
    fullname: Tatusov
– volume: 29
  start-page: 242
  year: 2001
  ident: 10.1016/j.jmb.2003.11.053_BIB49
  article-title: BIND—the biomolecular interaction network database
  publication-title: Nucl. Acids Res.
  doi: 10.1093/nar/29.1.242
  contributor:
    fullname: Bader
– volume: 9
  start-page: 197
  year: 2000
  ident: 10.1016/j.jmb.2003.11.053_BIB2
  article-title: Expectations from structural genomics
  publication-title: Protein Sci.
  doi: 10.1110/ps.9.1.197
  contributor:
    fullname: Brenner
– volume: 3
  start-page: 265
  year: 2002
  ident: 10.1016/j.jmb.2003.11.053_BIB31
  article-title: PROSITE: a documented database using patterns and profiles as motif descriptors
  publication-title: Brief Bioinform.
  doi: 10.1093/bib/3.3.265
  contributor:
    fullname: Sigrist
– volume: 15
  start-page: 321
  year: 1986
  ident: 10.1016/j.jmb.2003.11.053_BIB19
  article-title: Identifying nonpolar transbilayer helices in amino acid sequences of membrane proteins
  publication-title: Annu. Rev. Biophys. Biophys. Chem.
  doi: 10.1146/annurev.bb.15.060186.001541
  contributor:
    fullname: Engelman
– volume: 2
  start-page: 801
  year: 2001
  ident: 10.1016/j.jmb.2003.11.053_BIB5
  article-title: A tour of structural genomics
  publication-title: Nature Rev. Genet.
  doi: 10.1038/35093574
  contributor:
    fullname: Brenner
– volume: 7
  start-page: 935
  year: 2000
  ident: 10.1016/j.jmb.2003.11.053_BIB8
  article-title: Structural genomics in North America
  publication-title: Nature Struct. Biol.
  doi: 10.1038/80700
  contributor:
    fullname: Terwilliger
– volume: 28
  start-page: 37
  year: 2000
  ident: 10.1016/j.jmb.2003.11.053_BIB36
  article-title: MIPS: a database for genomes and protein sequences
  publication-title: Nucl. Acids Res.
  doi: 10.1093/nar/28.1.37
  contributor:
    fullname: Mewes
– volume: 30
  start-page: 303
  year: 2002
  ident: 10.1016/j.jmb.2003.11.053_BIB43
  article-title: DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions
  publication-title: Nucl. Acids Res.
  doi: 10.1093/nar/30.1.303
  contributor:
    fullname: Xenarios
– year: 1993
  ident: 10.1016/j.jmb.2003.11.053_BIB15
  contributor:
    fullname: Quinlan
– year: 2004
  ident: 10.1016/j.jmb.2003.11.053_BIB34
  article-title: Annotation transfer for genomics: assessing the transferability of protein–protein and protein–DNA interactions between organisms
  publication-title: Genome Res.
  doi: 10.1101/gr.1774904
  contributor:
    fullname: Yu
– volume: 7
  start-page: 986
  year: 2000
  ident: 10.1016/j.jmb.2003.11.053_BIB3
  article-title: Protein structure modeling for structural genomics
  publication-title: Nature Struct. Biol.
  doi: 10.1038/80776
  contributor:
    fullname: Sanchez
– volume: 25
  start-page: 28
  year: 1997
  ident: 10.1016/j.jmb.2003.11.053_BIB39
  article-title: MIPS: a database for protein sequences, homology data and yeast genome information
  publication-title: Nucl. Acids Res.
  doi: 10.1093/nar/25.1.28
  contributor:
    fullname: Mewes
– volume: 2
  start-page: 55
  year: 2001
  ident: 10.1016/j.jmb.2003.11.053_BIB33
  article-title: Protein interaction maps for model organisms
  publication-title: Nature Rev. Mol. Cell Biol.
  doi: 10.1038/35048107
  contributor:
    fullname: Walhout
– volume: 295
  start-page: 321
  year: 2002
  ident: 10.1016/j.jmb.2003.11.053_BIB40
  article-title: A combined experimental and computational strategy to define protein interaction networks for peptide recognition modules
  publication-title: Science
  doi: 10.1126/science.1064987
  contributor:
    fullname: Tong
– volume: 30
  start-page: 31
  year: 2002
  ident: 10.1016/j.jmb.2003.11.053_BIB35
  article-title: MIPS: a database for genomes and protein sequences
  publication-title: Nucl. Acids Res.
  doi: 10.1093/nar/30.1.31
  contributor:
    fullname: Mewes
– volume: 31
  start-page: 2833
  year: 2003
  ident: 10.1016/j.jmb.2003.11.053_BIB11
  article-title: SPINE 2: a system for collaborative structural proteomics within a federated database framework
  publication-title: Nucl. Acids Res.
  doi: 10.1093/nar/gkg397
  contributor:
    fullname: Goh
– volume: 11
  start-page: 723
  year: 2002
  ident: 10.1016/j.jmb.2003.11.053_BIB9
  article-title: Structural genomics: a pipeline for providing structures for the biologist
  publication-title: Protein Sci.
  doi: 10.1110/ps.4570102
  contributor:
    fullname: Chance
– volume: 26
  start-page: 33
  year: 1998
  ident: 10.1016/j.jmb.2003.11.053_BIB38
  article-title: MIPS: a database for protein sequences and complete genomes
  publication-title: Nucl. Acids Res.
  doi: 10.1093/nar/26.1.33
  contributor:
    fullname: Mewes
– volume: 258
  start-page: 931
  year: 1992
  ident: 10.1016/j.jmb.2003.11.053_BIB26
  article-title: Transport of proteins across the endoplasmic reticulum membrane
  publication-title: Science
  doi: 10.1126/science.1332192
  contributor:
    fullname: Rapoport
– volume: 29
  start-page: 2884
  year: 2001
  ident: 10.1016/j.jmb.2003.11.053_BIB10
  article-title: SPINE: an integrated tracking database and data mining approach for identifying feasible targets in high-throughput structural proteomics
  publication-title: Nucl. Acids Res.
  doi: 10.1093/nar/29.13.2884
  contributor:
    fullname: Bertone
– volume: 2
  start-page: 1435
  year: 2002
  ident: 10.1016/j.jmb.2003.11.053_BIB30
  article-title: FindPept, a tool to identify unmatched masses in peptide mass fingerprinting protein identification
  publication-title: Proteomics
  doi: 10.1002/1615-9861(200210)2:10<1435::AID-PROT1435>3.0.CO;2-9
  contributor:
    fullname: Gattiker
– year: 2002
  ident: 10.1016/j.jmb.2003.11.053_BIB13
  contributor:
    fullname: Breiman
– volume: 29
  start-page: 239
  year: 2001
  ident: 10.1016/j.jmb.2003.11.053_BIB45
  article-title: DIP: The Database of Interacting Proteins: update
  publication-title: Nucl. Acids Res.
  doi: 10.1093/nar/29.1.239
  contributor:
    fullname: Xenarios
– volume: 50
  start-page: 392
  year: 2003
  ident: 10.1016/j.jmb.2003.11.053_BIB18
  article-title: Strategies for structural proteomics of prokaryotes: quantifying the advantages of studying orthologous proteins and of using both NMR and X-ray crystallography approaches
  publication-title: Proteins: Struct. Funct. Genet.
  doi: 10.1002/prot.10282
  contributor:
    fullname: Savchenko
– volume: 415
  start-page: 141
  year: 2002
  ident: 10.1016/j.jmb.2003.11.053_BIB46
  article-title: Functional organization of the yeast proteome by systematic analysis of protein complexes
  publication-title: Nature
  doi: 10.1038/415141a
  contributor:
    fullname: Gavin
StartPage 115
SubjectTerms Algorithms
charged residues
COGs
Computational Biology
Databases, Protein
Decision Trees
Genomics
hydrophobicity
Protein Conformation
Protein Sorting Signals
Proteins - chemistry
Proteins - genetics
Sequence Analysis, Protein
structural genomics
Title Mining the Structural Genomics Pipeline: Identification of Protein Properties that Affect High-throughput Experimental Analysis
URI https://dx.doi.org/10.1016/j.jmb.2003.11.053
https://www.ncbi.nlm.nih.gov/pubmed/14741208
https://search.proquest.com/docview/754865282
https://search.proquest.com/docview/80118614
Volume 336