SubMito-XGBoost: predicting protein submitochondrial localization by fusing multiple feature information and eXtreme gradient boosting

Bibliographic Details
Published in Bioinformatics Vol. 36; no. 4; pp. 1074-1081
Main Authors Yu, Bin, Qiu, Wenying, Chen, Cheng, Ma, Anjun, Jiang, Jing, Zhou, Hongyan, Ma, Qin
Format Journal Article
Language English
Published England Oxford University Press 15.02.2020
Abstract

Motivation
Mitochondria are essential organelles in most eukaryotes. They not only play an important role in energy metabolism but also take part in many critical cytopathological processes. Abnormal mitochondria can trigger a series of human diseases, such as Parkinson's disease, multifactor disorder and Type-II diabetes. Protein submitochondrial localization enables the understanding of protein function in studying disease pathogenesis and drug design.

Results
We propose a new method, SubMito-XGBoost, for protein submitochondrial localization prediction. It comprises three steps: (i) the g-gap dipeptide composition (g-gap DC), pseudo-amino acid composition (PseAAC), auto-correlation function (ACF) and Bi-gram position-specific scoring matrix (Bi-gram PSSM) are employed to extract protein sequence features; (ii) the Synthetic Minority Oversampling Technique (SMOTE) is used to balance samples, and the ReliefF algorithm is applied for feature selection; and (iii) the resulting feature vectors are fed into XGBoost to predict protein submitochondrial locations. Under leave-one-out cross-validation (LOOCV), SubMito-XGBoost obtains satisfactory prediction results compared with existing methods. The prediction accuracies of SubMito-XGBoost on the two training datasets M317 and M983 were 97.7% and 98.9%, which are 2.8–12.5% and 3.8–9.9% higher than other methods, respectively. The prediction accuracy on the independent test set M495 was 94.8%, which is significantly better than that reported in existing studies. The proposed method also achieves satisfactory predictive performance on plant and non-plant protein submitochondrial datasets. SubMito-XGBoost also plays an important role in new drug design for the treatment of related diseases.

Availability and implementation
The source codes and data are publicly available at https://github.com/QUST-AIBBDRC/SubMito-XGBoost/.

Supplementary information
Supplementary data are available at Bioinformatics online.
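
The first feature type named in the abstract, the g-gap dipeptide composition, can be illustrated with a short Python sketch. This is not the authors' code (their implementation is in the repository linked above). The function name, the residue ordering and the choice to normalize by the number of valid residue pairs are assumptions made here for illustration.

```python
from itertools import product

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_gap_dipeptide_composition(sequence, g=0):
    """Return a 400-dimensional list of g-gap dipeptide frequencies.

    A g-gap dipeptide is a pair of residues separated by g intervening
    positions; g=0 gives the ordinary (adjacent) dipeptide composition.
    Frequencies are normalized by the number of valid pairs found, which
    is an illustrative choice and may differ from the published method.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(sequence) - g - 1):
        pair = sequence[i] + sequence[i + g + 1]
        if pair in counts:  # skip pairs that contain a non-standard residue
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(pairs)
    return [counts[p] / total for p in pairs]
```

Calling `g_gap_dipeptide_composition(seq, g)` for several values of `g` and concatenating the results is one plausible way to build a multi-gap descriptor; which gap values the paper actually uses is not stated in this record.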
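
The leave-one-out cross-validation (LOOCV) protocol mentioned in the abstract can be sketched in plain Python as follows. To keep the example dependency-free, a 1-nearest-neighbor rule stands in for XGBoost; the function names, the toy feature vectors and the location labels are all hypothetical and are not taken from the paper.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Stand-in classifier (not XGBoost): label of the closest training vector."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    best = min(range(len(train_x)), key=lambda i: sq_dist(train_x[i], query))
    return train_y[best]


def loocv_accuracy(features, labels, predict=nearest_neighbor_predict):
    """Hold out each sample once, fit on the rest, and return the accuracy."""
    correct = 0
    for i in range(len(features)):
        # Training set is every sample except the i-th one.
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Toy data: two well-separated clusters with hypothetical location labels.
toy_x = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [1.0, 0.9], [0.9, 1.0], [1.1, 1.0]]
toy_y = ["inner", "inner", "inner", "matrix", "matrix", "matrix"]
toy_accuracy = loocv_accuracy(toy_x, toy_y)
```

Because every sample is held out exactly once, LOOCV needs as many model fits as there are samples, which is affordable for datasets of a few hundred to a thousand sequences such as M317 and M983.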
Author_xml – sequence: 1
  givenname: Bin
  surname: Yu
  fullname: Yu, Bin
  email: yubin@qust.edu.cn
  organization: College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
– sequence: 2
  givenname: Wenying
  surname: Qiu
  fullname: Qiu, Wenying
  organization: College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
– sequence: 3
  givenname: Cheng
  surname: Chen
  fullname: Chen, Cheng
  organization: College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
– sequence: 4
  givenname: Anjun
  surname: Ma
  fullname: Ma, Anjun
  email: qin.ma@osumc.edu
  organization: Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
– sequence: 5
  givenname: Jing
  surname: Jiang
  fullname: Jiang, Jing
  organization: Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
– sequence: 6
  givenname: Hongyan
  surname: Zhou
  fullname: Zhou, Hongyan
  organization: College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
– sequence: 7
  givenname: Qin
  surname: Ma
  fullname: Ma, Qin
  email: qin.ma@osumc.edu
  organization: Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
ContentType Journal Article
Copyright The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com 2019
DOI 10.1093/bioinformatics/btz734
Discipline Biology
EISSN 1460-2059
1367-4811
EndPage 1081
ExternalDocumentID 31603468
10_1093_bioinformatics_btz734
10.1093/bioinformatics/btz734
Genre Journal Article
GrantInformation_xml – fundername: Project of Shandong Province Higher Educational Science and Technology Program
  grantid: J17KA159
– fundername: National Nature Science Foundation of China
  grantid: 61863010
– fundername: Key Research and Development Program of Shandong Province of China
  grantid: 2019GGX101001
– fundername: Natural Science Foundation of Shandong Province of China
  grantid: ZR2017MA014; ZR2018MC007
– fundername: National Science Foundation
  grantid: ACI-1548562
  funderid: 10.13039/100000001
– fundername: Scientific Research Fund of Hunan Provincial Key Laboratory of Mathematical Modelling and Analysis in Engineering
  grantid: 2018MMAEZD10
ISSN 1367-4803
1367-4811
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 4
Language English
License This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
OpenAccessLink https://academic.oup.com/bioinformatics/article-pdf/36/4/1074/32527816/btz734.pdf
PMID 31603468
PageCount 8
PublicationCentury 2000
PublicationDate 2020-02-15
PublicationDateYYYYMMDD 2020-02-15
PublicationDate_xml – month: 02
  year: 2020
  text: 2020-02-15
  day: 15
PublicationDecade 2020
PublicationPlace England
PublicationPlace_xml – name: England
PublicationTitle Bioinformatics
PublicationTitleAlternate Bioinformatics
PublicationYear 2020
Publisher Oxford University Press
Publisher_xml – name: Oxford University Press
References Chou (2023020108352089400_btz734-B13) 2005; 21
Chou (2023020108352089400_btz734-B14) 2006; 5
Taherzadeh (2023020108352089400_btz734-B41) 2018; 34
Ding (2023020108352089400_btz734-B15) 2015; 47
Khan (2023020108352089400_btz734-B25) 2017; 435
Du (2023020108352089400_btz734-B16) 2006; 7
Hostettler (2023020108352089400_btz734-B23) 2018; 1
Babajide (2023020108352089400_btz734-B2) 2016; 21
Li (2023020108352089400_btz734-B28) 2017; 67
Wen (2023020108352089400_btz734-B44) 2016; 32
UniProt (2023020108352089400_btz734-B42) 2015; 43
Burbulla (2023020108352089400_btz734-B4) 2017; 357
Xu (2023020108352089400_btz734-B46) 2010; 46
He (2023020108352089400_btz734-B22) 2017; 33
Shen (2023020108352089400_btz734-B36) 2006; 240
Lin (2023020108352089400_btz734-B30) 2014; 123
Jiao (2023020108352089400_btz734-B24) 2017; 416
Bu (2023020108352089400_btz734-B3) 2010; 266
Yu (2023020108352089400_btz734-B47) 2018; 19
Zeng (2023020108352089400_btz734-B49) 2009; 259
Nanni (2023020108352089400_btz734-B33) 2008; 34
Shen (2023020108352089400_btz734-B37) 2008; 373
Wang (2023020108352089400_btz734-B43) 2018
Lin (2023020108352089400_btz734-B31) 2013; 61
Zakeri (2023020108352089400_btz734-B48) 2011; 269
Fariselli (2023020108352089400_btz734-B19) 2005; 6
Kira (2023020108352089400_btz734-B26) 1992
Du (2023020108352089400_btz734-B17) 2013; 2013
Zhao (2023020108352089400_btz734-B50) 2018; 19
Gorman (2023020108352089400_btz734-B21) 1985; 77
Li (2023020108352089400_btz734-B29) 2015; 11
Shi (2023020108352089400_btz734-B39) 2011; 1813
Zou (2023020108352089400_btz734-B51) 2016; 173
Qiu (2023020108352089400_btz734-B34) 2018; 450
Chen (2023020108352089400_btz734-B7) 2018; 9
Mei (2023020108352089400_btz734-B32) 2012; 293
Gerbitz (2023020108352089400_btz734-B20) 1996; 45
Silvério-Machado (2023020108352089400_btz734-B40) 2015; 31
Chen (2023020108352089400_btz734-B6) 2016
Sheridan (2023020108352089400_btz734-B38) 2016; 56
Chen (2023020108352089400_btz734-B8) 2012; 42
Ahmad (2023020108352089400_btz734-B1) 2016; 249
Chawla (2023020108352089400_btz734-B5) 2002; 16
References_xml – volume: 293
  start-page: 121
  year: 2012
  ident: 2023020108352089400_btz734-B32
  article-title: Multi-kernel transfer learning based on Chou's PseAAC formulation for protein submitochondria localization
  publication-title: J. Theor. Biol
  doi: 10.1016/j.jtbi.2011.10.015
– volume: 16
  start-page: 321
  year: 2002
  ident: 2023020108352089400_btz734-B5
  article-title: SMOTE: synthetic minority over-sampling technique
  publication-title: J. Artif. Intell. Res
  doi: 10.1613/jair.953
– volume: 31
  start-page: 1267
  year: 2015
  ident: 2023020108352089400_btz734-B40
  article-title: Retrieval of Enterobacteriaceae drug targets using singular value decomposition
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btu792
– volume: 373
  start-page: 386
  year: 2008
  ident: 2023020108352089400_btz734-B37
  article-title: PseAAC: a flexible web server for generating various kinds of protein pseudo amino acid composition
  publication-title: Anal. Biochem
  doi: 10.1016/j.ab.2007.10.012
– volume: 1813
  start-page: 424
  year: 2011
  ident: 2023020108352089400_btz734-B39
  article-title: Identify submitochondria and subchloroplast locations with pseudo amino acid composition: approach from the strategy of discrete wavelet transform feature extraction
  publication-title: BBA Mol. Cell Res
– volume: 19
  start-page: 478.
  year: 2018
  ident: 2023020108352089400_btz734-B47
  article-title: Prediction of subcellular location of apoptosis proteins by incorporating PsePSSM and DCCA coefficient based on LFDA dimensionality reduction
  publication-title: BMC Genomics
  doi: 10.1186/s12864-018-4849-9
– volume: 46
  start-page: 452
  year: 2010
  ident: 2023020108352089400_btz734-B46
  article-title: Producing computationally efficient KPCA-based feature extraction for classification problems
  publication-title: Electr. Lett
  doi: 10.1049/el.2010.2814
– volume: 6
  start-page: S12
  year: 2005
  ident: 2023020108352089400_btz734-B19
  article-title: A new decoding algorithm for hidden Markov models improves the prediction of the topology of all-beta membrane proteins
  publication-title: BMC Bioinformatics
  doi: 10.1186/1471-2105-6-S4-S12
– volume: 435
  start-page: 116
  year: 2017
  ident: 2023020108352089400_btz734-B25
  article-title: Bi-PSSM: position specific scoring matrix based intelligent Computational model for identification of mycobacterial membrane proteins
  publication-title: J. Theor. Biol
  doi: 10.1016/j.jtbi.2017.09.013
– volume: 357
  start-page: 1255
  year: 2017
  ident: 2023020108352089400_btz734-B4
  article-title: Dopamine oxidation mediates mitochondrial and lysosomal dysfunction in Parkinson's disease
  publication-title: Science
  doi: 10.1126/science.aam9080
– volume: 173
  start-page: 346
  year: 2016
  ident: 2023020108352089400_btz734-B51
  article-title: A novel features ranking metric with application to scalable visual and bioinformatics data classification
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2014.12.123
– volume: 67
  start-page: 410
  year: 2017
  ident: 2023020108352089400_btz734-B28
  article-title: Granular multi-label feature selection based on mutual information
  publication-title: Pattern Recogn
  doi: 10.1016/j.patcog.2017.02.025
– volume: 61
  start-page: 259
  year: 2013
  ident: 2023020108352089400_btz734-B31
  article-title: Using over-represented tetrapeptides to predict protein submitochondria locations
  publication-title: Acta Biotheor
  doi: 10.1007/s10441-013-9181-9
– year: 2018
  ident: 2023020108352089400_btz734-B43
  article-title: Protein-protein interaction sites prediction by ensemble random forests with synthetic minority oversampling technique
  publication-title: Bioinformatics
– volume: 43
  start-page: 204
  year: 2015
  ident: 2023020108352089400_btz734-B42
  article-title: UniProt: a hub for protein information
  publication-title: Nucleic Acids Res
  doi: 10.1093/nar/gku989
– volume: 450
  start-page: 86
  year: 2018
  ident: 2023020108352089400_btz734-B34
  article-title: Predicting protein submitochondrial locations by incorporating the pseudo-position specific scoring matrix into the general Chou's pseudo-amino acid composition
  publication-title: J. Theor. Biol
  doi: 10.1016/j.jtbi.2018.04.026
– volume: 42
  start-page: 1309
  year: 2012
  ident: 2023020108352089400_btz734-B8
  article-title: Using increment of diversity to predict mitochondrial proteins of malaria parasite: integrating pseudo-amino acid composition and structural alphabet
  publication-title: Amino Acids
  doi: 10.1007/s00726-010-0825-7
– volume: 5
  start-page: 1888
  year: 2006
  ident: 2023020108352089400_btz734-B14
  article-title: Predicting eukaryotic protein subcellular location by fusing optimized evidence-theoretic K-Nearest Neighbor classifiers
  publication-title: J. Proteome Res
  doi: 10.1021/pr060167c
– volume: 34
  start-page: 477
  year: 2018
  ident: 2023020108352089400_btz734-B41
  article-title: Structure-based prediction of protein-peptide binding regions using Random Forest
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btx614
– volume: 269
  start-page: 208
  year: 2011
  ident: 2023020108352089400_btz734-B48
  article-title: Prediction of protein submitochondria locations based on data fusion of various features of sequences
  publication-title: J. Theor. Biol
  doi: 10.1016/j.jtbi.2010.10.026
– volume: 9
  year: 2018
  ident: 2023020108352089400_btz734-B7
  article-title: EGBMMDA: extreme gradient boosting machine for miRNA-disease association prediction
  publication-title: Cell Death Dis
– volume: 19
  start-page: 574
  year: 2018
  ident: 2023020108352089400_btz734-B50
  article-title: Imbalance learning for the prediction of N6-Methylation sites in mRNAs
  publication-title: BMC Genomics
  doi: 10.1186/s12864-018-4928-y
– volume: 416
  start-page: 81
  year: 2017
  ident: 2023020108352089400_btz734-B24
  article-title: Predicting protein submitochondrial locations by incorporating the positional-specific physicochemical properties into Chou's general pseudo-amino acid compositions
  publication-title: J. Theor. Biol
  doi: 10.1016/j.jtbi.2016.12.026
– volume: 240
  start-page: 9
  year: 2006
  ident: 2023020108352089400_btz734-B36
  article-title: Fuzzy KNN for predicting membrane protein types from pseudo-amino acid composition
  publication-title: J. Theor. Biol
  doi: 10.1016/j.jtbi.2005.08.016
– volume: 259
  start-page: 366
  year: 2009
  ident: 2023020108352089400_btz734-B49
  article-title: Using the augmented Chou's pseudo amino acid composition for predicting protein submitochondria locations based on auto covariance approach
  publication-title: J. Theor. Biol
  doi: 10.1016/j.jtbi.2009.03.028
– start-page: 129
  year: 1992
  ident: 2023020108352089400_btz734-B26
– volume: 2013
  start-page: 1
  year: 2013
  ident: 2023020108352089400_btz734-B17
  article-title: SubMito-PSPCP: predicting protein submitochondrial locations by hybridizing positional specific physicochemical properties with pseudoamino acid compositions
  publication-title: Biomed Res. Int
– volume: 56
  start-page: 2353
  year: 2016
  ident: 2023020108352089400_btz734-B38
  article-title: Extreme gradient boosting as a method for quantitative structure-activity relationships
  publication-title: J. Chem. Inf. Model
  doi: 10.1021/acs.jcim.6b00591
– volume: 77
  start-page: 1178
  year: 1985
  ident: 2023020108352089400_btz734-B21
  article-title: The use of multidimensional perceptual models in the selection of sonar echo features
  publication-title: J. Acoust. Soc. Am
  doi: 10.1121/1.392182
– volume: 266
  start-page: 1043
  year: 2010
  ident: 2023020108352089400_btz734-B3
  article-title: Prediction of protein (domain) structural classes based on amino-acid index
  publication-title: FEBS J
– volume: 7
  start-page: 518
  year: 2006
  ident: 2023020108352089400_btz734-B16
  article-title: Prediction of protein submitochondria locations by hybridizing pseudo-amino acid composition with various physicochemical features of segmented sequence
  publication-title: BMC Bioinformatics
  doi: 10.1186/1471-2105-7-518
– volume: 34
  start-page: 653
  year: 2008
  ident: 2023020108352089400_btz734-B33
  article-title: Genetic programming for creating Chou’s pseudo amino acid based features for submitochondria localization
  publication-title: Amino Acids
  doi: 10.1007/s00726-007-0018-1
– volume: 32
  start-page: 3107
  year: 2016
  ident: 2023020108352089400_btz734-B44
  article-title: Accurate in silico prediction of species-specific methylation sites based on information gain feature optimization
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btw377
– volume: 45
  start-page: 113
  year: 1996
  ident: 2023020108352089400_btz734-B20
  article-title: Mitochondria and diabetes. Genetic, biochemical, and clinical implications of the cellular energy circuit
  publication-title: Diabetes
  doi: 10.2337/diab.45.2.113
– volume: 33
  start-page: 2296
  year: 2017
  ident: 2023020108352089400_btz734-B22
  article-title: NeBcon: protein contact map prediction using neural network training coupled with naïve Bayes classifiers
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btx164
– volume: 1
  start-page: 1
  year: 2018
  ident: 2023020108352089400_btz734-B23
  article-title: Decision tree analysis in subarachnoid hemorrhage: prediction of outcome parameters during the course of aneurysmal subarachnoid hemorrhage using decision tree analysis
  publication-title: J. Neurosurg
– volume: 123
  start-page: 424
  year: 2014
  ident: 2023020108352089400_btz734-B30
  article-title: LibD3C: ensemble classifiers with a clustering and dynamic selection strategy
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2013.08.004
– volume: 21
  start-page: 983
  year: 2016
  ident: 2023020108352089400_btz734-B2
  article-title: Bioactive molecule prediction using extreme gradient boosting
  publication-title: Molecules
  doi: 10.3390/molecules21080983
– volume: 21
  start-page: 10
  year: 2005
  ident: 2023020108352089400_btz734-B13
  article-title: Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/bth466
– volume: 47
  start-page: 329
  year: 2015
  ident: 2023020108352089400_btz734-B15
  article-title: Identification of mitochondrial proteins of malaria parasite using analysis of variance
  publication-title: Amino Acids
  doi: 10.1007/s00726-014-1862-4
– volume: 249
  start-page: 1
  year: 2016
  ident: 2023020108352089400_btz734-B1
  article-title: Prediction of protein submitochondrial locations by incorporating dipeptide composition into Chou’s general pseudo amino acid composition
  publication-title: J. Membr. Biol
  doi: 10.1007/s00232-015-9868-8
– year: 2016
  ident: 2023020108352089400_btz734-B6
– volume: 11
  start-page: 170
  year: 2015
  ident: 2023020108352089400_btz734-B29
  article-title: Protein submitochondrial localization from integrated sequence representation and SVM-based backward feature extraction
  publication-title: Mol. Biosyst
  doi: 10.1039/C4MB00340C
StartPage 1074
Title SubMito-XGBoost: predicting protein submitochondrial localization by fusing multiple feature information and eXtreme gradient boosting
URI https://www.ncbi.nlm.nih.gov/pubmed/31603468
https://www.proquest.com/docview/2305029724
Volume 36