SubMito-XGBoost: predicting protein submitochondrial localization by fusing multiple feature information and eXtreme gradient boosting
Abstract Motivation Mitochondria are an essential organelle in most eukaryotes. They not only play an important role in energy metabolism but also take part in many critical cytopathological processes. Abnormal mitochondria can trigger a series of human diseases, such as Parkinson's disease, mu...
Published in | Bioinformatics Vol. 36; no. 4; pp. 1074 - 1081 |
---|---|
Main Authors | Yu, Bin; Qiu, Wenying; Chen, Cheng; Ma, Anjun; Jiang, Jing; Zhou, Hongyan; Ma, Qin |
Format | Journal Article |
Language | English |
Published | England: Oxford University Press, 15.02.2020 |
Abstract | Abstract
Motivation
Mitochondria are essential organelles in most eukaryotes. They play an important role in energy metabolism and also take part in many critical cytopathological processes. Abnormal mitochondria can trigger a series of human diseases, such as Parkinson's disease, multifactor disorder and Type-II diabetes. Predicting protein submitochondrial localization helps clarify protein function when studying disease pathogenesis and designing drugs.
Results
We propose a new method, SubMito-XGBoost, for protein submitochondrial localization prediction. It comprises three steps: (i) the g-gap dipeptide composition (g-gap DC), pseudo-amino acid composition (PseAAC), auto-correlation function (ACF) and Bi-gram position-specific scoring matrix (Bi-gram PSSM) are employed to extract protein sequence features; (ii) the Synthetic Minority Oversampling Technique (SMOTE) is used to balance samples, and the ReliefF algorithm is applied for feature selection; and (iii) the obtained feature vectors are fed into XGBoost to predict protein submitochondrial locations. Under leave-one-out cross-validation (LOOCV), SubMito-XGBoost obtains satisfactory prediction results compared with existing methods. Its prediction accuracies on the two training datasets M317 and M983 were 97.7% and 98.9%, which are 2.8–12.5% and 3.8–9.9% higher than those of other methods, respectively. Its prediction accuracy on the independent test set M495 was 94.8%, which is significantly better than that of existing studies. The proposed method also achieves satisfactory predictive performance on plant and non-plant protein submitochondrial datasets. SubMito-XGBoost may also support new drug design for the treatment of related diseases.
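To illustrate the first feature type named in step (i), the following Python sketch computes a g-gap dipeptide composition vector. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `g_gap_dipeptide_composition`, the alphabetical residue ordering, and the choice to normalize by the number of valid residue pairs are all assumptions made here for illustration.

```python
from itertools import product

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_gap_dipeptide_composition(sequence, g=0):
    """Return 400 frequencies of residue pairs separated by g positions.

    Hypothetical helper for illustration only. Pairs that contain a
    non-standard residue are skipped, and the counts are normalized by
    the number of valid pairs (an assumption of this sketch).
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    # Residue i is paired with residue i + g + 1, i.e. g residues lie between.
    for i in range(len(sequence) - g - 1):
        pair = sequence[i] + sequence[i + g + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short (or no valid pairs): return an all-zero vector.
        return [0.0] * len(pairs)
    return [counts[p] / total for p in pairs]
```

With `g=0` this reduces to the ordinary adjacent dipeptide composition. The other features (PseAAC, ACF, Bi-gram PSSM) and the SMOTE, ReliefF and XGBoost stages are not covered by this sketch.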
Availability and implementation
The source codes and data are publicly available at https://github.com/QUST-AIBBDRC/SubMito-XGBoost/.
Supplementary information
Supplementary data are available at Bioinformatics online. |
---|---|
Author | Jiang, Jing; Qiu, Wenying; Ma, Anjun; Chen, Cheng; Zhou, Hongyan; Yu, Bin; Ma, Qin |
Author_xml | – sequence: 1 givenname: Bin surname: Yu fullname: Yu, Bin email: yubin@qust.edu.cn organization: College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China – sequence: 2 givenname: Wenying surname: Qiu fullname: Qiu, Wenying organization: College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China – sequence: 3 givenname: Cheng surname: Chen fullname: Chen, Cheng organization: College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China – sequence: 4 givenname: Anjun surname: Ma fullname: Ma, Anjun email: qin.ma@osumc.edu organization: Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA – sequence: 5 givenname: Jing surname: Jiang fullname: Jiang, Jing organization: Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA – sequence: 6 givenname: Hongyan surname: Zhou fullname: Zhou, Hongyan organization: College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China – sequence: 7 givenname: Qin surname: Ma fullname: Ma, Qin email: qin.ma@osumc.edu organization: Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/31603468$$D View this record in MEDLINE/PubMed |
CitedBy_id | crossref_primary_10_3390_math8020169 crossref_primary_10_1016_j_cherd_2024_04_009 crossref_primary_10_1016_j_asoc_2021_107945 crossref_primary_10_1093_bioinformatics_btaa914 crossref_primary_10_1007_s12539_022_00521_3 crossref_primary_10_3389_fcell_2021_801113 crossref_primary_10_1093_bib_bbad036 crossref_primary_10_1007_s00521_020_04792_z crossref_primary_10_1093_bib_bbab012 crossref_primary_10_1016_j_bpc_2025_107434 crossref_primary_10_1093_bib_bbac341 crossref_primary_10_1016_j_chemolab_2020_103999 crossref_primary_10_1093_bib_bbaa316 crossref_primary_10_1016_j_cj_2022_01_009 crossref_primary_10_1016_j_compbiomed_2022_106471 crossref_primary_10_1186_s12864_022_08566_w crossref_primary_10_3389_fbinf_2022_910531 crossref_primary_10_1016_j_measurement_2022_112170 crossref_primary_10_1002_iid3_1037 crossref_primary_10_1016_j_oraloncology_2021_105335 crossref_primary_10_1016_j_ecoenv_2024_117611 crossref_primary_10_1093_bib_bbab486 crossref_primary_10_1093_bib_bbaa275 crossref_primary_10_1016_j_neucom_2023_126509 crossref_primary_10_1016_j_bspc_2021_102630 crossref_primary_10_1093_bib_bbaa304 crossref_primary_10_1016_j_ab_2022_114935 crossref_primary_10_1016_j_chemolab_2020_104216 crossref_primary_10_1038_s41598_022_09484_3 crossref_primary_10_1016_j_chemolab_2020_104175 crossref_primary_10_1016_j_bspc_2023_105909 crossref_primary_10_1039_D2RA05102H crossref_primary_10_1016_j_future_2022_07_005 crossref_primary_10_1109_TCBB_2021_3085589 crossref_primary_10_3390_en15207512 crossref_primary_10_3389_fmolb_2022_867572 crossref_primary_10_1155_2020_9235920 crossref_primary_10_1136_jim_2021_002278 crossref_primary_10_1126_sciadv_abl7393 crossref_primary_10_1093_bioinformatics_btac432 crossref_primary_10_1155_2021_7764764 crossref_primary_10_1016_j_compbiomed_2023_106935 crossref_primary_10_1038_s41467_022_31245_z crossref_primary_10_1007_s10989_021_10345_2 crossref_primary_10_1016_j_compbiomed_2021_104676 crossref_primary_10_1016_j_compbiomed_2023_107589 
crossref_primary_10_1038_s41598_024_82208_x crossref_primary_10_1016_j_bbapap_2020_140406 crossref_primary_10_3389_fpubh_2021_793801 crossref_primary_10_1109_ACCESS_2023_3268523 crossref_primary_10_1016_j_knosys_2022_109875 crossref_primary_10_1093_bioinformatics_btac727 crossref_primary_10_1155_2022_6991218 crossref_primary_10_1155_2023_9991095 crossref_primary_10_1099_mgen_0_000483 crossref_primary_10_1016_j_reth_2020_09_001 crossref_primary_10_3389_fbioe_2020_00285 crossref_primary_10_1007_s12652_021_03129_5 crossref_primary_10_1016_j_ab_2020_113903 crossref_primary_10_3390_info16010034 crossref_primary_10_1016_j_gpb_2021_01_001 crossref_primary_10_1093_bib_bbaa202 crossref_primary_10_2174_1389202921666200219125625 crossref_primary_10_1142_S0219720022500056 crossref_primary_10_1038_s41467_025_57974_5 crossref_primary_10_1016_j_chemolab_2022_104495 crossref_primary_10_1016_j_chemolab_2022_104496 crossref_primary_10_1016_j_chemolab_2024_105103 crossref_primary_10_1186_s44342_024_00026_z crossref_primary_10_31083_j_fbl2812322 crossref_primary_10_1109_JSEN_2023_3278719 crossref_primary_10_1093_bioinformatics_btaa155 crossref_primary_10_1016_j_jmgm_2021_107962 crossref_primary_10_1103_PhysRevMaterials_5_035003 crossref_primary_10_1016_j_csbj_2021_10_023 crossref_primary_10_1007_s11704_022_1563_1 crossref_primary_10_3389_fendo_2022_1076664 crossref_primary_10_1155_2022_4694567 crossref_primary_10_1093_bib_bbad184 crossref_primary_10_1111_clr_14222 crossref_primary_10_1093_bioinformatics_btab811 crossref_primary_10_1186_s12911_023_02238_9 crossref_primary_10_1016_j_asoc_2020_106921 crossref_primary_10_1093_bib_bbab167 crossref_primary_10_1093_bib_bbab288 crossref_primary_10_1016_j_jclepro_2022_131418 crossref_primary_10_3390_ijms21165710 crossref_primary_10_1016_j_chemolab_2021_104428 crossref_primary_10_1089_cmb_2022_0109 crossref_primary_10_1007_s12145_024_01575_1 crossref_primary_10_1186_s12864_021_07941_3 crossref_primary_10_1002_iid3_1221 
crossref_primary_10_1007_s12539_021_00488_7 crossref_primary_10_1021_acsomega_0c03972 crossref_primary_10_1093_bib_bbac160 crossref_primary_10_1016_j_heliyon_2023_e21149 crossref_primary_10_1016_j_omtn_2023_04_030 crossref_primary_10_3389_fmicb_2020_580382 crossref_primary_10_3390_e23060656 crossref_primary_10_1016_j_measurement_2021_110638 crossref_primary_10_1049_cje_2021_06_003 crossref_primary_10_1186_s12859_023_05475_x crossref_primary_10_1016_j_compbiomed_2021_104516 crossref_primary_10_1109_ACCESS_2020_3025553 crossref_primary_10_1007_s11042_022_12606_8 crossref_primary_10_1016_j_knosys_2022_108191 crossref_primary_10_1016_j_chemolab_2020_104148 crossref_primary_10_3389_fgene_2024_1498884 crossref_primary_10_3390_life10120347 crossref_primary_10_1016_j_compbiomed_2023_107145 crossref_primary_10_1007_s00034_021_01889_1 crossref_primary_10_2166_ws_2024_144 crossref_primary_10_3390_en16114489 crossref_primary_10_1016_j_bbe_2021_04_015 crossref_primary_10_1109_ACCESS_2020_2989749 crossref_primary_10_1109_TEM_2024_3422821 crossref_primary_10_1016_j_heliyon_2023_e18263 crossref_primary_10_1038_s41598_022_16571_y crossref_primary_10_1016_j_compbiomed_2022_105911 crossref_primary_10_1016_j_compbiomed_2020_103899 crossref_primary_10_1088_1742_6596_1827_1_012075 crossref_primary_10_1016_j_bspc_2022_103566 crossref_primary_10_1186_s12902_022_01121_4 crossref_primary_10_1186_s12883_020_01989_6 |
Cites_doi | 10.1016/j.jtbi.2011.10.015 10.1613/jair.953 10.1093/bioinformatics/btu792 10.1016/j.ab.2007.10.012 10.1186/s12864-018-4849-9 10.1049/el.2010.2814 10.1186/1471-2105-6-S4-S12 10.1016/j.jtbi.2017.09.013 10.1126/science.aam9080 10.1016/j.neucom.2014.12.123 10.1016/j.patcog.2017.02.025 10.1007/s10441-013-9181-9 10.1093/nar/gku989 10.1016/j.jtbi.2018.04.026 10.1007/s00726-010-0825-7 10.1021/pr060167c 10.1093/bioinformatics/btx614 10.1016/j.jtbi.2010.10.026 10.1186/s12864-018-4928-y 10.1016/j.jtbi.2016.12.026 10.1016/j.jtbi.2005.08.016 10.1016/j.jtbi.2009.03.028 10.1021/acs.jcim.6b00591 10.1121/1.392182 10.1186/1471-2105-7-518 10.1007/s00726-007-0018-1 10.1093/bioinformatics/btw377 10.2337/diab.45.2.113 10.1093/bioinformatics/btx164 10.1016/j.neucom.2013.08.004 10.3390/molecules21080983 10.1093/bioinformatics/bth466 10.1007/s00726-014-1862-4 10.1007/s00232-015-9868-8 10.1039/C4MB00340C |
ContentType | Journal Article |
Copyright | The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. |
Copyright_xml | – notice: The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com 2019 – notice: The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. |
DBID | AAYXX CITATION NPM 7X8 |
DOI | 10.1093/bioinformatics/btz734 |
DatabaseName | CrossRef PubMed MEDLINE - Academic |
DatabaseTitle | CrossRef PubMed MEDLINE - Academic |
DatabaseTitleList | MEDLINE - Academic PubMed |
Database_xml | – sequence: 1 dbid: NPM name: PubMed url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Biology |
EISSN | 1460-2059 1367-4811 |
EndPage | 1081 |
ExternalDocumentID | 31603468 10_1093_bioinformatics_btz734 10.1093/bioinformatics/btz734 |
Genre | Journal Article |
GrantInformation_xml | – fundername: Project of Shandong Province Higher Educational Science and Technology Program grantid: J17KA159 – fundername: National Nature Science Foundation of China grantid: 61863010 – fundername: Key Research and Development Program of Shandong Province of China grantid: 2019GGX101001 – fundername: Natural Science Foundation of Shandong Province of China grantid: ZR2017MA014; ZR2018MC007 – fundername: National Science Foundation grantid: ACI-1548562 funderid: 10.13039/100000001 – fundername: Scientific Research Fund of Hunan Provincial Key Laboratory of Mathematical Modelling and Analysis in Engineering grantid: 2018MMAEZD10 |
IEDL.DBID | TOX |
ISSN | 1367-4803 1367-4811 |
IngestDate | Fri Jul 11 15:56:12 EDT 2025 Wed Feb 19 02:31:29 EST 2025 Thu Apr 24 22:58:37 EDT 2025 Tue Jul 01 02:33:50 EDT 2025 Wed Aug 28 03:17:43 EDT 2024 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 4 |
Language | English |
License | This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model) https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. |
LinkModel | DirectLink |
Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
OpenAccessLink | https://academic.oup.com/bioinformatics/article-pdf/36/4/1074/32527816/btz734.pdf |
PMID | 31603468 |
PQID | 2305029724 |
PQPubID | 23479 |
PageCount | 8 |
ParticipantIDs | proquest_miscellaneous_2305029724 pubmed_primary_31603468 crossref_citationtrail_10_1093_bioinformatics_btz734 crossref_primary_10_1093_bioinformatics_btz734 oup_primary_10_1093_bioinformatics_btz734 |
ProviderPackageCode | CITATION AAYXX |
PublicationCentury | 2000 |
PublicationDate | 2020-02-15 |
PublicationDateYYYYMMDD | 2020-02-15 |
PublicationDate_xml | – month: 02 year: 2020 text: 2020-02-15 day: 15 |
PublicationDecade | 2020 |
PublicationPlace | England |
PublicationPlace_xml | – name: England |
PublicationTitle | Bioinformatics |
PublicationTitleAlternate | Bioinformatics |
PublicationYear | 2020 |
Publisher | Oxford University Press |
Publisher_xml | – name: Oxford University Press |
References | Chou (2023020108352089400_btz734-B13) 2005; 21 Chou (2023020108352089400_btz734-B14) 2006; 5 Taherzadeh (2023020108352089400_btz734-B41) 2018; 34 Ding (2023020108352089400_btz734-B15) 2015; 47 Khan (2023020108352089400_btz734-B25) 2017; 435 Du (2023020108352089400_btz734-B16) 2006; 7 Hostettler (2023020108352089400_btz734-B23) 2018; 1 Babajide (2023020108352089400_btz734-B2) 2016; 21 Li (2023020108352089400_btz734-B28) 2017; 67 Wen (2023020108352089400_btz734-B44) 2016; 32 UniProt (2023020108352089400_btz734-B42) 2015; 43 Burbulla (2023020108352089400_btz734-B4) 2017; 357 Xu (2023020108352089400_btz734-B46) 2010; 46 He (2023020108352089400_btz734-B22) 2017; 33 Shen (2023020108352089400_btz734-B36) 2006; 240 Lin (2023020108352089400_btz734-B30) 2014; 123 Jiao (2023020108352089400_btz734-B24) 2017; 416 Bu (2023020108352089400_btz734-B3) 2010; 266 Yu (2023020108352089400_btz734-B47) 2018; 19 Zeng (2023020108352089400_btz734-B49) 2009; 259 Nanni (2023020108352089400_btz734-B33) 2008; 34 Shen (2023020108352089400_btz734-B37) 2008; 373 Wang (2023020108352089400_btz734-B43) 2018 Lin (2023020108352089400_btz734-B31) 2013; 61 Zakeri (2023020108352089400_btz734-B48) 2011; 269 Fariselli (2023020108352089400_btz734-B19) 2005; 6 Kira (2023020108352089400_btz734-B26) 1992 Du (2023020108352089400_btz734-B17) 2013; 2013 Zhao (2023020108352089400_btz734-B50) 2018; 19 Gorman (2023020108352089400_btz734-B21) 1985; 77 Li (2023020108352089400_btz734-B29) 2015; 11 Shi (2023020108352089400_btz734-B39) 2011; 1813 Zou (2023020108352089400_btz734-B51) 2016; 173 Qiu (2023020108352089400_btz734-B34) 2018; 450 Chen (2023020108352089400_btz734-B7) 2018; 9 Mei (2023020108352089400_btz734-B32) 2012; 293 Gerbitz (2023020108352089400_btz734-B20) 1996; 45 Silvério-Machado (2023020108352089400_btz734-B40) 2015; 31 Chen (2023020108352089400_btz734-B6) 2016 Sheridan (2023020108352089400_btz734-B38) 2016; 56 Chen (2023020108352089400_btz734-B8) 2012; 42 Ahmad (2023020108352089400_btz734-B1) 
2016; 249 Chawla (2023020108352089400_btz734-B5) 2002; 16 |
References_xml | – volume: 293 start-page: 121 year: 2012 ident: 2023020108352089400_btz734-B32 article-title: Multi-kernel transfer learning based on Chou's PseAAC formulation for protein submitochondria localization publication-title: J. Theor. Biol doi: 10.1016/j.jtbi.2011.10.015 – volume: 16 start-page: 321 year: 2002 ident: 2023020108352089400_btz734-B5 article-title: SMOTE: synthetic minority over-sampling technique publication-title: J. Artif. Intell. Res doi: 10.1613/jair.953 – volume: 31 start-page: 1267 year: 2015 ident: 2023020108352089400_btz734-B40 article-title: Retrieval of Enterobacteriaceae drug targets using singular value decomposition publication-title: Bioinformatics doi: 10.1093/bioinformatics/btu792 – volume: 373 start-page: 386 year: 2008 ident: 2023020108352089400_btz734-B37 article-title: PseAAC: a flexible web server for generating various kinds of protein pseudo amino acid composition publication-title: Anal. Biochem doi: 10.1016/j.ab.2007.10.012 – volume: 1813 start-page: 424 year: 2011 ident: 2023020108352089400_btz734-B39 article-title: Identify submitochondria and subchloroplast locations with pseudo amino acid composition: approach from the strategy of discrete wavelet transform feature extraction publication-title: BBA Mol. Cell Res – volume: 19 start-page: 478. year: 2018 ident: 2023020108352089400_btz734-B47 article-title: Prediction of subcellular location of apoptosis proteins by incorporating PsePSSM and DCCA coefficient based on LFDA dimensionality reduction publication-title: BMC Genomics doi: 10.1186/s12864-018-4849-9 – volume: 46 start-page: 452 year: 2010 ident: 2023020108352089400_btz734-B46 article-title: Producing computationally efficient KPCA-based feature extraction for classification problems publication-title: Electr. 
Lett doi: 10.1049/el.2010.2814 – volume: 6 start-page: S12 year: 2005 ident: 2023020108352089400_btz734-B19 article-title: A new decoding algorithm for hidden Markov models improves the prediction of the topology of all-beta membrane proteins publication-title: BMC Bioinformatics doi: 10.1186/1471-2105-6-S4-S12 – volume: 435 start-page: 116 year: 2017 ident: 2023020108352089400_btz734-B25 article-title: Bi-PSSM: position specific scoring matrix based intelligent Computational model for identification of mycobacterial membrane proteins publication-title: J. Theor. Biol doi: 10.1016/j.jtbi.2017.09.013 – volume: 357 start-page: 1255 year: 2017 ident: 2023020108352089400_btz734-B4 article-title: Dopamine oxidation mediates mitochondrial and lysosomal dysfunction in Parkinson's disease publication-title: Science doi: 10.1126/science.aam9080 – volume: 173 start-page: 346 year: 2016 ident: 2023020108352089400_btz734-B51 article-title: A novel features ranking metric with application to scalable visual and bioinformatics data classification publication-title: Neurocomputing doi: 10.1016/j.neucom.2014.12.123 – volume: 67 start-page: 410 year: 2017 ident: 2023020108352089400_btz734-B28 article-title: Granular multi-label feature selection based on mutual information publication-title: Pattern Recogn doi: 10.1016/j.patcog.2017.02.025 – volume: 61 start-page: 259 year: 2013 ident: 2023020108352089400_btz734-B31 article-title: Using over-represented tetrapeptides to predict protein submitochondria locations publication-title: Acta Biotheor doi: 10.1007/s10441-013-9181-9 – year: 2018 ident: 2023020108352089400_btz734-B43 article-title: Protein-protein interaction sites prediction by ensemble random forests with synthetic minority oversampling technique publication-title: Bioinformatics – volume: 43 start-page: 204 year: 2015 ident: 2023020108352089400_btz734-B42 article-title: UniProt: a hub for protein information publication-title: Nucleic Acids Res doi: 10.1093/nar/gku989 – 
|
StartPage | 1074 |
Title | SubMito-XGBoost: predicting protein submitochondrial localization by fusing multiple feature information and eXtreme gradient boosting |
URI | https://www.ncbi.nlm.nih.gov/pubmed/31603468 https://www.proquest.com/docview/2305029724 |
Volume | 36 |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=SubMito-XGBoost%3A+predicting+protein+submitochondrial+localization+by+fusing+multiple+feature+information+and+eXtreme+gradient+boosting&rft.jtitle=Bioinformatics+%28Oxford%2C+England%29&rft.au=Yu%2C+Bin&rft.au=Qiu%2C+Wenying&rft.au=Chen%2C+Cheng&rft.au=Ma%2C+Anjun&rft.date=2020-02-15&rft.eissn=1367-4811&rft.volume=36&rft.issue=4&rft.spage=1074&rft_id=info:doi/10.1093%2Fbioinformatics%2Fbtz734&rft_id=info%3Apmid%2F31603468&rft.externalDocID=31603468 |