Phylogeny inference based on spectral graph clustering

Bibliographic Details
Published in Journal of computational biology Vol. 18; no. 4; p. 627
Main Authors Zhang, Shu-Bo, Zhou, Song-Yu, He, Jian-Guo, Lai, Jian-Huang
Format Journal Article
Language English
Published United States 01.04.2011
Abstract Phylogeny inference is an important issue in computational biology. Some early character-based approaches, such as the maximum parsimony and maximum likelihood algorithms, become intractable when the number of taxonomic units is large. Recent distance-based algorithms that adopt an agglomerative scheme are widely used for phylogeny inference. However, they must recursively merge the nearest pair of taxa and re-estimate a distance matrix; this may gradually amplify errors and lead to an inaccurate tree topology. In this study, a splitting algorithm for phylogeny inference is proposed that uses the spectral graph clustering (SGC) technique. The SGC algorithm splits graphs by using the maximum cut criterion and circumvents optimization problems by solving a generalized eigenvalue system. The promising features of the proposed algorithm are the following: (i) it uses a heuristic strategy for constructing phylogenies from certain distance functions, even ones that are not additive; (ii) distance matrices do not have to be estimated recursively; (iii) it infers a more accurate tree topology than the Neighbor-joining (NJ) algorithm on simulated datasets; and (iv) it strongly supports hypotheses induced by other methods for Baculovirus genomes. Our numerical experiments confirm that the SGC algorithm is efficient for phylogeny inference.
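The top-down splitting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no algorithmic details, so the sketch swaps in the familiar normalized-cut relaxation from spectral clustering, where the generalized eigenproblem (D - W) y = λ D y is solved for its second-smallest eigenvector and taxa are split by the sign of y. The Gaussian similarity kernel, the `sigma` parameter, the power-iteration solver, and the names `spectral_split` and `build_tree` are all assumptions made for illustration.

```python
import math
import random


def spectral_split(dist, taxa, sigma=1.0, iters=500):
    """Bipartition `taxa` (indices into `dist`); expects at least 3 taxa."""
    n = len(taxa)
    # Assumed similarity: Gaussian kernel on pairwise distances.
    w = [[0.0 if i == j else math.exp(-(dist[a][b] / sigma) ** 2)
          for j, b in enumerate(taxa)] for i, a in enumerate(taxa)]
    deg = [sum(row) for row in w]
    s = [math.sqrt(d) for d in deg]
    # M = D^{-1/2} W D^{-1/2}; its second-largest eigenvector v gives the
    # generalized eigenvector y = D^{-1/2} v of (D - W) y = lambda D y.
    m = [[w[i][j] / (s[i] * s[j]) for j in range(n)] for i in range(n)]
    total = math.sqrt(sum(deg))
    u0 = [x / total for x in s]  # trivial eigenvector of M (eigenvalue 1)
    rng = random.Random(0)       # fixed seed keeps the sketch deterministic
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        # Power step on M + I (the shift makes all eigenvalues non-negative).
        v = [v[i] + sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        # Deflate: remove the component along the trivial eigenvector.
        dot = sum(a * b for a, b in zip(v, u0))
        v = [a - dot * b for a, b in zip(v, u0)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    y = [v[i] / s[i] for i in range(n)]
    left = [t for t, val in zip(taxa, y) if val < 0]
    right = [t for t, val in zip(taxa, y) if val >= 0]
    return left, right


def build_tree(dist, names, taxa=None, sigma=1.0):
    """Recursively split the taxa; returns a nested tuple of taxon names."""
    if taxa is None:
        taxa = list(range(len(names)))
    if len(taxa) == 1:
        return names[taxa[0]]
    if len(taxa) == 2:
        return (names[taxa[0]], names[taxa[1]])
    left, right = spectral_split(dist, taxa, sigma)
    if not left or not right:  # safety net so the recursion always terminates
        left, right = taxa[:1], taxa[1:]
    return (build_tree(dist, names, left, sigma),
            build_tree(dist, names, right, sigma))


# Hypothetical toy distances: A and B are close, C and D are close.
names = ["A", "B", "C", "D"]
dist = [[0.0, 0.2, 1.0, 1.1],
        [0.2, 0.0, 0.9, 1.0],
        [1.0, 0.9, 0.0, 0.2],
        [1.1, 1.0, 0.2, 0.0]]
tree = build_tree(dist, names)
```

Because the tree is built by splitting rather than merging, the original distance matrix is reused at every level and never re-estimated, which is the property the abstract highlights in point (ii). A practical version would likely replace the hand-written power iteration with an eigen solver from a linear algebra library.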
Author affiliation Zhang, Shu-Bo: Department of Computer Science, Maritime College, Guangzhou, PR China
Copyright Mary Ann Liebert, Inc.
DOI 10.1089/cmb.2009.0028
Discipline Biology
Mathematics
EISSN 1557-8666
Genre Research Support, Non-U.S. Gov't
Journal Article
PMID 21352066
SubjectTerms Algorithms
Animals
Cluster Analysis
Computational Biology - methods
Humans
Models, Genetic
Phylogeny
URI https://www.ncbi.nlm.nih.gov/pubmed/21352066