Mining world knowledge for analysis of search engine content

Bibliographic Details
Published in: Web intelligence and agent systems, Vol. 5, no. 3, pp. 233-253
Main Authors: King, John D.; Li, Yuefeng; Tao, Xiaohui; Nayak, Richi
Format: Journal Article
Language: English
Published: London, England: SAGE Publications, 01.08.2007
ISSN: 1570-1263
EISSN: 1875-9289
DOI: 10.3233/WEB-2007-wia00115

Abstract: Little is known about the content of the major search engines. We present an automatic learning method which trains an ontology with world knowledge of hundreds of different subjects in a three-level taxonomy covering the documents offered in our university library. We then mine this ontology to find important classification rules, and then use these rules to perform an extensive analysis of the content of the largest general purpose internet search engines in use today. Instead of representing documents and collections as a set of terms, we represent them as a set of subjects, which is a highly efficient representation, leading to a more robust representation of information and a decrease of synonymy.
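
Illustration (not part of the record): the abstract contrasts representing a document as a set of terms with representing it as a set of subjects from a three-level taxonomy. The following is a minimal sketch of that contrast only, not the authors' learning method. The lookup table `TERM_TO_SUBJECT`, the subject names in it, and both function names are hypothetical and were chosen purely for this example.

```python
from collections import Counter

# Hypothetical lookup from a term to a three-level subject path
# (top level, middle level, leaf). In the paper this knowledge is
# learned automatically; here it is hard-coded for illustration.
TERM_TO_SUBJECT = {
    "car": ("Technology", "Transport", "Automobiles"),
    "automobile": ("Technology", "Transport", "Automobiles"),
    "sedan": ("Technology", "Transport", "Automobiles"),
    "violin": ("Arts", "Music", "String instruments"),
    "cello": ("Arts", "Music", "String instruments"),
}


def term_representation(text):
    """Represent a document as a bag of its lowercase terms."""
    return Counter(text.lower().split())


def subject_representation(text, level=3):
    """Represent a document as a bag of subject paths, cut at `level`.

    Terms missing from the lookup table are ignored.
    """
    counts = Counter()
    for term in text.lower().split():
        path = TERM_TO_SUBJECT.get(term)
        if path is not None:
            counts[path[:level]] += 1
    return counts


doc_a = "car sedan violin"
doc_b = "automobile cello"

# The two documents share no terms, yet they cover the same subjects.
shared_terms = set(term_representation(doc_a)) & set(term_representation(doc_b))
shared_subjects = set(subject_representation(doc_a)) & set(subject_representation(doc_b))
```

In this toy case the synonyms "car" and "automobile" collapse into one subject, which is the kind of synonymy reduction the abstract refers to; passing a smaller `level` compares documents at a coarser layer of the taxonomy.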
Author affiliation (King, John D.; Li, Yuefeng; Tao, Xiaohui; Nayak, Richi): School of Software Engineering and Data Communications, Queensland University of Technology, QLD 4001, Australia
Copyright IOS Press. All rights reserved
Keywords: search engines; hierarchal classification; taxonomy; collection selection; ontology; data mining
URI https://journals.sagepub.com/doi/full/10.3233/WEB-2007-wia00115