An Ensemble Learning Framework for Online Web Spam Detection

Bibliographic Details
Published in 2013 12th International Conference on Machine Learning and Applications, Vol. 1, pp. 40-45
Main Authors Cailing Dong, Bin Zhou
Format Conference Proceeding
Language English
Published IEEE 01.12.2013
Abstract Most existing studies of web spam detection explicitly or implicitly assume that detection is performed offline on the search engine side. However, we argue that online web spam detection is also useful in some specific scenarios. We propose to implement a web browser plug-in to support online web spam detection. Three different sets of spam labeling data are collected and adopted for learning a reliable web spam classifier. An empirical study is conducted on the benchmark web spam data collection. The statistical analysis of the data set verifies the necessity of online web spam detection. The performance of the proposed ensemble learning framework for online web spam detection is also examined, and it meets the requirements of online web spam detection.
Author_xml – sequence: 1
  surname: Dong
  fullname: Cailing Dong
  email: cailing.dong@umbc.edu
  organization: Dept. of Inf. Syst., Univ. of Maryland, Baltimore County, Baltimore, MD, USA
– sequence: 2
  surname: Zhou
  fullname: Bin Zhou
  email: bzhou@umbc.edu
  organization: Dept. of Inf. Syst., Univ. of Maryland, Baltimore County, Baltimore, MD, USA
CODEN IEEPAD
DOI 10.1109/ICMLA.2013.15
EISBN 9780769551449
0769551440
SubjectTerms Browsers
Detectors
ensemble learning
Labeling
online web spam detection
personalization
Search engines
Servers
Unsolicited electronic mail
Web pages
URI https://ieeexplore.ieee.org/document/6784585