How to Measure the Consistency of the Tagging of Scientific Papers?
A collection of scientific papers is usually accompanied by tags (keywords, topics, concepts etc.), associated with each paper. Sometimes these tags are human-generated, sometimes they are machine-generated. The evaluation of the tagging quality is an important problem. We propose a simple metrics o...
Published in | 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL) pp. 372 - 373 |
---|---|
Main Author | Veytsman, Boris |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.06.2019 |
Subjects | tagging; tagging evaluation; topic modeling |
DOI | 10.1109/JCDL.2019.00076 |
Abstract | A collection of scientific papers is usually accompanied by tags (keywords, topics, concepts, etc.) associated with each paper. Some of these tags are human-generated; others are machine-generated. Evaluating tagging quality is an important problem. We propose a simple metric of tagging consistency for scientific papers: whether the tags are predictive of citations. Since authors tend to cite papers on topics close to those of their own publications, a consistent tagging should be able to predict citations. We present an algorithm to calculate consistency and show experiments with human- and machine-generated tags. We show that adding machine-generated tags to the manual ones can enhance tagging consistency. We further introduce a cross-consistency metric: the ability to predict citation links between papers tagged by different taggers, e.g. humans and computers. The cross-consistency metric can be used to evaluate the tagging quality of a tagger when the amount of data labeled by a known good tagger is limited. |
---|---|
Author | Veytsman, Boris |
Author_xml | – sequence: 1 givenname: Boris surname: Veytsman fullname: Veytsman, Boris organization: Chan Zuckerberg Initiative |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/JCDL.2019.00076 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
EISBN | 1728115477 9781728115474 |
EndPage | 373 |
ExternalDocumentID | 8791168 |
Genre | orig-research |
GroupedDBID | 6IE 6IL ACM ALMA_UNASSIGNED_HOLDINGS APO CBEJK GUFHI LHSKQ RIE RIL |
ID | FETCH-LOGICAL-a247t-acf2034dd7003f36cbe631923334c093777ff3fad046b862bbaa0c9c7b92ccf63 |
IEDL.DBID | RIE |
IngestDate | Wed Aug 27 02:54:29 EDT 2025 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-a247t-acf2034dd7003f36cbe631923334c093777ff3fad046b862bbaa0c9c7b92ccf63 |
PageCount | 2 |
ParticipantIDs | ieee_primary_8791168 |
PublicationCentury | 2000 |
PublicationDate | 2019-Jun |
PublicationDateYYYYMMDD | 2019-06-01 |
PublicationDate_xml | – month: 06 year: 2019 text: 2019-Jun |
PublicationDecade | 2010 |
PublicationTitle | 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL) |
PublicationTitleAbbrev | JCDL |
PublicationYear | 2019 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SSID | ssib040743256 |
Score | 1.7047151 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 372 |
SubjectTerms | tagging; tagging evaluation; topic modeling |
Title | How to Measure the Consistency of the Tagging of Scientific Papers? |
URI | https://ieeexplore.ieee.org/document/8791168 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
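
The abstract defines tagging consistency as the ability of tags to predict citations, but this record does not reproduce the paper's algorithm. The following is only a minimal sketch of the general idea under stated assumptions: tag similarity between two papers is measured with the Jaccard index, citations are treated as undirected links, and predictive power is summarized as an AUC (the probability that a cited pair has higher tag similarity than a non-cited pair). The function names `jaccard` and `consistency_auc`, the choice of Jaccard and AUC, and the toy data are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two tag sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def consistency_auc(tags, citations):
    """AUC of tag similarity as a predictor of citation links.

    tags:      dict mapping paper id -> set of tags
    citations: set of frozenset({p, q}) pairs, treated as undirected links
    Returns the probability that a random linked pair scores higher than a
    random unlinked pair (ties count as 0.5), or None if either group is empty.
    This scoring scheme is an illustrative assumption, not the paper's method.
    """
    linked, unlinked = [], []
    for p, q in combinations(sorted(tags), 2):
        score = jaccard(tags[p], tags[q])
        if frozenset((p, q)) in citations:
            linked.append(score)
        else:
            unlinked.append(score)
    if not linked or not unlinked:
        return None
    # Compare every linked pair against every unlinked pair.
    wins = sum(
        1.0 if s > t else 0.5 if s == t else 0.0
        for s in linked
        for t in unlinked
    )
    return wins / (len(linked) * len(unlinked))


# Toy data, invented purely for illustration.
tags = {
    "A": {"tagging", "citations"},
    "B": {"tagging", "evaluation"},
    "C": {"polymers"},
    "D": {"polymers", "rheology"},
}
citations = {frozenset(("A", "B")), frozenset(("C", "D"))}

auc = consistency_auc(tags, citations)
print(auc)
```

In this toy collection the two cited pairs share tags and every non-cited pair shares none, so the score is at its maximum; a value near 0.5 would suggest that the tags carry little information about citations. Comparing the score for one tag set against the score for an enlarged tag set (for example, manual tags plus machine-generated ones) would be one way to explore the kind of comparison the abstract mentions.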