Enriching Scientific Publications from LOD Repositories Through Word Embeddings Approach

Bibliographic Details
Published in	Metadata and Semantics Research Vol. 672; pp. 278 - 290
Main Authors	Hajra, Arben, Tochtermann, Klaus
Format	Book Chapter
Language	English
Published	Switzerland: Springer International Publishing AG, 2016
Series	Communications in Computer and Information Science
Subjects	Data mining; Digital Libraries; Information retrieval; Linked Open Data; Recommended systems; Semantic web; Word embeddings
Summary:	Digitalization is raising the requirements and expectations placed on the services of Digital Libraries (DLs). Interoperability among repositories and other resources remains a subject of research in the field. Retrieving publications related to a particular topic from different DLs, especially from diverse domains, requires several clicks and visits to many different points of access. Achieving interoperability by cross-linking publications, authors and other related data would facilitate scholarly communication in general. Starting from a single point, a scholar would be able to find resources, i.e., publications and authors, previously enriched with additional information from different repositories. Repositories available as semantic web content, such as bibliographic Linked Open Data (LOD) datasets, are the focus of this study. Primarily, we consider existing alignments among concepts between repositories. The semantic measurement of relatedness between resources can be improved by applying text-mining techniques. The paper introduces preliminary experiments conducted with vector space models through the application of TF-IDF and Cosine Similarity (CS). Additionally, the paper discusses experiments applying a word embedding approach, which focuses mainly on context through distributed word representations instead of word frequency, weighting and string matching. We apply the contemporary Word2Vec model as a similar deep learning approach to model semantic word representations.
ISBN:	3319491563 9783319491561
ISSN:	1865-0929 1865-0937
DOI:	10.1007/978-3-319-49157-8_24
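
The summary mentions a vector space baseline built on TF-IDF and Cosine Similarity. The following is a minimal sketch of that general technique, not the authors' implementation: the function names, the smoothed IDF formula, and the three sample titles are all assumptions chosen for illustration, and it uses only the Python standard library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Smoothed IDF (an assumption; other variants are equally valid).
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical publication titles, invented for this example.
docs = [
    "linked open data for digital libraries",
    "digital libraries and linked data repositories",
    "protein folding simulation",
]
vecs = tfidf_vectors(docs)
sim_related = cosine_similarity(vecs[0], vecs[1])
sim_unrelated = cosine_similarity(vecs[0], vecs[2])
print(f"related: {sim_related:.3f}, unrelated: {sim_unrelated:.3f}")
```

Because this measure depends on shared surface terms, two titles with no words in common score zero even when they are topically close. That limitation is what the summary's word embedding approach, which relies on context rather than word frequency and string matching, is meant to address.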