Harvesting Big Data in social science: A methodological approach for collecting online user-generated content

Bibliographic Details
Published in Computer standards and interfaces Vol. 46; pp. 79-87
Main Authors Olmedilla, M., Martínez-Torres, M.R., Toral, S.L.
Format Journal Article
Language English
Published Elsevier B.V. 01.05.2016
Summary: Online user-generated content is playing an increasingly important role as an information source for social scientists seeking to extract value from it. Advances in procedures and technologies for capturing, storing, managing, and analyzing data make it possible to exploit increasing amounts of data generated directly by users. In that regard, Big Data is gaining significance in social science from the side of quantitative datasets, in contrast to traditional social science, where collecting data has always been hard, time-consuming, and resource-intensive. Hence, the emergent field of computational social science is broadening researchers' perspectives. However, it also requires a multidisciplinary approach involving several different knowledge areas. This paper outlines an architectural framework and methodology to collect Big Data from an electronic Word-of-Mouth (eWOM) website containing user-generated content. Although the paper is written from the social science perspective, it must also be considered together with complementary disciplines such as data accessing and computing.

Highlights:
• The emergent field of e-social science requires a multidisciplinary approach.
• Architectural framework and methodology for collecting user-generated content.
• Advantages of web scraping instead of APIs.
• Results about the time spent on data gathering and the relational database design.
ISSN: 0920-5489, 1872-7018
DOI: 10.1016/j.csi.2016.02.003