Using search trends to analyze web-based users' behavior profiles connected with COVID-19 in mainland China: infodemiology study based on hot words and Baidu Index

Bibliographic Details
Published in PeerJ (San Francisco, CA) Vol. 10; p. e14343
Main Authors Jiang, Shuai; You, Changqiao; Zhang, Sheng; Chen, Fenglin; Peng, Guo; Liu, Jiajie; Xie, Daolong; Li, Yongliang; Guo, Xinhong
Format Journal Article
Language English
Published United States PeerJ. Ltd 09.11.2022
PeerJ, Inc
PeerJ Inc

Summary: Mainland China, the world's most populous region, experienced large-scale coronavirus disease 2019 (COVID-19) outbreaks in both 2020 and 2021. Existing infodemiology studies have concentrated primarily on the prospective surveillance of confirmed cases or of symptoms that met investigators' criteria; the actual impact of COVID-19 on the public, and the resulting attitudes of different groups towards the epidemic, have been neglected. This study aimed to examine public web-based search trends and behavior patterns related to COVID-19 outbreaks in mainland China using hot words and the Baidu Index (BI). The initial hot words (high-frequency words on the Internet) and the epidemic data (2019/12/01-2021/11/30) were mined from infodemiology platforms. The final hot words table was established through two rounds of hot word screening and a two-level hot word classification. The temporal distribution and demographic portraits of COVID-19-related searches were queried through the search trends service provided by BI and used for correlation analysis. Further, we used parameter estimation to quantitatively forecast the future geographical distribution of COVID-19. The final English-Chinese bilingual table comprised six domains and 32 subordinate hot words. According to the temporal distribution of domains and subordinate hot words in 2020 and 2021, search peaks for subordinate hot words were significantly correlated in time with COVID-19 outbreak periods, and the subordinate hot words in the COVID-19 Related and Territory domains were reliable for COVID-19 surveillance. Gender distribution results showed that the Territory domain (male proportion: 67.69%; standard deviation (SD): 5.88%) was searched more by men, whereas the Symptoms/Symptom and Public Health domains (female proportion: 57.95%, 56.61%; SD: 0, 9.06%) were searched more by women.
The age distribution of hot words showed that people aged 20-50 (middle-aged people) had a higher online search intensity, and that the 20-29 and 30-39 age groups focused more on the Media and Symptoms/Symptom domains, respectively (proportion: 45.43%, 51.66%; SD: 15.37%, 16.59%). Finally, based on frequency rankings of hot word searches and confirmed cases in mainland China, provinces and Chinese administrative divisions were classified into five levels of early-warning regions. The Central, East and South China regions were predicted to be affected again by COVID-19 in the future.
ISSN: 2167-8359
DOI: 10.7717/peerj.14343