An Evolutionary Clustering Analysis of Social Media Content and Global Infection Rates During the COVID-19 Pandemic
Abstract
This study investigates the impact of global infection rates on social media posts during the COVID-19 pandemic. Using evolutionary clustering analysis, it analysed over 179 million tweets posted between March 22 and April 13, 2020, together with the global COVID-19 infection rates. Six clusters were constructed for each term type across three levels of n-grams (unigrams, bigrams and trigrams). Frequently occurring unigrams (“COVID-19”, “virus”, “government”, “people”, etc.), bigrams (“COVID 19”, “COVID-19 cases”, “times share”, etc.) and trigrams (“COVID 19 crisis”, “things help stop” and “trying times share”) were identified. Unigram trends on Twitter were up to about two times more common than bigram terms and up to about 54 times more common than trigram terms. Unigrams like “home” or “need” also became important, as these terms reflected people’s main concerns during this period. Taken together, the findings confirm that many tweets were used to broadcast people’s prevalent topics of interest during the COVID-19 pandemic. Furthermore, the results indicate that the number of COVID-19 infections had a significant effect on all clusters, being strong on 86% of clusters and moderate on 16% of clusters. The downward slope in global infection rates coincided with the point at which “social distancing” and “stay at home” began to trend. These findings suggest that infection rates have had a significant impact on social media posting during the COVID-19 pandemic.
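As a rough illustration of the three n-gram levels named in the abstract, the sketch below counts unigrams, bigrams and trigrams in a handful of posts. The whitespace tokenizer, the sample posts and the function names `ngrams` and `count_ngrams` are assumptions made for illustration only; they do not reproduce the study's preprocessing or its evolutionary clustering step.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(texts, n):
    """Count n-gram occurrences over a collection of texts.

    Tokenization here is a plain lowercase whitespace split (an assumption);
    a real pipeline would likely also strip URLs, mentions and stop words.
    """
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()
        counts.update(ngrams(tokens, n))
    return counts


# Hypothetical sample posts, not taken from the study's dataset.
sample_posts = [
    "stay at home and help stop the virus",
    "people need to stay at home",
]

unigram_counts = count_ngrams(sample_posts, 1)
bigram_counts = count_ngrams(sample_posts, 2)
trigram_counts = count_ngrams(sample_posts, 3)
```

Comparing the resulting counters, for example with `unigram_counts.most_common(5)`, gives a simple view of how much more often short terms occur than longer ones.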