Concept

A scientometric overview of CORD-19 data analysis - topic modeling

  • Examined the topics of 140,302 articles, where each topic is defined as a probability distribution over the vocabulary of publication titles and abstracts
  • Used a Latent Dirichlet Allocation (LDA) model, which yielded 15 topics worth observing
  • Literature published in 2020 focused mostly on COVID-19: its outbreak, management, and consequences
  • Concluded that coronavirus research becomes more popular after an outbreak
  • Used agglomerative clustering to further divide up the topics and found that CORD-19 consists mostly of publications on coronaviruses, rather than literature on health, epidemics, and clinical medicine in general
  • Publication activity in CORD-19 depends largely on the time period and the outbreak in question
  • The COVID-19 outbreak has generated far more literature than past coronavirus outbreaks
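
To make the idea of a topic as a probability distribution over a vocabulary concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is not the study's pipeline: the function name `fit_lda`, the hyperparameters `alpha` and `beta`, the toy documents, and the choice of 2 topics are all assumptions made for illustration (the study reports 15 topics over 140,302 articles). In practice a library implementation such as scikit-learn's `LatentDirichletAllocation` would likely be used instead.

```python
import random


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Returns (doc_topic, topic_word, vocab). Each row of doc_topic is a
    distribution over topics; each row of topic_word is a distribution
    over the vocabulary. alpha and beta are illustrative priors.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic_counts = [[0] * num_topics for _ in docs]
    topic_word_counts = [[0] * vocab_size for _ in range(num_topics)]
    topic_totals = [0] * num_topics
    assignments = []

    # Start by giving every token a random topic.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic_counts[d][k] += 1
            topic_word_counts[k][word_id[word]] += 1
            topic_totals[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, word in enumerate(doc):
                v = word_id[word]
                k = assignments[d][n]
                # Remove the token's current assignment from the counts.
                doc_topic_counts[d][k] -= 1
                topic_word_counts[k][v] -= 1
                topic_totals[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic_counts[d][t] + alpha)
                    * (topic_word_counts[t][v] + beta)
                    / (topic_totals[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = k
                doc_topic_counts[d][k] += 1
                topic_word_counts[k][v] += 1
                topic_totals[k] += 1

    # Turn smoothed counts into probability distributions.
    doc_topic = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic_counts)
    ]
    topic_word = [
        [(c + beta) / (topic_totals[k] + vocab_size * beta)
         for c in topic_word_counts[k]]
        for k in range(num_topics)
    ]
    return doc_topic, topic_word, vocab


# Hypothetical toy "titles", not taken from CORD-19.
docs = [
    "coronavirus outbreak transmission outbreak".split(),
    "coronavirus transmission outbreak spread".split(),
    "hospital management patients care".split(),
    "patients care hospital management".split(),
]
doc_topic, topic_word, vocab = fit_lda(docs, num_topics=2)

# Show the three most probable words of each topic.
for k, row in enumerate(topic_word):
    top_words = [w for _, w in sorted(zip(row, vocab), reverse=True)[:3]]
    print(f"topic {k}: {top_words}")
```

Each row of `topic_word` sums to 1 and plays the role of one topic; each row of `doc_topic` gives the topic mixture of one document. The number of topics is an input here, so a figure such as 15 has to be chosen by the analyst rather than learned by the sampler.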

Updated 2021-03-02

Tags

CSCW (Computer-supported cooperative work)

Computing Sciences