Public Perception of the COVID-19 Pandemic on Twitter: Sentiment Analysis and Topic Modeling Study

BACKGROUND COVID-19 is a scientifically and medically novel disease that is not fully understood because it has yet to be consistently and deeply studied. Among the gaps in research on the COVID-19 outbreak, there is a lack of sufficient infoveillance data. OBJECTIVE The aim of this study was to increase understanding of public awareness of COVID-19 pandemic trends and uncover meaningful themes of concern posted by Twitter users in the English language during the pandemic. METHODS Data mining was conducted on Twitter to collect a total of 107,990 tweets related to COVID-19 between December 13 and March 9, 2020. The analyses included frequency of keywords, sentiment analysis, and topic modeling to identify and explore discussion topics over time. A natural language processing approach and the latent Dirichlet allocation algorithm were used to identify the most common tweet topics as well as to categorize clusters and identify themes based on the keyword analysis. RESULTS The results indicate three main aspects of public awareness and concern regarding the COVID-19 pandemic. First, the trend of the spread and symptoms of COVID-19 can be divided into three stages. Second, the results of the sentiment analysis showed that people have a negative outlook toward COVID-19. Third, based on topic modeling, the themes relating to COVID-19 and the outbreak were divided into three categories: the COVID-19 pandemic emergency, how to control COVID-19, and reports on COVID-19. CONCLUSIONS Sentiment analysis and topic modeling can produce useful information about the trends in the discussion of the COVID-19 pandemic on social media as well as alternative perspectives to investigate the COVID-19 crisis, which has created considerable public awareness. This study shows that Twitter is a good communication channel for understanding both public concern and public awareness about COVID-19. These findings can help health departments communicate information to alleviate specific public concerns about the disease.
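As a rough illustration of the latent Dirichlet allocation step named in the methods, the sketch below fits a tiny LDA model with collapsed Gibbs sampling using only the Python standard library. It is not the study's pipeline: the function names (`lda_gibbs`, `top_words`), the hyperparameters `alpha` and `beta`, and the toy tokenized tweets are all assumptions made for the example.

```python
import random

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit a small LDA model by collapsed Gibbs sampling.

    docs is a list of token lists. Returns (vocab, topic_word, doc_topic),
    where the last two are raw count tables.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][n]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1
    return vocab, topic_word, doc_topic

def top_words(vocab, topic_word, n=3):
    """Return the n highest-count words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]

# Hypothetical, already-tokenized tweets (not data from the study).
tweets = [
    ["virus", "outbreak", "wuhan", "cases"],
    ["outbreak", "cases", "spread", "virus"],
    ["mask", "wash", "hands", "quarantine"],
    ["quarantine", "mask", "hands", "travel"],
]
vocab, topic_word, doc_topic = lda_gibbs(tweets, num_topics=2)
for k, words in enumerate(top_words(vocab, topic_word)):
    print(f"topic {k}: {words}")
```

On a real corpus, a maintained library implementation would normally be preferable; the point here is only to show how per-token topic assignments are resampled from document-topic and topic-word counts.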

How is People's Awareness of “Biodiversity” Measured? Using Sentiment Analysis and LDA Topic Modeling in the Twitter Discourse Space from 2010 to 2020

10.21203/rs.3.rs-922908/v1 ◽

2021 ◽

Author(s):

Shimon Ohtani

Keyword(s):

Sentiment Analysis ◽

Topic Modeling ◽

Data Science ◽

Latent Dirichlet Allocation ◽

Biological Diversity ◽

Public Awareness ◽

Convention On Biological Diversity ◽

Emotion Lexicon ◽

Aichi Biodiversity Targets ◽

Do So

Abstract The importance of biodiversity conservation is gradually being recognized worldwide, and 2020 was the final year of the Aichi Biodiversity Targets formulated at the 10th Conference of the Parties to the Convention on Biological Diversity (COP10) in 2010. Unfortunately, the majority of the targets were assessed as unachievable. While it is essential to measure public awareness of biodiversity when setting the post-2020 targets, it is also a difficult task to propose a method to do so. This study provides a diachronic exploration of the discourse on “biodiversity” from 2010 to 2020, using Twitter posts, in combination with sentiment analysis and topic modeling, which are commonly used in data science. Through the aggregation and comparison of n-grams, the visualization of eight types of emotional tendencies using the NRC emotion lexicon, the construction of topic models using Latent Dirichlet allocation (LDA), and the qualitative analysis of tweet texts based on these models, I was able to classify and analyze unstructured tweets in a meaningful way. The results revealed the evolution of words used with “biodiversity” on Twitter over the past decade, the emotional tendencies behind the contexts in which “biodiversity” has been used, and the approximate content of tweet texts that have constituted topics with distinctive characteristics. While the search for people's awareness through SNS analysis still has many limitations, it is undeniable that important suggestions can be obtained. In order to further refine the research method, it will be essential to improve the skills of analysts and accumulate research examples as well as to advance data science.
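The n-gram aggregation and lexicon-based emotion tallies described in this abstract can be sketched as follows. This is a minimal illustration, not the author's code: `TOY_LEXICON` is an invented stand-in whose entries are not taken from the NRC emotion lexicon, and the example sentence is hypothetical.

```python
from collections import Counter

# Invented word-to-emotion mapping for illustration only; a real analysis
# would load the NRC emotion lexicon, which covers eight emotions.
TOY_LEXICON = {
    "loss": {"sadness", "fear"},
    "threat": {"fear", "anger"},
    "protect": {"trust"},
    "celebrate": {"joy", "anticipation"},
}

def ngrams(tokens, n):
    """Return all contiguous n-token tuples in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def emotion_counts(tokens, lexicon):
    """Count how many tokens are associated with each emotion."""
    counts = Counter()
    for tok in tokens:
        for emotion in lexicon.get(tok, ()):
            counts[emotion] += 1
    return counts

# Hypothetical tweet text, lower-cased and split on whitespace.
tokens = "biodiversity loss is a threat so protect biodiversity".split()
bigram_counts = Counter(ngrams(tokens, 2))
emotions = emotion_counts(tokens, TOY_LEXICON)

print(bigram_counts.most_common(3))
print(dict(emotions))
```

Running the same tallies per year would give the kind of diachronic comparison the abstract describes, under the assumption that tweets are grouped by posting date beforehand.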

Topic Modeling and Sentiment Analysis of Electric Vehicles of Twitter Data

Asian Journal of Research in Computer Science ◽

10.9734/ajrcos/2021/v12i230278 ◽

2021 ◽

pp. 13-29

Author(s):

H. P. Suresha ◽

Krishna Kumar Tiwari

Keyword(s):

Language Processing ◽

Electric Vehicles ◽

Topic Modeling ◽

Latent Dirichlet Allocation ◽

Public Perceptions ◽

Project Work ◽

Common Location ◽

Web App ◽

Twitter Users ◽

Social Media Tool

Twitter is a well-known social media tool for people to communicate their thoughts and feelings about products or services. In this project, I collect user tweets related to electric vehicles from Twitter using the Twitter API and analyze public perceptions and feelings regarding electric vehicles. In the first step, after collecting the data, I built a pre-processed data model based on natural language processing (NLP) methods to select tweets. In the second step, I use topic modeling, word clouds, and exploratory data analysis (EDA) to examine several aspects of electric vehicles. Latent Dirichlet allocation (LDA) is used for topic modeling to infer the various topics of electric vehicles. LDA was compared with latent semantic analysis (LSA), and I found that LDA provides better insight into topics, as well as better accuracy, than LSA. In the third step, the Valence Aware Dictionary and sEntiment Reasoner (VADER) is used to analyze sentiment toward electric vehicles and to classify the related tweets as positive, negative, or neutral. In this project, I collected 45,000 tweets from the Twitter API, along with related hashtags, user locations, and different topics of electric vehicles. Tesla is the top hashtag used when sharing tweets related to electric vehicles. Ekero, Sweden, is the most common location of users tweeting about electric vehicles. Tesla is the most common word in the tweets related to electric vehicles. Elon-musk is the most common bi-gram found in the tweets related to electric vehicles. According to VADER, 47.1% of tweets are positive, 42.4% are neutral, and 10.5% are negative. Finally, I deploy this project work as a fully functional web app.
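The positive/neutral/negative split reported above depends on how a lexicon score is turned into a label. The sketch below illustrates that idea in a VADER-like style: summed word valences are squashed into a compound score in (-1, 1) and then thresholded. The valence values in `TOY_VALENCE` are hypothetical, and the normalization constant (15) and the ±0.05 cutoffs are what I understand to be VADER's documented conventions, so treat them as assumptions; the real tool also handles negation, intensifiers, punctuation, and capitalization, which this sketch omits.

```python
import math

# Hypothetical valence scores for illustration; not the VADER lexicon.
TOY_VALENCE = {"love": 3.2, "great": 3.1, "expensive": -1.5, "terrible": -2.1}

def compound_score(tokens, valence=TOY_VALENCE, alpha=15.0):
    """Sum token valences and squash the total into the range (-1, 1)."""
    total = sum(valence.get(tok.lower(), 0.0) for tok in tokens)
    return total / math.sqrt(total * total + alpha)

def label(compound, threshold=0.05):
    """Map a compound score to a three-way sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

# Hypothetical example tweets.
for text in ["I love my new EV", "charging is terrible", "the car is blue"]:
    score = compound_score(text.split())
    print(f"{text!r}: {score:+.3f} -> {label(score)}")
```

A tweet with no lexicon words scores exactly zero and is labeled neutral, which is one reason lexicon-based tools tend to report a large neutral share on short texts.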

How is People's Awareness of “Biodiversity” Measured? Using Sentiment Analysis and LDA Topic Modeling in the Twitter Discourse Space from 2010 to 2020

10.21203/rs.3.rs-922908/v2 ◽

2021 ◽

Author(s):

Shimon Ohtani

Keyword(s):

Sentiment Analysis ◽

Topic Modeling ◽

Data Science ◽

Latent Dirichlet Allocation ◽

Biological Diversity ◽

Public Awareness ◽

Convention On Biological Diversity ◽

Emotion Lexicon ◽

Aichi Biodiversity Targets ◽

Do So

Abstract The importance of biodiversity conservation is gradually being recognized worldwide, and 2020 was the final year of the Aichi Biodiversity Targets formulated at the 10th Conference of the Parties to the Convention on Biological Diversity (COP10) in 2010. Unfortunately, the majority of the targets were assessed as unachievable. While it is essential to measure public awareness of biodiversity when setting the post-2020 targets, it is also a difficult task to propose a method to do so. This study provides a diachronic exploration of the discourse on “biodiversity” from 2010 to 2020, using Twitter posts, in combination with sentiment analysis and topic modeling, which are commonly used in data science. Through the aggregation and comparison of n-grams, the visualization of eight types of emotional tendencies using the NRC emotion lexicon, the construction of topic models using Latent Dirichlet allocation (LDA), and the qualitative analysis of tweet texts based on these models, I was able to classify and analyze unstructured tweets in a meaningful way. The results revealed the evolution of words used with “biodiversity” on Twitter over the past decade, the emotional tendencies behind the contexts in which “biodiversity” has been used, and the approximate content of tweet texts that have constituted topics with distinctive characteristics. While the search for people's awareness through SNS analysis still has many limitations, it is undeniable that important suggestions can be obtained. In order to further refine the research method, it will be essential to improve the skills of analysts and accumulate research examples as well as to advance data science.

Exploring Occupation Differences in Reactions to COVID-19 Pandemic on Twitter

Data and Information Management ◽

10.2478/dim-2020-0032 ◽

2020 ◽

Vol 0 (0) ◽

Author(s):

Yi Zhao ◽

Haixu Xi ◽

Chengzhi Zhang

Keyword(s):

Social Media ◽

Sentiment Analysis ◽

Topic Modeling ◽

Latent Dirichlet Allocation ◽

Experimental Results ◽

Social Implications ◽

Income Levels ◽

Related Information ◽

The Social ◽

Twitter Users

Abstract Information related to the coronavirus disease 2019 (COVID-19) pandemic is flooding social media, and analyzing this information from an occupational perspective can help us to understand the social implications of this unprecedented disruption. In this study, using a COVID-19-related dataset collected with the Twitter IDs, we conduct topic and sentiment analysis from the perspective of occupation by leveraging Latent Dirichlet Allocation (LDA) topic modeling and the Valence Aware Dictionary and sEntiment Reasoner (VADER) model, respectively. The experimental results indicate that there are significant differences in topic preference between Twitter users with different occupations. However, occupation-linked affective differences are only partly demonstrated in our study; the income level of Twitter users is not related to sentiment expression on COVID-19-related topics.

Customers' experience of purchasing event tickets: mining online reviews based on topic modeling and sentiment analysis

International Journal of Event and Festival Management ◽

10.1108/ijefm-06-2020-0034 ◽

2020 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Krzysztof Celuch

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Topic Modeling ◽

Data Science ◽

Latent Dirichlet Allocation ◽

Online Reviews ◽

Third Party ◽

Content Type

Purpose In search of creating an extraordinary experience for customers, services have gone beyond the means of a transaction between buyers and sellers. In the event industry, where purchasing tickets online is a common procedure, it remains unclear how to enhance the multifaceted experience. This study aims to offer a snapshot of the aspects most valued by consumers and to uncover consumers' feelings toward their experience of purchasing event tickets on third-party ticketing platforms. Design/methodology/approach This is a cross-disciplinary study that applies knowledge from both data science and services marketing. Within the framework of natural language processing, latent Dirichlet allocation topic modeling and sentiment analysis were used to interpret the embedded meanings based on online reviews. Findings The findings conceptualized ten dimensions valued by eventgoers, including technical issues, value of core product and service, word-of-mouth, trustworthiness, professionalism and knowledgeability, customer support, information transparency, additional fee, prior experience and after-sales service. Among these aspects, consumers rated the value of the core product and service to be the most positive experience, whereas the additional fee was considered the least positive one. Originality/value Drawing from the intersection of natural language processing and the status quo of the event industry, this study offers a better understanding of eventgoers' experiences in the case of purchasing online event tickets. It also provides a hands-on guide for marketers to stage memorable experiences in the era of digitalization.

Topic Modeling for Twitter Users Regarding the "Ruangguru" Application

Jurnal ILMU DASAR ◽

10.19184/jid.v21i2.17112 ◽

2020 ◽

Vol 21 (2) ◽

pp. 149

Author(s):

Bagus Wicaksono Arianto ◽

Gangga Anuraga

Keyword(s):

Topic Modeling ◽

Public Perception ◽

Latent Dirichlet Allocation ◽

The Public ◽

Allocation Method ◽

Twitter Account ◽

Twitter Users ◽

A Company ◽

Dirichlet Allocation ◽

Expansion Strategies

PT Ruang Raya Indonesia ("Ruangguru") is the largest and most comprehensive technology company in Indonesia that focuses on education-based services. In 2019 there were 15 million Ruangguru users and 300.00 teachers who had joined and were present in 32 provinces in Indonesia. It prepared a number of expansion strategies to become a company valued at more than US$1 billion in the next year or two. The purpose of this research is to classify the opinions of Ruangguru users about the services provided, using the latent Dirichlet allocation method, so that the results can serve as evaluation material for improving those services. The data come from a collection of tweets by Twitter users in Indonesia, gathered using the Twitter API. The Twitter account used in this study is @ruangguru. The analysis with latent Dirichlet allocation showed that the public perception of Twitter users was grouped into 28 topics. Keywords: latent Dirichlet allocation, ruangguru, twitter.

486. Understanding Public Perception of COVID-19 Social Distancing on Twitter

Open Forum Infectious Diseases ◽

10.1093/ofid/ofaa439.679 ◽

2020 ◽

Vol 7 (Supplement_1) ◽

pp. S309-S309

Author(s):

Sameh N Saleh ◽

Christoph Lehmann ◽

Samuel McDonald ◽

Mujeeb Basit ◽

Richard J Medford

Keyword(s):

Language Processing ◽

Topic Modeling ◽

Public Perception ◽

Large Scale ◽

Statistical Significance ◽

Leisure Activities ◽

Community Support ◽

Social Distancing ◽

Positive Sentiment ◽

Twitter Users

Abstract Background Managing and changing public opinion and behavior are vital for social distancing to successfully slow transmission of COVID-19, preserve hospital resources, and prevent overwhelming the healthcare system. We sought to leverage organic, large-scale discussion on Twitter about social distancing to understand the public’s beliefs and opinions on this policy. Methods Between March 27 and April 10, 2020, we sampled 574,903 English tweets that matched the two most trending social distancing hashtags at the time, #socialdistancing and #stayathome. We used natural language processing techniques to conduct a sentiment analysis that identifies tweet polarity and emotions. We also evaluated the subjectivity of tweets and estimated the frequency of discussion of social distancing rules. We then identified clusters of discussion using topic modeling and compared the sentiment by topic. Results There was net positive sentiment toward both #socialdistancing and #stayathome, with mean sentiment scores of 0.150 (standard deviation [SD], 0.292) and 0.144 (SD, 0.287), respectively. Tweets were also more likely to be objective (median, 0.40; IQR, 0 to 0.6), with approximately 30% of all tweets labeled as completely objective. Approximately half (50.4%) of all tweets primarily expressed joy, and one-fifth each expressed fear and surprise (Figure 1). These trends correlated well with topic clusters identified by frequency, including leisure activities and community support (i.e., joy), concerns about food insecurity and effects of the quarantine (i.e., fear), and unpredictability of COVID and its unforeseen implications (i.e., surprise) (Table 1). Table 1. Topic clusters identified by topic modeling. Words contributing to the model are shown in decreasing order of weighting. The topics are labeled manually based on these words. The number of tweets primarily with that topic, mean sentiment, mean subjectivity, and sample tweets are also included. Figure 1. Emotion analysis for all tweets and stratified by tweets with the hashtag #socialdistancing and #stayathome. Comparison between the two hashtags is done using Chi-squared testing. Bonferroni correction was used to define statistical significance at a threshold of p = 0.008 (0.05/n, where n = 6 since 6 comparisons were completed). Conclusion The positive sentiment, preponderance of objective tweets, and topics supporting coping mechanisms led us to believe that Twitter users generally supported social distancing measures in the early stages of their implementation. Disclosures All Authors: No reported disclosures
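The Bonferroni threshold quoted in the Figure 1 caption is simple arithmetic: the family-wise significance level is divided by the number of comparisons. The short sketch below reproduces the 0.05/6 calculation and applies it to a set of p-values. The comparison names and p-values are hypothetical placeholders, not results from the abstract.

```python
def bonferroni_threshold(alpha, num_comparisons):
    """Per-comparison significance threshold under Bonferroni correction."""
    return alpha / num_comparisons

# Values stated in the caption: alpha = 0.05 and six comparisons.
threshold = bonferroni_threshold(0.05, 6)
print(f"threshold = {threshold:.4f}")

# Hypothetical p-values for six comparisons, for illustration only.
p_values = {
    "comparison_1": 0.0004,
    "comparison_2": 0.0200,
    "comparison_3": 0.0079,
    "comparison_4": 0.0500,
    "comparison_5": 0.0090,
    "comparison_6": 0.3000,
}
significant = {name: p < threshold for name, p in p_values.items()}
print(significant)
```

Note that a p-value of 0.02 would pass an uncorrected 0.05 cutoff but fails the corrected one, which is the purpose of the adjustment when several comparisons are made on the same data.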

Analysis and Visualization Latent Topic on COVID-19 Vaccine Tweet use two-stage topic modeling (Preprint)

10.2196/preprints.30290 ◽

2021 ◽

Author(s):

Faizah Faizah ◽

Bor-Shen Lin

Keyword(s):

Topic Modeling ◽

Public Perception ◽

Latent Dirichlet Allocation ◽

World Health ◽

Two Stage ◽

The Public ◽

Global Pandemic ◽

Difficult Time ◽

Latent Topic ◽

Latent Topics

BACKGROUND The World Health Organization (WHO) declared COVID-19 a global pandemic on January 30, 2020. However, the pandemic is not yet over. Furthermore, in the first quarter of 2021, some countries faced a third wave of the pandemic. During this difficult time, the development of vaccines for COVID-19 accelerated rapidly. Understanding the public perception of the COVID-19 vaccine from data collected on social media can widen the perspective on the state of the global pandemic. OBJECTIVE This study explores and analyzes the latent topics in COVID-19 vaccine tweets posted by individuals from various countries by using two-stage topic modeling. METHODS A two-stage topic modeling analysis was proposed to investigate people’s reactions in five countries. The first stage is latent Dirichlet allocation (LDA), which produces latent topics with corresponding term distributions that help investigators understand the main issues or opinions. The second stage then performs agglomerative clustering on the latent topics based on Hellinger distance, which merges close topics hierarchically into topic clusters to visualize those topics in either tree or graph views. RESULTS In general, the topic discussion regarding the COVID-19 vaccine is similar across the five countries. Topic themes such as "first vaccine" and "vaccine effect" dominate the public discussion. Notably, people in some countries discuss distinctive topic themes, such as "politician opinion" and "stay home" in Canada, "emergency" in India, and "blood clots" in the United Kingdom. The analysis also shows which COVID-19 vaccine is the most popular and is gaining more public interest. CONCLUSIONS With LDA and hierarchical clustering, two-stage topic modeling is powerful for visualizing the latent topics and understanding the public perception regarding the COVID-19 vaccine.
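The second stage described in the methods, agglomerative clustering of topics by Hellinger distance, can be sketched as follows. The abstract does not state the linkage criterion, so average linkage is an assumption here, as are the function names and the toy topic-term distributions.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions."""
    return math.sqrt(
        sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2
    )

def agglomerate(topics, num_clusters):
    """Merge the closest clusters (average linkage) until num_clusters remain.

    topics is a list of term distributions; returns lists of topic indices.
    """
    clusters = [[i] for i in range(len(topics))]

    def linkage(c1, c2):
        # Mean pairwise Hellinger distance between members of two clusters.
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(hellinger(topics[i], topics[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > num_clusters:
        a, b = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical topic-term distributions over a four-word vocabulary.
topics = [
    [0.70, 0.20, 0.05, 0.05],
    [0.60, 0.30, 0.05, 0.05],
    [0.05, 0.05, 0.60, 0.30],
    [0.05, 0.05, 0.20, 0.70],
]
print(agglomerate(topics, num_clusters=2))
```

Recording the order of merges, rather than stopping at a fixed number of clusters, would give the tree view mentioned in the abstract; that extension is left out to keep the sketch short.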

Análise de discursos em notícias sobre homofobia, racismo e sexismo em comentários de portais brasileiros de notícias

10.14210/cotb.v12.p467-474 ◽

2021 ◽

Author(s):

Lucas Rodrigues ◽

Antonio Jacob Junior ◽

Fábio Lobato

Keyword(s):

Social Media ◽

Natural Language Processing ◽

Sentiment Analysis ◽

Data Visualization ◽

Language Processing ◽

Topic Modeling ◽

Hate Speech ◽

Psychological Impact ◽

Internet Service ◽

General Law

Posts with defamatory content or hate speech are constantly found on social media. The results for readers are numerous, not restricted only to the psychological impact, but also to the growth of this social phenomenon. With the General Law on the Protection of Personal Data and the Marco Civil da Internet, service providers became responsible for the content in their platforms. Considering the importance of this issue, this paper aims to analyze the content published (news and comments) on the G1 News Portal with techniques based on data visualization and Natural Language Processing, such as sentiment analysis and topic modeling. The results show that even with most of the comments being neutral or negative and classified or not as hate speech, the majority of them were accepted by the users.
