Public Perception Analysis of Tweets During the 2015 Measles Outbreak: Comparative Study Using Convolutional Neural Network Models

The University of Texas Health Science Center at Houston (Du, Xiang, Zhi, Xu, Song, Tao); Texas A&M University (Tang)
"The proposed scheme can successfully classify the public's opinions and emotions in multiple dimensions, which would facilitate the timely understanding of public perceptions during the outbreak of an infectious disease."
Using Twitter data, this study sought to develop an automated system for a comprehensive public perception analysis of the large-scale measles outbreak in early 2015 in California, the United States (US). Researchers believe that increasing rates of vaccination refusal and undervaccination have made the US public more vulnerable to this potentially deadly disease. To help guide risk communication efforts in future infectious disease outbreaks, this study sought specifically to demonstrate the superiority of convolutional neural network (CNN) models (compared with conventional machine learning methods) on measles-outbreak-related tweet classification tasks.
As the researchers explain, during an outbreak of an infectious disease such as measles, responsible public health agencies need to send out timely messages to the public during different stages of the crisis. For instance, the Centers for Disease Control and Prevention (CDC) has adopted a 5-stage model of crisis and emergency risk communication, including precrisis, initial event, maintenance, resolution, and evaluation. The idea is that prompt understanding of the public's perceptions allows public health agencies to respond to people's attitudes, emotions, and needs in real time instead of relying on a predetermined timeline based on stages.
However, using traditional methods such as surveys to study public perceptions during an infectious disease outbreak is both costly and time-consuming. In contrast, examining Twitter content can provide an immediate assessment of the public's response and enable public health professionals to adapt their messages to communicate with the public more effectively. Twitter, one of the largest public social media platforms in the world, provides insights into how the public responds to an infectious disease outbreak as users, in real time, share information about the outbreak, talk about their personal experiences, argue over the necessity and safety of vaccination, and express a wide range of emotions.
The researchers first designed a comprehensive scheme for the analysis of public perception of measles based on tweets, including 3 dimensions: discussion themes, emotions expressed, and attitude toward vaccination. All 1,154,156 tweets containing the word "measles" posted between December 1 2014 and April 30 2015 were purchased and downloaded from DiscoverText.com. Two expert annotators curated a gold standard of 1,151 tweets (approximately 0.1% of all tweets) based on the 3-dimensional scheme. Next, a tweet classification system based on the CNN framework was developed. The researchers compared the performance of the CNN models with that of 4 conventional machine learning models and another neural network model. They also compared the impact of different word embedding configurations for the CNN models: (1) Stanford GloVe embedding trained on billions of tweets in the general domain, (2) measles-specific embedding trained on the 1 million measles-related tweets, and (3) a combination of the 2 embeddings.
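To make the third embedding configuration more concrete, the following is a minimal sketch of one way two embeddings could be combined and fed to a CNN-style text encoder. It is an illustration under stated assumptions, not the authors' implementation: the combination is assumed to be a per-token concatenation, the vocabulary and both embedding tables are random stand-ins (not the actual GloVe or measles-specific vectors), and the names `embed` and `conv_max_pool` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and random stand-in embedding tables (assumptions for illustration only).
vocab = {"<unk>": 0, "measles": 1, "outbreak": 2, "vaccine": 3, "scary": 4}
general_emb = rng.normal(size=(len(vocab), 8))  # stand-in for a general-domain embedding
domain_emb = rng.normal(size=(len(vocab), 4))   # stand-in for a measles-specific embedding


def embed(tokens, tables):
    """Look up each token in every table and concatenate the vectors per token."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in tokens]
    return np.concatenate([table[ids] for table in tables], axis=1)


def conv_max_pool(x, filters):
    """1D convolution over token windows, ReLU, then max-over-time pooling.

    x: (n_tokens, dim); filters: (n_filters, width, dim).
    Requires n_tokens >= width.
    """
    width = filters.shape[1]
    windows = np.stack([x[i:i + width] for i in range(x.shape[0] - width + 1)])
    feature_maps = np.einsum("nwd,fwd->nf", windows, filters)
    return np.maximum(feature_maps, 0.0).max(axis=0)


tokens = "measles outbreak is scary".split()
x = embed(tokens, [general_emb, domain_emb])     # shape (4, 12): 8 + 4 dims per token
filters = rng.normal(size=(5, 3, x.shape[1]))    # 5 untrained filters spanning 3 tokens each
features = conv_max_pool(x, filters)             # shape (5,): one pooled value per filter
```

In a full classifier, the pooled feature vector would be passed to a trained output layer for each dimension of the scheme; the filters here are untrained and serve only to show the data flow.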
In short, the CNN model with the combination of the 2 embeddings led to better performance on discussion themes and emotions expressed, while the CNN model with the Stanford embedding achieved the best performance on attitude toward vaccination.
In terms of discussion themes, nearly two-thirds (718/1,151, 62.38%) of tweets were categorized as resources (e.g., outbreak update or medical information about measles). Less than one-third (344/1,151, 29.89%) of the tweets were about users' personal opinions and interests. Only 1.82% (21/1,151) of the tweets discussed personal experience with measles, and 1.73% (20/1,151) asked questions. For emotions expressed, 79.84% (919/1,151) of tweets were categorized as expressing concern. Humor or sarcasm was found in 9.47% (109/1,151) of the tweets. Positive emotion and anger were found in 3.38% (39/1,151) and 3.04% (35/1,151) of the tweets, respectively. Finally, in terms of attitude toward vaccination, the majority of the tweets (913/1,151, 79.32%) did not express any opinion about vaccination, 17.55% (202/1,151) of tweets were provaccination, and 3.13% (36/1,151) were antivaccination.
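The shares above are simple ratios over the 1,151 annotated tweets. As a small sketch, the attitude-toward-vaccination figures can be recomputed from the counts given in the text; the dictionary keys and variable names are illustrative labels, not identifiers from the study.

```python
# Counts for the attitude-toward-vaccination dimension, taken from the text above.
attitude_counts = {"no opinion": 913, "provaccination": 202, "antivaccination": 36}

total = sum(attitude_counts.values())  # the three categories cover all annotated tweets

# Percentage share of each category, rounded to two decimal places.
attitude_shares = {
    label: round(100 * count / total, 2) for label, count in attitude_counts.items()
}

for label, share in attitude_shares.items():
    print(f"{label}: {attitude_counts[label]}/{total} = {share}%")
```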
The researchers conclude that the proposed scheme can successfully classify the public's opinions and emotions in multiple dimensions, which would facilitate the timely understanding of public perceptions during the outbreak of an infectious disease. Compared with conventional machine learning methods, the CNN models explored in this study performed better on measles-related tweet classification tasks. The proposed scheme and CNN-based tweet classification system could be useful for the analysis of tweets about other infectious diseases such as influenza and Ebola.
Journal of Medical Internet Research 2018, vol. 20, issue 7, e236; and email from Cui Tao to The Communication Initiative on July 13 2018.