Purpose
Research analyzing online travelers’ reviews has boomed over the past years, but it lacks efficient methodologies that can provide useful end-user value within time and budget. This study aims to contribute to the field by developing and testing a new methodology for sentiment analysis that surpasses the standard dictionary-based method by creating two hotel-specific word lexicons.
Design/methodology/approach
Big data of hotel customer reviews posted on the TripAdvisor platform were collected and appropriately prepared for conducting a binary sentiment analysis by developing a novel bag-of-words weighted approach. The latter provides a transparent and replicable procedure to prepare, create and assess lexicons for sentiment analysis. This approach resulted in two lexicons (a weighted lexicon, L1 and a manually selected lexicon, L2), which were tested and validated by applying classification accuracy metrics to the TripAdvisor big data. Two popular methodologies (a public dictionary-based method and a complex machine-learning algorithm) were used for comparing the accuracy metrics of the study’s approach for creating the two lexicons.
Findings
The results of the accuracy metrics confirmed that the study’s methodology significantly outperforms the dictionary-based method in comparison to the machine-learning algorithm method. The findings also provide evidence that the study’s methodology is generalizable for predicting users’ sentiment.
Practical implications
The study developed and validated a methodology for generating reliable lexicons that can be used for big data analysis aiming to understand and predict customers’ sentiment. The L2 hotel dictionary generated by the study provides a reliable method and a useful tool for analyzing guests’ feedback and enabling managers to understand, anticipate and re-actively respond to customers’ attitudes and changes. The study also proposed a simplified methodology for understanding the sentiment of each user, which, in turn, can be used for conducting comparisons aiming to detect and understand guests’ sentiment changes across time, as well as across users based on their profiles and experiences.
Originality/value
This study contributes to the field by proposing and testing a new methodology for conducting sentiment analysis that addresses previous methodological limitations, as well as the contextual specificities of the tourism industry. Based on the paper’s literature review, this is the first research study using a bag-of-words approach for conducting a sentiment analysis and creating a field-specific lexicon.