Leveraging Twitter Data to Understand Sentiments on New Income Tax Reforms in India
Abstract
The introduction of the Goods and Services Tax (GST) was one of the biggest economic reforms in India, and it led to extensive debate. Social networking websites gave the general public a platform for sharing their opinions on GST. Analyzing these opinions to understand public sentiment can prove helpful for future reforms. Motivated by this, we collected GST-related tweets from April 30, 2018 until May 1, 2018 and analyzed their polarity. We investigated five different classifiers: Linear Regression, Logistic Regression, Decision Tree, SVM, and Random Forest. For each classifier, test runs were conducted using unigram features, bigram features, and a combination of both, and their performance was compared. The overall maximum accuracy of 94.12% was observed using Random Forest with a combination of unigram and bigram features. The performance is compared with other existing work on GST. To assess the effect of the feature combination on other data, a test run was also conducted on open-source product review datasets. Random Forest using a combination of unigram and bigram features exhibits better performance than other existing works on the same dataset.
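To make the feature combination concrete, the following is a minimal sketch of how unigram and bigram counts might be extracted from a single tweet and merged into one feature set. The function names, the whitespace tokenization, and the sample sentence are illustrative assumptions and are not taken from the paper's pipeline.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def combined_features(text):
    """Count unigrams and bigrams together (assumed: lowercase, whitespace split)."""
    tokens = text.lower().split()
    return Counter(ngrams(tokens, 1) + ngrams(tokens, 2))


# Hypothetical sample text, used only for illustration.
features = combined_features("GST is not good")
print(sorted(features))
```

In this example the bigram "not good" carries a negation that the unigrams "not" and "good" cannot express on their own, which is one plausible reason a combined feature set could help a polarity classifier. Counts of this kind would then be passed to a classifier such as Random Forest.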