Improvised Sentiment Analysis using One Hot Encoder and N-grams

  • Sandeep Yelisetti, Dr. Nellore Geethanjali

Abstract

In text mining, sentiment analysis of free-text documents is a common task in which documents are assigned predefined sentiment labels such as "positive" and "negative". Because it identifies the sentiments underlying a text, sentiment analysis is an important area of research within data mining, and the analysis of sentiments in tweets is well studied. Building on a review of existing research, the proposed method seeks to improve the effectiveness of sentiment analysis of tweets. The ever-growing popularity of social media and online marketing sites makes them an invaluable resource for analysis and decision-making. Most of the reviews on these sites are unstructured, so they must be processed, for example by clustering or classification, to yield useful information for future use. This paper aims to improve the classification of tweets by sentiment: neutral, positive, or negative. A one-hot encoder and an N-gram scheme are used, and sentiment lexicons are used to extract tweets containing n-grams with high performance gain. For the classification of human sentiments, four machine learning algorithms were considered: Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), and Decision Tree. Their performance is assessed using precision, recall, accuracy, and F-measure.
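
To make the encoding and evaluation steps named in the abstract concrete, the following is a minimal sketch of a binary (one-hot) n-gram representation and of the precision, recall, and F-measure computation. It is not the authors' implementation: the whitespace tokenizer, the choice of unigrams plus bigrams, the toy tweets, and all function names are assumptions made for illustration, and no classifier or sentiment lexicon is included.

```python
def ngrams(tokens, n):
    """Return the contiguous n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocabulary(texts, n_values=(1, 2)):
    """Assign a column index to every n-gram seen in the corpus."""
    vocab = {}
    for text in texts:
        tokens = text.lower().split()  # assumption: plain whitespace tokenizer
        for n in n_values:
            for gram in ngrams(tokens, n):
                vocab.setdefault(gram, len(vocab))
    return vocab


def one_hot_encode(text, vocab, n_values=(1, 2)):
    """Binary vector: 1 where a vocabulary n-gram occurs in the text, else 0."""
    vector = [0] * len(vocab)
    tokens = text.lower().split()
    for n in n_values:
        for gram in ngrams(tokens, n):
            if gram in vocab:  # n-grams unseen at training time are ignored
                vector[vocab[gram]] = 1
    return vector


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F-measure for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy tweets, not data from the paper.
tweets = ["good movie", "not good movie"]
vocab = build_vocabulary(tweets)
features = [one_hot_encode(t, vocab) for t in tweets]
print(features)  # [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]
```

In a full pipeline, vectors like `features` would be passed to a classifier such as the four considered in the paper, and `precision_recall_f1` would be evaluated once per sentiment class on held-out predictions.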

Published
2021-09-05
How to Cite
Sandeep Yelisetti, Dr. Nellore Geethanjali. (2021). Improvised Sentiment Analysis using One Hot Encoder and N-grams. Design Engineering, 11037 - 11047. Retrieved from http://www.thedesignengineering.com/index.php/DE/article/view/4100