Feature Extraction Using Principal Component Analysis (PCA) With Naïve Bayes Classifiers on Twitter Dataset
Abstract
Performing sentiment analysis on Twitter is more complicated than doing it for longer reviews, because tweets are very short and mostly contain slang, emoticons, hashtags and other Twitter-specific language. Feature extraction is the process of building a feature vector from a given tweet. Each entry in a feature vector is an integer that contributes to assigning a sentiment class to the tweet. The aim of feature extraction is to isolate the specific features that improve the accuracy of the classification model. In this paper we propose using the Term Frequency-Inverse Document Frequency (TF-IDF) technique to combine Principal Component Analysis (PCA) with Naïve Bayes classifiers. For the classification task, the proposed approach can create discrete variables from the continuous-valued features of the Twitter dataset.
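The pipeline described above (TF-IDF weighting, PCA reduction, discretization of the continuous scores, Naïve Bayes classification) can be illustrated with a minimal sketch in plain Python. It is not the implementation used in this work. The toy tweets and labels, the smoothed IDF variant, the power-iteration PCA, the binarization at zero and all function names are assumptions chosen for illustration only.

```python
import math
from collections import Counter, defaultdict

def tfidf(docs):
    """TF-IDF matrix (one row per tokenized document) over a sorted vocabulary."""
    vocab = sorted({t for d in docs for t in d})
    df = Counter(t for d in docs for t in set(d))
    # Assumed IDF variant: log(N / df) + 1, so terms shared by all docs keep some weight.
    idf = {t: math.log(len(docs) / df[t]) + 1.0 for t in vocab}
    rows = []
    for d in docs:
        tf = Counter(d)
        rows.append([tf[t] / len(d) * idf[t] for t in vocab])
    return rows

def pca_scores(rows, k, iters=200):
    """Center the rows and project them onto the top-k principal components,
    found by power iteration with deflation on the covariance matrix."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    X = [[r[j] - mean[j] for j in range(d)] for r in rows]
    C = [[sum(x[a] * x[b] for x in X) / (n - 1) for b in range(d)] for a in range(d)]
    comps = []
    for _ in range(k):
        # Fixed, normalized start vector keeps the run deterministic.
        start_norm = math.sqrt(sum((j + 1) ** 2 for j in range(d)))
        v = [(j + 1) / start_norm for j in range(d)]
        for _step in range(iters):
            w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break  # no variance left to explain
            v = [x / norm for x in w]
        lam = sum(v[a] * C[a][b] * v[b] for a in range(d) for b in range(d))
        # Deflation: remove the component just found before searching for the next.
        C = [[C[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
        comps.append(v)
    scores = [[sum(x[j] * c[j] for j in range(d)) for c in comps] for x in X]
    return comps, scores

def binarize(scores):
    """Discretize continuous PCA scores: 1 if above zero (the centered mean), else 0."""
    return [[1 if s > 0 else 0 for s in row] for row in scores]

def train_nb(features, labels, n_bins=2, alpha=1.0):
    """Categorical Naive Bayes with Laplace smoothing; returns a predict function."""
    prior = Counter(labels)
    counts = defaultdict(Counter)  # (label, feature index) -> counts per bin value
    for f, y in zip(features, labels):
        for j, b in enumerate(f):
            counts[(y, j)][b] += 1

    def predict(f):
        best, best_lp = None, -math.inf
        for y, ny in prior.items():
            lp = math.log(ny / len(labels))
            for j, b in enumerate(f):
                lp += math.log((counts[(y, j)][b] + alpha) / (ny + alpha * n_bins))
            if lp > best_lp:
                best, best_lp = y, lp
        return best

    return predict

# Hypothetical toy data, not taken from the Twitter dataset used in this work.
tweets = ["love this phone so much", "great day love it",
          "hate this slow phone", "worst day hate it"]
labels = ["pos", "pos", "neg", "neg"]

docs = [t.split() for t in tweets]
comps, scores = pca_scores(tfidf(docs), k=2)
features = binarize(scores)
predict = train_nb(features, labels)
predictions = [predict(f) for f in features]
```

The sketch fits and predicts on the same four tweets only to show how the stages connect. A real evaluation would hold out test data and project unseen tweets with the mean and components learned from the training set.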