Twitter Sentiment Analysis

Overview

Sentiment analysis is an application of Natural Language Processing (NLP) in which we predict the emotion or sentiment expressed in a given text. It can be used to gauge the overall sentiment of product reviews without reading each one, or to detect the tone of a piece of text. In this article, we look at Twitter sentiment analysis. Twitter is a major social platform that generates a large volume of text data every day, and its tweets let us analyze people's opinions on a wide range of topics worldwide. From the words in a tweet, we can determine whether it is positive, negative, or neutral.

What are We Building?

We are building a Twitter sentiment analysis project using Natural Language Processing that classifies text as either positive or negative.

Pre-requisites

  • Working knowledge of scikit-learn, an open-source library that provides tools for implementing common models and algorithms.
  • Understanding of the NumPy, Pandas, and NLTK libraries for data processing and manipulation.
  • Knowledge of Seaborn and Matplotlib for data visualisation.

How Are We Going to Build This?

Let us go over the steps to build our project:

  1. First, we import the required libraries and create a dataframe.
  2. Once we have the dataset, we analyze it: null values, variables, columns, and data types.
  3. Now that we understand the dataset, we visualize it along with the target values.
  4. To work with textual data, we must remove unnecessary text, punctuation, and symbols. This is where stemming, lemmatization, and regex come into play.
  5. To train and evaluate a model, we split the data into training and testing sets.
  6. Next, we build feature vectors using the TF-IDF vectorizer.
  7. We can now train our model on the data and vectors.
  8. To see the results, we visualize them using Matplotlib.

Final Output

The result of this project is a model that classifies comments on Twitter as positive or negative.

Building the Twitter Sentiment Analysis

a. Prerequisites

b. Load the Dataset
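
The source does not name the dataset or its columns, so the following is only a minimal sketch under stated assumptions: a CSV with a `target` column (0 or 1) and a `text` column. A small in-memory sample (hypothetical data) stands in for the real tweet file so the example runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real tweet file. The column names
# `target` and `text` are assumptions, not taken from the source.
RAW_CSV = io.StringIO(
    "target,text\n"
    "1,I love this phone!\n"
    "0,Worst service ever...\n"
    "1,Such a great day :)\n"
    "0,I am so tired of delays\n"
)

# With a real file, the path would be passed instead of the StringIO object.
df = pd.read_csv(RAW_CSV)
print(df.head())
```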

c. Exploratory Data Analysis
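
The checks named in the build steps (null values, columns, data types) could look roughly like this. The DataFrame below is a hypothetical sample with the assumed `target` and `text` columns, including one missing tweet so the null count is visible.

```python
import pandas as pd

# Hypothetical sample; the last row has a missing text value on purpose.
df = pd.DataFrame(
    {
        "target": [1, 0, 1, 0, 1],
        "text": ["great day", "bad service", "love it", "so tired", None],
    }
)

print(df.shape)          # number of rows and columns
print(df.columns)        # column names
print(df.dtypes)         # data type of each column

null_counts = df.isnull().sum()              # missing values per column
class_counts = df["target"].value_counts()   # tweets per class

print(null_counts)
print(class_counts)
```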

d. Data Visualization of Target Variables
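
One way to plot the class balance is a simple bar chart. This sketch uses Matplotlib on a hypothetical sample and assumes that label 0 means negative and label 1 means positive; the source does not state the mapping.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical labels; 0 = negative and 1 = positive is an assumption.
df = pd.DataFrame({"target": [0, 1, 1, 0, 1]})

# Count tweets per class, ordered by label so the bars line up with the names.
counts = df["target"].value_counts().sort_index()

fig, ax = plt.subplots()
ax.bar(["negative (0)", "positive (1)"], counts.values)
ax.set_xlabel("Sentiment")
ax.set_ylabel("Number of tweets")
ax.set_title("Distribution of the target variable")
fig.savefig("target_distribution.png")
plt.close(fig)
```

A roughly even split between the two bars suggests that accuracy is a reasonable headline metric; a heavily skewed split would call for closer attention to per-class precision and recall.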

[Figure: distribution of the target variable]

e. Data Preprocessing
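
The regex-based cleaning mentioned in the build steps might be sketched as follows. The patterns, the tiny stop-word set, and the function name `clean_tweet` are illustrative assumptions rather than the article's original code.

```python
import re

# Illustrative subset only; a real run would use a fuller stop-word list.
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "of", "in", "it"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links
MENTION_RE = re.compile(r"@\w+")                # @user mentions
NON_ALPHA_RE = re.compile(r"[^a-z\s]")          # digits, punctuation, symbols


def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip URLs, mentions, symbols, and stop words."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_ALPHA_RE.sub(" ", text)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


print(clean_tweet("@bob The new phone is AMAZING!!! http://t.co/xyz #happy"))
# prints: new phone amazing happy
```

Stemming or lemmatization would then be applied to the remaining tokens, for example with NLTK's `PorterStemmer` or `WordNetLemmatizer`.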

[Figure: word cloud of positive tweets]

f. Splitting Our Data Into Train and Test Subset
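
A minimal sketch of the split using scikit-learn's `train_test_split`. The 80/20 ratio, the fixed seed, and the placeholder data are assumptions; the source does not state the split it used.

```python
from sklearn.model_selection import train_test_split

# Hypothetical placeholder data: ten tweets with alternating labels.
texts = [f"tweet number {i}" for i in range(10)]
labels = [i % 2 for i in range(10)]

# stratify keeps the class ratio the same in both subsets;
# random_state makes the shuffle reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    texts,
    labels,
    test_size=0.2,
    stratify=labels,
    random_state=42,
)

print(len(X_train), len(X_test))
```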

g. Transforming Dataset using TF-IDF Vectorizer
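
The key point when vectorizing is to fit the TF-IDF vocabulary on the training text only and then reuse it for the test text. The sketch below shows that pattern on hypothetical sentences; the `ngram_range` and `max_features` values are illustrative choices, not the article's settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical, already-cleaned tweets.
train_texts = [
    "good phone great battery",
    "bad service slow delivery",
    "great service good price",
]
test_texts = ["good delivery unknownword"]

# Unigrams and bigrams, capped at an illustrative vocabulary size.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), max_features=5000)

# Learn the vocabulary and IDF weights from the training set only.
X_train = vectorizer.fit_transform(train_texts)

# Reuse the fitted vocabulary; words unseen in training are ignored.
X_test = vectorizer.transform(test_texts)

print(X_train.shape, X_test.shape)
```

Fitting on the training set alone avoids leaking information from the test tweets into the features.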

h. Analyze Your Twitter Data Using Your Sentiment Analysis Model
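
The source does not say which classifier it trains, so this sketch assumes logistic regression as one reasonable choice for TF-IDF features. The training sentences are hypothetical, and the vectorizer and model are chained in a pipeline so raw text can be passed straight to `predict`.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data; 1 = positive, 0 = negative (assumed mapping).
train_texts = [
    "love this phone",
    "great day love it",
    "happy with the service",
    "hate the delays",
    "terrible service hate it",
    "awful and sad day",
]
train_labels = [1, 1, 1, 0, 0, 0]

# Logistic regression is an assumption here, not the article's stated model.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)

predictions = model.predict(["love love love", "hate hate hate"])
print(predictions)
```

Other linear models, such as a linear SVM or naive Bayes, can be dropped into the same pipeline for comparison.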

i. Visualize the Results of Your Twitter Sentiment Analysis
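
The classification report, confusion matrix, and ROC curve can all be computed with `sklearn.metrics`. The labels and scores below are hypothetical values chosen to keep the arithmetic easy to follow; in the project they would come from the trained model's predictions on the test set.

```python
from sklearn.metrics import auc, classification_report, confusion_matrix, roc_curve

# Hypothetical ground truth, hard predictions, and predicted probabilities.
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 1, 0, 1, 0]
y_score = [0.1, 0.6, 0.2, 0.9, 0.8, 0.4, 0.7, 0.3]

# Per-class precision, recall, F1-score, and support.
print(classification_report(y_true, y_pred))

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
print(cm)

# Points of the ROC curve and the area under it.
fpr, tpr, _ = roc_curve(y_true, y_score)
roc_auc = auc(fpr, tpr)
print(roc_auc)
```

The matrix `cm` can be drawn as a heatmap and the `fpr`/`tpr` arrays as a line plot to reproduce the kind of figures shown in this section.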

Classification report:

              precision    recall  f1-score   support

           0       0.78      0.76      0.77    119946
           1       0.77      0.79      0.78    120054

    accuracy                           0.77    240000
   macro avg       0.77      0.77      0.77    240000
weighted avg       0.77      0.77      0.77    240000

[Figure: confusion matrix]

Classification report:

              precision    recall  f1-score   support

           0       0.82      0.79      0.80    120494
           1       0.80      0.82      0.81    119506

    accuracy                           0.81    240000
   macro avg       0.81      0.81      0.81    240000
weighted avg       0.81      0.81      0.81    240000

[Figure: confusion matrix]

[Figure: ROC curve]

Why Is Sentiment Analysis Important?

Sentiment analysis is critical for understanding consumer sentiment and making data-driven decisions. By analyzing text data from various sources, such as social media, reviews, and surveys, businesses can gain valuable insights into public perception of their products, brand, and industry. This information can be used to improve marketing strategies, customer service, and overall business performance.

Sentiment analysis can also be used to identify potential issues and opportunities, and to track the effectiveness of campaigns over time.

Why Use Twitter for Sentiment Analysis?

Twitter is a valuable tool for sentiment analysis due to its vast amount of real-time, user-generated content. With over 330 million monthly active users, it offers a diverse sample of opinions and emotions. Analyzing tweets and their associated metadata allows us to gain valuable insights into public perceptions and make data-driven decisions. It's a powerful tool for businesses, organizations, and researchers to understand public sentiment.

What's Next

A few things that could be implemented next:

  • Building an interface using Flask
  • Changing the dataset
  • Analysing other emotions
  • Including a neutral emotion
  • Using hashtags and fetching relevant data, and then sorting it

Conclusion

  • Sentiment analysis is one of the most common applications of natural language processing.
  • Twitter can be considered a source of sizeable textual data for analysis.
  • Understanding consumer sentiments helps us understand a business better, quicker, and more efficiently.