Examining Sentiments to the Nigerian Twitter Ban

A couple of days ago I asked my LinkedIn network for their opinion on the Nigerian Government’s sudden and swift decision to ban Twitter. This was their response:

Clearly the majority of people are against the ban, and you may count me as one of them!

I was curious to see what the media was saying concerning the ban, so I did some investigating!

The Data

News articles concerning the Nigerian Twitter Ban ranging from the 4th of June to the 12th of June 2021.

The Tools

Python programming language to retrieve news articles and score them for sentiment.

PyCharm IDE to develop my programming scripts.

Power BI to visualize the data in a nifty news dashboard.

KNIME to do some data prepping, cleaning, and transformation.

The Method

I queried the GoogleNews API, and used the Newspaper API to retrieve the news article data. Here is the code I used:

from GoogleNews import GoogleNews
from newspaper import Article
from newspaper import Config
import pandas as pd
import nltk
from datetime import date
from datetime import timedelta

# A browser-like user agent helps avoid the 403 client errors that some
# sites return when newspaper tries to download an article.
# Run nltk.download('punkt') once if article.nlp() reports a missing tokenizer.

user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'
config = Config()
config.browser_user_agent = user_agent

# Today's date is the end of the search window
End = date.today()
print("Today is: ", End)

# The search window starts 11 days before today
Start = End - timedelta(days=11)
print("11 days ago was: ", Start)

# Convert both dates to the MM/DD/YYYY format GoogleNews accepts
Start = Start.strftime('%m/%d/%Y')
End = End.strftime('%m/%d/%Y')
print("Search window ends: ", End)
print("Search window starts: ", Start)

googlenews = GoogleNews(start=Start, end=End)
googlenews.search('Nigeria Twitter Ban')
result = googlenews.result()
df = pd.DataFrame(result)

# Fetch further result pages (2 to 19) and rebuild the DataFrame
for i in range(2, 20):
    googlenews.get_page(i)
    result = googlenews.result()
    df = pd.DataFrame(result)

rows = []

for ind in df.index:
    try:
        row = {}
        article = Article(df['link'][ind], config=config)
        article.download()
        article.parse()
        article.nlp()  # populates article.summary
        row['Date'] = df['date'][ind]
        row['Media'] = df['media'][ind]
        row['Title'] = article.title
        row['Article'] = article.text
        row['Summary'] = article.summary
        row['url'] = article.url
        rows.append(row)
    except Exception:
        # Keep going when an article cannot be downloaded or parsed
        print("Skipping this article")

news_df = pd.DataFrame(rows)
news_df.to_excel(r'*file path goes here*', index=False)

I adapted this code from here!

My edits were to make the date range dynamic and relative to the current date, to add logic that continues code execution when an article cannot be accessed, and of course to make the search term relevant to my use case.
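The first two edits can be sketched on their own. This is a minimal illustration, not the script itself: `date_window` and `collect_rows` are hypothetical helper names, and `fetch` stands in for whatever downloads and parses a single article (for example a small wrapper around `newspaper.Article`).

```python
from datetime import date, timedelta


def date_window(days_back, today=None):
    """Return (start, end) as MM/DD/YYYY strings, start being the earlier date."""
    today = today or date.today()
    start = today - timedelta(days=days_back)
    return start.strftime('%m/%d/%Y'), today.strftime('%m/%d/%Y')


def collect_rows(links, fetch):
    """Call fetch(link) for every link and skip any link that raises."""
    rows, skipped = [], []
    for link in links:
        try:
            rows.append(fetch(link))
        except Exception:
            # Same idea as the "Skipping this article" branch: note it and move on.
            skipped.append(link)
    return rows, skipped
```

Keeping the list of skipped links, rather than only printing a message, makes it easy to check afterwards how many articles were lost to access errors.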

I then sent the data over to KNIME for some data cleaning in preparation for sentiment analysis. Here is that workflow:

Primarily I filtered out blank data, removed numbers, punctuation and stop words from the article body. I then sent the data back to Excel, in a shape that was ready for sentiment analysis!
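For anyone without KNIME, the same cleaning steps could be approximated in plain Python. This is only a rough sketch of the idea, not my KNIME workflow: the function names are made up for illustration, and the stop-word list is a tiny stand-in for a proper one.

```python
import re
import string

# Tiny illustrative stop-word list (an assumption); a real run needs a fuller one.
STOP_WORDS = {'the', 'a', 'an', 'and', 'of', 'to', 'in', 'on', 'is', 'it', 'for'}

# Map every ASCII punctuation character to a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, ' ' * len(string.punctuation))


def clean_article(text):
    """Drop digits, ASCII punctuation and stop words from one article body."""
    text = re.sub(r'\d+', ' ', text)
    text = text.translate(PUNCT_TO_SPACE)
    words = [w for w in text.split() if w.lower() not in STOP_WORDS]
    return ' '.join(words)


def clean_articles(articles):
    """Clean every article and filter out the ones that are (or end up) blank."""
    cleaned = (clean_article(a) for a in articles if a and a.strip())
    return [c for c in cleaned if c]
```

Note that `string.punctuation` only covers ASCII characters, so curly quotes and similar symbols would need extra handling.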

I recently discovered that I could have run my Python scripts directly in KNIME! This would have been much better! But…since I didn’t know this last Saturday, I imported my data into another Python script for sentiment analysis.

I first used the TextBlob Python library, but it made too many classification errors so I explored other options. I eventually settled on Afinn, which performed much better. Here is that code:

import pandas as pd
from afinn import Afinn

afinn = Afinn(language='en')

df = pd.read_excel(r'*your file path goes here*')

df['afinn_score'] = df['Article_Sentiment'].apply(afinn.score)


def word_count(text_string):
    '''Calculate the number of words in a string'''
    return len(text_string.split())

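df['word_count'] = df['Article_Sentiment'].apply(word_count)

# Normalize the raw score to sentiment per 100 words so long and short articles are comparable
df['afinn_adjusted'] = df['afinn_score'] / df['word_count'] * 100

df.to_excel(
    r'*your file path goes here*',
    index=False)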

I adapted this code from this article, which is a must read for anyone remotely interested in sentiment analysis!
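To see why the raw score is divided by the word count, here is a small self-contained sketch. It does not use the Afinn library: `TOY_LEXICON` is a hypothetical mini word list with made-up valence values, just enough to show the arithmetic behind the adjusted score.

```python
# Hypothetical mini-lexicon (illustrative values only, not the real AFINN list).
TOY_LEXICON = {'ban': -2, 'outrage': -3, 'support': 2, 'welcome': 2}


def raw_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every known word in the text."""
    return sum(lexicon.get(word, 0) for word in text.lower().split())


def adjusted_score(text, lexicon=TOY_LEXICON):
    """Sentiment per 100 words; returns 0.0 for empty text to avoid dividing by zero."""
    n_words = len(text.split())
    if n_words == 0:
        return 0.0
    return raw_score(text, lexicon) / n_words * 100


short_text = "outrage over ban"
long_text = "outrage over ban " + "filler " * 17  # same raw score, 20 words

print(raw_score(short_text), raw_score(long_text))
print(adjusted_score(short_text), adjusted_score(long_text))
```

Both texts have the same raw score, but the adjusted score of the longer one is much closer to zero, since its negative words make up a smaller share of the article.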

Okay…so then I brought the data back to KNIME for some housekeeping 😁 Once I was satisfied, I presented the articles on this Power BI dashboard.

The dashboard allows you to filter the articles by recency, media, sentiment, and also text search!

Similar to my LinkedIn poll, media reaction to the ban is also negative with 79% of the articles tending towards a negative sentiment.

That being said, sentiment analysis libraries are not always 100% accurate, but I agreed with the vast majority of Afinn’s classifications.
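A percentage like the one above could be computed from the adjusted scores along these lines. This is a sketch under my own assumptions: the `label` and `sentiment_shares` names are hypothetical, the cut-off of zero is just one possible choice, and the scores in the usage note are made up rather than taken from my data.

```python
from collections import Counter


def label(score, threshold=0.0):
    """Label a score as positive, negative or neutral around an assumed threshold."""
    if score > threshold:
        return 'positive'
    if score < -threshold:
        return 'negative'
    return 'neutral'


def sentiment_shares(scores, threshold=0.0):
    """Percentage of scores under each label, rounded to one decimal place."""
    total = len(scores)
    if total == 0:
        return {}
    counts = Counter(label(s, threshold) for s in scores)
    return {name: round(100 * n / total, 1) for name, n in counts.items()}
```

For example, `sentiment_shares([-3.0, -1.5, 0.0, 2.0, -0.5])` puts three of the five made-up scores in the negative group. Raising `threshold` widens the neutral band, which is worth trying if many articles score very close to zero.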

Final Remarks

I would love to share the dashboard on here, but I have to refrain from doing so due to potential copyright liabilities. But if you want to do a similar project, feel free to reach out to me concerning developing the dashboard. It is extremely easy, trust me. 😉

Media listening is crucial for companies and countries to maintain their brand image, and sentiment analysis plays a huge role in gaining insight into that image. The resources provided in this post should be useful to anyone who needs to perform a similar task out of necessity or curiosity 🤓

If you have any further questions about this project, or my data, feel free to get in touch.

That’s all for now guys!

See you on the blog!

-Tosin Adekanye-


Published by Tosinlitics

Hello! I'm Tosin and I love analyzing stuff and using data science as a crystal ball. Follow me to see my cool dashboards, data science, and analytics projects.
