During the COVID-19 pandemic, people have taken their worries, concerns, frustrations, and loves to social media to share their feelings and thoughts with the rest of the world. Twitter has become one of the official channels through which world leaders communicate with their supporters and followers. To understand what keeps them busy, we extract the tweets of two world leaders, Donald Trump (the President of the United States) and Justin Trudeau (the Prime Minister of Canada). By applying natural language processing techniques and the Latent Dirichlet Allocation (LDA) algorithm, we can learn the topics of their tweets and see what is on their minds during the crisis.
We use Python 3.6 and the following packages and techniques:
- TwitterScraper, a Python script to scrape for tweets
- NLTK (Natural Language Toolkit), an NLP package for text processing, e.g. stop words, punctuation, tokenization, lemmatization, etc.
- Gensim, “generate similar”, a popular NLP package for topic modeling
- Latent Dirichlet Allocation (LDA), a generative, probabilistic model for topic clustering/modeling
- pyLDAvis, an interactive LDA visualization package, designed to help interpret topics in a topic model that is trained on a corpus of text data
We use TwitterScraper to scrape tweets from the Twitter handles @realDonaldTrump and @JustinTrudeau. Only original tweets posted from March 1 to April 27, 2020 are collected; retweets of others are excluded. Only English-language tweets are kept.
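The selection rule above (original tweets only, within the date window) can be sketched as a simple filter. This is a minimal illustration, not the scraping code used for this post: the record fields `screen_name`, `timestamp`, and `is_retweet` are hypothetical names, and the sample records are made up.

```python
from datetime import date, datetime

# Date window described in the text (inclusive on both ends).
START, END = date(2020, 3, 1), date(2020, 4, 27)


def keep_tweet(tweet, handle):
    """Return True for an original tweet by `handle` inside the date window.

    `tweet` is a dict with hypothetical keys: screen_name, timestamp, is_retweet.
    """
    posted = tweet["timestamp"].date()
    return (
        tweet["screen_name"].lower() == handle.lower()
        and not tweet["is_retweet"]
        and START <= posted <= END
    )


# Made-up records, for illustration only.
scraped = [
    {"screen_name": "JustinTrudeau", "timestamp": datetime(2020, 3, 15, 15, 5),
     "is_retweet": False, "text": "sample tweet a"},
    {"screen_name": "JustinTrudeau", "timestamp": datetime(2020, 3, 16, 9, 0),
     "is_retweet": True, "text": "sample tweet b"},   # retweet: dropped
    {"screen_name": "JustinTrudeau", "timestamp": datetime(2020, 5, 1, 12, 0),
     "is_retweet": False, "text": "sample tweet c"},  # outside window: dropped
]

kept = [t for t in scraped if keep_tweet(t, "JustinTrudeau")]
print([t["text"] for t in kept])
```

The language filter is left out here because it depends on how the scraper reports a tweet's language.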
Number of tweets by Week Day and Hour
It seems Trump likes to tweet from 1 to 4 pm, while Trudeau likes to tweet around 3 pm.


Both Trump and Trudeau tweet regularly during the week. It seems Trump likes to tweet even more on Sundays!
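Counts like these can be produced with a pair of counters over the tweet timestamps. The sketch below assumes each tweet comes with a Python `datetime`; the helper name `count_by_weekday_and_hour` and the sample timestamps are made up for illustration, and the hours you get depend on the time zone of the timestamps.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def count_by_weekday_and_hour(timestamps):
    """Return (tweets per weekday name, tweets per hour of day)."""
    by_weekday = Counter(WEEKDAYS[ts.weekday()] for ts in timestamps)
    by_hour = Counter(ts.hour for ts in timestamps)
    return by_weekday, by_hour


# Made-up timestamps, for illustration only.
sample = [
    datetime(2020, 3, 1, 13, 5),   # a Sunday
    datetime(2020, 3, 1, 15, 30),  # the same Sunday
    datetime(2020, 3, 2, 15, 0),   # a Monday
]

by_weekday, by_hour = count_by_weekday_and_hour(sample)
print(by_weekday)
print(by_hour)
```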
Tweet Length
From March 1 to April 27, 2020, Trump posted 673 tweets, averaging 27 words per tweet, and Trudeau posted 386 tweets, averaging 41 words per tweet. Trump had many short tweets (fewer than 10 words) and some lengthy tweets (over 40 words). Most of Trudeau’s tweets had 40 to 50 words.
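Length statistics of this kind can be computed by splitting each tweet on whitespace. This is a minimal sketch: counting words by whitespace is an assumption about how the lengths were measured, and the function names and sample texts are made up.

```python
def word_count(text):
    """Number of whitespace-separated words in a tweet."""
    return len(text.split())


def length_summary(texts):
    """Tweet count, mean words per tweet, and counts of short/long tweets."""
    counts = [word_count(t) for t in texts]
    return {
        "tweets": len(counts),
        "mean_words": sum(counts) / len(counts) if counts else 0.0,
        "short": sum(c < 10 for c in counts),  # fewer than 10 words
        "long": sum(c > 40 for c in counts),   # over 40 words
    }


# Made-up texts, for illustration only.
summary = length_summary(["one two three", "a b c d e"])
print(summary)
```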
Data Pre-processing
Text pre-processing converts human-language text into a machine-readable format for further processing. The following pre-processing steps are applied to our Twitter texts.
- Convert all words to lowercase
- Remove non-alphabet characters
- Remove short words (fewer than 3 characters)
- Tokenization: breaking sentences into words
- Part-of-speech (POS) tagging: process of classifying words into their grammatical category, in order to understand their roles in a sentence, e.g. verbs, nouns, adjectives, etc. POS tagging provides grammar context for lemmatization.
- Lemmatization: converting a word to its base form, e.g. car, cars, car’s to car
- Remove common English words, e.g. a, the, of, etc., and remove common words that add very little value to our analysis, e.g. com, twitter, pic, etc.
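The steps above can be sketched as a small pipeline. To keep the example self-contained, it uses only the standard library: the stop-word set is a tiny illustrative sample, POS tagging is omitted, and `naive_lemmatize` is a crude toy stand-in (it only strips a trailing "s"). In practice you would pass in a real lemmatizer such as NLTK's `WordNetLemmatizer().lemmatize`, along with NLTK's stop-word list.

```python
import re

# Tiny illustrative stop-word set; a real run would use a full list.
STOP_WORDS = {"the", "and", "for", "are", "com", "twitter", "pic"}


def naive_lemmatize(word):
    """Toy stand-in for a real lemmatizer: strips a plural 's' only."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def preprocess(text, lemmatize=naive_lemmatize, stop_words=STOP_WORDS):
    """Apply the listed steps (minus POS tagging) and return clean tokens."""
    text = text.lower()                          # lowercase
    text = re.sub(r"[^a-z\s]", " ", text)        # drop non-alphabet characters
    tokens = text.split()                        # tokenize on whitespace
    tokens = [t for t in tokens if len(t) >= 3]  # drop short words
    tokens = [lemmatize(t) for t in tokens]      # reduce to base form
    return [t for t in tokens if t not in stop_words]


# Made-up sentence, for illustration only.
cleaned = preprocess("The cars and the car's owner are at pic.twitter.com/abc")
print(cleaned)
```

The lemmatizer is a parameter so the toy version can be swapped for a POS-aware one without touching the rest of the pipeline.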
We extract both unigrams and bigrams (pairs of consecutive words) from the texts. After pre-processing, our tweets look like this:
Word Count and Word Cloud
We use bigrams for our word count and word cloud, as bigrams provide more meaningful insights than single words.
The top 5 most common bigrams in Trump’s tweets are: fake news, white house, united state, news conference, mini mike.
The top 5 most common bigrams in Trudeau’s tweets are: make sure, across country, keep safe, canada emergency, health care.
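A bigram count like this can be sketched with `collections.Counter`. The example assumes each tweet has already been reduced to a list of cleaned tokens; the helper names `bigrams` and `top_bigrams` and the sample token lists are made up for illustration.

```python
from collections import Counter


def bigrams(tokens):
    """Pairs of consecutive tokens, joined with a space."""
    return [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]


def top_bigrams(token_lists, n=5):
    """Most common bigrams across tweets; bigrams never span two tweets."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(bigrams(tokens))
    return counts.most_common(n)


# Made-up token lists, for illustration only.
sample_tweets = [
    ["keep", "safe", "health", "care"],
    ["health", "care", "worker"],
    ["keep", "safe"],
]

top = top_bigrams(sample_tweets, n=2)
print(top)
```

The resulting counts can also serve as the frequencies behind a word cloud, where each bigram is drawn larger the more often it occurs.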
Here is the word cloud of Trump’s tweets:
Here is the word cloud of Trudeau’s tweets:
In the next post, we will show how to generate meaningful topics from the tweets by applying the LDA algorithm.

Happy Machine Learning!
