I decided it might be interesting to analyze the tweets of the main political players in the run-up to Super Tuesday. The methodology was simple: follow their official Twitter accounts and then analyze the tweets for emotional word content. In total, 1,029 tweets were collected from the official Twitter accounts of Bernie Sanders, Ted Cruz, Hillary Clinton and Donald Trump. The collection started on February 23rd and continued through to midnight on February 29th, the day before Super Tuesday. A breakdown of the volume of tweets per politician is shown on the right.
Obviously, there were a huge number of other tweets relating to the individual politicians, but I wanted to see what their core message was in terms of their use of emotion-related words. It’s of note that Cruz, the youngest of all the candidates analyzed, sent out the most tweets, followed by Sanders. Cruz was also the only one to use no capital letters in his screen name, “tedcruz”.
Words were categorized using 8 types of emotion, which could overlap, so a single word can count toward more than one type. The types were fear, anger, sadness, disgust, anticipation, surprise, joy and trust. This approach treats text as a “bag of words”: no attempt is made to parse the text for grammatical constructions.
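To make the idea concrete, here is a minimal sketch of bag-of-words emotion tagging in Python. The tiny lexicon, the tokenizer and the function names are all made up for illustration; they are not the word lists or code used for the analysis in this article.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (an assumption, not the one used in the article).
# Each word maps to a set of emotion types, so the types can overlap.
TOY_LEXICON = {
    "win": {"joy", "anticipation", "trust"},
    "fight": {"anger", "fear"},
    "together": {"trust", "joy"},
    "disaster": {"fear", "sadness", "surprise"},
}


def tokenize(text):
    """Lowercase the text and split it into words, ignoring grammar entirely."""
    return re.findall(r"[a-z']+", text.lower())


def emotion_counts(text, lexicon=TOY_LEXICON):
    """Count how many word occurrences fall into each emotion type."""
    counts = Counter()
    for word in tokenize(text):
        # A word in several types adds one to each of them.
        for emotion in lexicon.get(word, ()):
            counts[emotion] += 1
    return counts


counts = emotion_counts("Together we will fight and win!")
print(counts)
```

Because word order is thrown away, “we will win the fight” and “we will fight to win” would produce identical counts. That is the trade-off a bag-of-words approach accepts.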
As an example, I searched for the word “food” in a collection of 1.5 million random tweets. The search revealed 7,319 tweets containing the word “food”. The analysis of these tweets for emotional content is shown below:
This ring is like a pie chart, except that its thickness is in proportion to the number of words that could be classed as emotional out of all the words in the tweets. The percentage in the center is the value for this, so 33% shows that 33% of the words in the tweets containing “food” could be put into one of our emotional types. This is a measure of how emotionally expressive the tweets were. The circle on the lower right shows how the words classed as emotional split between positive and negative. Blue means positive, red means negative. By way of contrast, a search for the word “death” shows a very different result:
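The two numbers on the chart can be sketched in a few lines of Python. This is one plausible reading of the description above, not the actual calculation behind the charts: the word lists are invented, and positivity is assumed to be the positive share of all words tagged positive or negative.

```python
import re

# Hypothetical word lists, invented for this example only.
EMOTION_WORDS = {"love", "delicious", "hungry", "awful", "sick"}
POSITIVE = {"love", "delicious"}
NEGATIVE = {"awful", "sick"}


def summarize(tweets):
    """Return (expressiveness %, positivity %) for a list of tweets."""
    words = [w for t in tweets for w in re.findall(r"[a-z']+", t.lower())]
    emotional = sum(w in EMOTION_WORDS for w in words)
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    # Ring thickness / center number: emotional words out of all words.
    expressiveness = 100 * emotional / len(words) if words else 0.0
    # Small circle: positive share of the positive-or-negative words (assumed).
    positivity = 100 * pos / (pos + neg) if pos + neg else 0.0
    return expressiveness, positivity


tweets = ["I love this delicious food", "The food was awful today"]
expressiveness, positivity = summarize(tweets)
print(f"{expressiveness:.0f}% emotional, {positivity:.0f}% positive")
```

The guards against empty input matter in practice: a search term that matches no tweets, or tweets with no positive or negative words, would otherwise cause a division by zero.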
For our candidates, we see some subtle differences. Trump and Cruz share the same level of “positivity”, with a score of 59%, but it is worth remembering that nothing is exact in any analysis of language.
Cruz uses fewer words classed as emotional in total, despite sending more tweets overall.
Turning to Clinton and Sanders we see this:
Here we see a different pattern. Levels of positivity are slightly above those of Cruz and Trump, at 64% for both candidates, with Clinton, like Cruz, using fewer emotional words overall.
The top three emotional categories for Clinton and Sanders are trust, anticipation and joy. For Cruz, the top two are trust and anticipation. In contrast, Trump’s top two categories are trust and joy. Trump and Cruz also differ in their third-highest category: for Trump it is sadness, for Cruz fear.
Bag-of-words approaches to text analysis are well established in the content analysis of huge text collections. It’s interesting to see that they might have some application to smaller problems. The key point to remember is that this is one way of looking at text; there are many more. None of them can be said to be correct. It all depends on how useful the results are.
Andrew Jeavons is Founder and CEO of Mass Cognition, a company that specializes in helping clients understand the deeper meanings in text and social media data. He was previously the CEO of Survey Analytics, a major survey software vendor serving Fortune 500 customers and the international community. He was one of the founders of e-tabs and has a (too) long history in the market research technology industry. He is a well-known, award-winning speaker and blogger.