Michelle Michalowski, a Master in Business Analytics & Big Data graduate, put theory into practice with her teammates to create an application that gives financial trading advice based on sentiment analysis of a single Reddit forum.

Born and raised in western Germany but with Romanian roots, Michelle Michalowski studied economics with a major in statistics at the University of Mannheim before coming to Madrid to study the Master in Business Analytics & Big Data. Apart from being interested in tech and data, she is also a fitness junkie and a travel lover. During her program, she and her classmates, with the support of professor Angel Castellanos Gonzalez, worked together to build an app to offer trading advice. She shared her story with us.  

GameStop, a retail chain that sells video games, has struggled for years as the gaming industry shifted online and brick-and-mortar game stores lost business. GameStop’s stock fell steadily, especially after the start of the COVID-19 pandemic.

But in early 2021, something unexpected happened. The stock price exploded while hedge funds were betting against it, and the surge was driven by a single subreddit (a Reddit forum) called Wallstreetbets. Private investors and Reddit users alike poured money into GameStop shares, sending the price up dramatically.

How to extract financial trading advice using sentiment analysis… on a subreddit

This is just one example of how the sentiment of a single community can drive a stock’s performance. Inspired by this event and similar stories, as well as by the work we were doing in the Master in Business Analytics & Big Data program, I had the idea to start a project with some of my classmates from our Natural Language Processing (NLP) course. Using our newly gained skills, we spent three weeks building a tool that analyzes the sentiment around different stocks and suggests a concrete trading strategy.

Sentiment analysis: how it works

So how does it work? First, we ingested data from Wallstreetbets using a Reddit Application Programming Interface (API). Next, we extracted the so-called “cashtags” from the text: ticker symbols prefixed with a dollar sign (for example, $GME for GameStop), which show which stock a comment refers to. After some basic data cleaning, we were ready to analyze the sentiment toward each stock discussed on the forum.
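To make the cashtag step concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the project's actual code: the pattern assumes tickers are one to five uppercase letters, and the sample comments and the `extract_cashtags` helper are invented for this example.

```python
import re
from collections import Counter
from typing import List

# A cashtag is a dollar sign followed by a ticker symbol, e.g. "$GME".
# Assumption: tickers are 1-5 uppercase letters. Real symbols can also
# contain dots or digits, which this simple pattern ignores.
CASHTAG_RE = re.compile(r"\$([A-Z]{1,5})\b")


def extract_cashtags(text: str) -> List[str]:
    """Return the ticker symbols mentioned as cashtags, in order of appearance."""
    return CASHTAG_RE.findall(text)


# Invented sample comments, for illustration only.
comments = [
    "$GME to the moon, holding with diamond hands",
    "Sold my $AMC, buying more $GME",
    "I spent $20 on lunch today",  # a dollar amount, not a cashtag
]

# Count how often each stock is mentioned across all comments.
mentions = Counter(tag for comment in comments for tag in extract_cashtags(comment))
print(mentions.most_common())
```

Requiring letters after the dollar sign keeps plain dollar amounts such as "$20" from being mistaken for tickers.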

For this step, we used two different approaches we had been working on during our program: one based on deep learning and one based on a lexicon. The latter is the oldest and simplest method. It relies on a lexicon that assigns each word a polarity score, representing whether it has a positive, neutral or negative connotation. For the lexicon approach, we used an open-source, rule-based sentiment analysis tool specifically attuned to sentiments expressed on social media.

Of course, the language used on r/Wallstreetbets doesn’t correspond to “proper” financial language—it’s mostly slang. To account for that, we enhanced the default lexicon by incorporating a list of 300 slang words and emojis, which significantly increased our performance. 
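The lexicon idea, and the effect of adding slang, can be sketched in a few lines of Python. This is a toy illustration, not the tool or the word list the team used: every word, emoji and score below is a hypothetical value chosen for the example, and the `score` and `label` helpers are made up for it.

```python
import re

# Toy base lexicon: word -> polarity score (hypothetical values).
BASE_LEXICON = {"good": 1.5, "great": 2.5, "bad": -1.5, "loss": -2.0}

# Hypothetical slang and emoji entries of the kind a trading forum might need.
SLANG_LEXICON = {"moon": 3.0, "tendies": 2.0, "bagholder": -2.5, "🚀": 3.0, "💎": 2.0}

# The enhanced lexicon is the base lexicon plus the slang entries.
ENHANCED_LEXICON = {**BASE_LEXICON, **SLANG_LEXICON}

# Tokens are runs of lowercase letters, or single symbols such as emojis.
TOKEN_RE = re.compile(r"[a-z]+|[^\w\s]")


def score(text, lexicon):
    """Sum the polarity scores of all tokens; unknown tokens count as 0."""
    tokens = TOKEN_RE.findall(text.lower())
    return sum(lexicon.get(token, 0.0) for token in tokens)


def label(value, threshold=0.5):
    """Map a numeric score to a positive, neutral or negative label."""
    if value > threshold:
        return "positive"
    if value < -threshold:
        return "negative"
    return "neutral"


comment = "GME to the moon 🚀🚀"
print(label(score(comment, BASE_LEXICON)))      # neutral: no known words
print(label(score(comment, ENHANCED_LEXICON)))  # positive: slang is recognized
```

With only the base lexicon the comment looks neutral because none of its words are known. Once the slang entries are merged in, the same comment is scored as clearly positive.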

After running our sentiment analysis with both approaches, we had a data set of around 40,000 entries, each recording the sentiment expressed toward a stock. Next, we wanted to validate how well our approach worked. We did so by backtesting a simple trading strategy: buy a stock if the number of positive sentiments increases compared to the previous day, and sell it if that number decreases.
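The buy and sell rule can be sketched as a small simulation. This is a simplified illustration under assumptions the article does not state: a single stock, trades executed at that day's price, the whole balance invested or sold at once, and no fees. The `backtest` function and all numbers are invented for the example and are not the project's results.

```python
def backtest(positive_counts, prices, cash=1000.0):
    """Simulate the rule on one stock and return the final portfolio value.

    positive_counts[i] is the number of positive sentiments on day i and
    prices[i] is the stock price on day i (both hypothetical inputs).
    """
    if len(positive_counts) != len(prices):
        raise ValueError("positive_counts and prices must have the same length")
    if not prices:
        return cash

    shares = 0.0
    for day in range(1, len(prices)):
        change = positive_counts[day] - positive_counts[day - 1]
        if change > 0 and cash > 0:
            # Positive sentiment rose: invest all available cash.
            shares = cash / prices[day]
            cash = 0.0
        elif change < 0 and shares > 0:
            # Positive sentiment fell: sell the whole position.
            cash = shares * prices[day]
            shares = 0.0
        # Unchanged sentiment: keep the current position.

    # Value any shares still held at the last known price.
    return cash + shares * prices[-1]


# Invented data, for illustration only.
positive_counts = [10, 15, 12, 12, 20, 8]
prices = [10.0, 10.0, 12.5, 12.0, 10.0, 20.0]
print(backtest(positive_counts, prices))  # 2500.0
```

In this made-up series the rule buys on day 1, sells on day 2, buys again on day 4 and sells on day 5. A fuller backtest would also account for transaction costs and compare the result against simply holding the stock.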

This is what we got:

[Figures: results of the backtesting strategy (images not included)]

The motivation behind our app was to help people invest more efficiently. The project served as an excellent learning experience and a way for our group to put theory into practice. The Master in Business Analytics & Big Data gave us the tools, skills and confidence necessary to bring our idea to life, and we felt very supported by both the university and our NLP professor every step of the way.