By Erez Katz, CEO and Co-Founder of Lucena Research
In June 2020, Lucena partnered with Benzinga to evaluate whether an AI approach to news feed sentiment is predictive for active investment. It is widely agreed that news media is the most efficient way to convey information to the masses. Naturally, financial news is the most commonly used source of information for high latency investors. In recent years, however, with the rapid advancement of NLP (natural language processing) technology, many investors feel left behind. Large and sophisticated hedge funds have invested many millions in the ability to ingest news in bulk and respond to it instantaneously. For most investors, however, by the time stock-moving news reaches their desktop, the price of the underlying stock has already changed. (See Eugene Fama’s efficient market hypothesis.)
Image 1: A wide array of datasets powering through Lucena’s DAS (Data Analysis Services) platform.
Our approach to extracting ticker sentiment from news articles
There are quite a few prepackaged Python libraries for NLP, and most work almost magically out of the box. However, applying NLP to investment remains a challenge, since most NLP libraries are designed to support language translation or content creation and not necessarily to extract sentiment from free text. The good news is that, in contrast to most financial data, which is typically scarce, news content is abundant. More specifically, a growing lexicon of financial terms has already been created and made available through transfer learning. Transfer learning is the process of taking pretrained models as a baseline to train new models geared towards solving different problems.
Our overarching approach was to generate an aggregated score by ticker per day. In addition, we incorporated a special algorithm that carries sentiment forward over time. In other words, a news article can still hold value for days to come. While our approach sounded pretty straightforward, we still faced a few interesting challenges, and our ability to find solutions was instrumental in achieving the results presented below.
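As a minimal sketch of what "an aggregated score by ticker per day" could look like, the snippet below averages article-level scores into one value per (day, ticker) pair. The sample records, the function name `aggregate_daily`, and the choice of a simple mean are assumptions made for illustration; the article does not specify the aggregation function actually used.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical article-level records: (publication date, ticker, sentiment in [-1, 1]).
articles = [
    (date(2020, 6, 1), "AAPL", 0.6),
    (date(2020, 6, 1), "AAPL", 0.2),
    (date(2020, 6, 1), "MSFT", -0.4),
    (date(2020, 6, 2), "AAPL", -0.1),
]

def aggregate_daily(records):
    """Collapse article scores into one score per (day, ticker).

    A plain mean is assumed here; a sum or a volume-weighted
    average would be equally plausible choices.
    """
    buckets = defaultdict(list)
    for day, ticker, score in records:
        buckets[(day, ticker)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}

daily = aggregate_daily(articles)
for (day, ticker), score in sorted(daily.items()):
    print(day, ticker, round(score, 3))
```

Keying the result by (day, ticker) keeps one row per stock per day, which is the shape a daily rebalanced model would typically consume.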
Challenge # 1
How to distinguish between multiple tickers covered in a single article?
Many news articles cover multiple tickers. It is often very difficult to assign individual sentiment scores to each ticker in an article.
Image 2: How to distinguish sentiment between multiple tickers in a single news article.
Our solution was twofold:
- Incorporate contextual reference to the nearest ticker and compare its sentiment score to that of the entire article. If there is a distinct difference, we want to assign a sentiment per ticker.
- In most cases, such distinction doesn’t exist, so we simply divide the total score evenly between the tickers.
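The two-part rule above can be sketched roughly as follows. The function name `attribute_scores`, the `min_gap` cutoff that defines a "distinct difference", and the sample scores are all hypothetical; the per-ticker "local" scores are assumed to come from scoring the text nearest each ticker mention, which is not shown here.

```python
def attribute_scores(article_score, local_scores, min_gap=0.3):
    """Assign a sentiment score to each ticker mentioned in one article.

    article_score: sentiment of the whole article, in [-1, 1].
    local_scores:  dict of ticker -> sentiment of the text near that ticker.
    min_gap:       assumed cutoff for a "distinct difference".
    """
    if not local_scores:
        return {}
    # Does any ticker's local context diverge clearly from the article as a whole?
    distinct = any(
        abs(score - article_score) >= min_gap for score in local_scores.values()
    )
    if distinct:
        # Use the context-specific score for each ticker.
        return dict(local_scores)
    # Otherwise divide the article's total score evenly between the tickers.
    share = article_score / len(local_scores)
    return {ticker: share for ticker in local_scores}

# No meaningful divergence: the article score is split evenly.
print(attribute_scores(0.6, {"AAPL": 0.65, "MSFT": 0.55}))
# Clear divergence: each ticker keeps its own contextual score.
print(attribute_scores(0.1, {"AAPL": 0.8, "MSFT": -0.6}))
```

Switching all tickers to local scores as soon as one diverges is one possible reading of the rule; a per-ticker decision would be another.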
Challenge # 2
How to carry forward a news sentiment past the date of publication?
A meaningfully surprising news story, such as a company filing for Chapter 11 bankruptcy, is analogous to throwing a rock into a pond: the point of impact generates concentric circles, gradually diminishing as they expand. We wanted to find a way to measure the news impact after the initial shock. I am certain that there are plenty of human psychology studies geared to measure the expected carry of shocking news. One way to study it is to examine active investment blogs, such as StockTwits, and simply train a model based on the number of negative or positive twits over time that reference the original news. For maximum efficiency and time to market, we took a simpler yet still very useful approach.
Cumulative linear decay is a mathematical approach to measuring how much of an article's sentiment carries over time. This approach not only takes the linearly decaying value of a given sentiment, but also looks at additional articles on the same subject and accumulates their scores as a measure of accelerated or decelerated decay.
Image 3: Cumulative linear decay, applying the average/max linear decay method. The calculation includes a forward fill plus a days-since method.
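To make the idea concrete, here is a minimal sketch of a cumulative linear decay under stated assumptions: each article's score loses an equal fraction of its value every day until it reaches zero after `horizon` days, and follow-up articles on the same ticker simply add their own decayed contribution. The `horizon` value, the function names, and the sample scores are hypothetical, and the average/max variants mentioned in the caption are not reproduced.

```python
def linear_weight(age_days, horizon):
    """Weight of an article `age_days` after publication (1.0 on day 0, 0.0 at horizon)."""
    if age_days < 0:
        return 0.0  # the article has not been published yet
    return max(0.0, 1.0 - age_days / horizon)

def carried_sentiment(scores_by_day, day, horizon=4):
    """Sum the decayed contributions of all articles published up to `day`.

    scores_by_day: dict of publication day index -> sentiment score for one ticker.
    """
    return sum(
        score * linear_weight(day - pub_day, horizon)
        for pub_day, score in scores_by_day.items()
    )

# Hypothetical ticker: a strongly negative story on day 0, a follow-up on day 2.
scores_by_day = {0: -0.8, 2: -0.4}
for day in range(7):
    print(day, round(carried_sentiment(scores_by_day, day), 3))
```

In this toy series the follow-up story on day 2 pushes the carried score back to its day-0 level, which is the "decelerated decay" effect: fresh coverage of the same subject slows the fade of the original shock.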
Which news categories are most predictive and how can we measure the strength of our models?
With so many news categories, there must be a distinction based on the type of news and its respective impact on stock prices. We needed to isolate the most impactful categories for our investment objectives. We found the following categories to be most actionable:
One of the methods we used to distinguish between news categories is to compare otherwise identical models statistically through their ROC curves. I wrote about confusion matrices and ROC curves at length in previous articles, but for the purpose of today’s writeup, we use confusion matrices to visualize how well true positive and true negative classifications separate at different thresholds.
Image 4: If the compound sentiment score is in [-1, 1], a natural assumption for the threshold might be 0.0, which is not necessarily optimal.
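The point about thresholds can be illustrated with a small sketch that sweeps candidate cutoffs, computes the true positive and false positive rates at each (the coordinates of a ROC curve), and picks the cutoff with the largest gap between them. The selection rule used here, Youden's J statistic (TPR minus FPR), is swapped in for illustration and is not necessarily the criterion used in our research; the scores, labels, and candidate thresholds are made up.

```python
def roc_points(scores, labels, thresholds):
    """Return (threshold, true positive rate, false positive rate) per threshold.

    labels: 1 if the outcome was positive (e.g. the stock subsequently rose), else 0.
    A score at or above the threshold counts as a positive prediction.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, tp / positives, fp / negatives))
    return points

def best_threshold(points):
    """Pick the threshold that maximizes Youden's J = TPR - FPR."""
    return max(points, key=lambda p: p[1] - p[2])[0]

# Hypothetical compound sentiment scores and realized outcomes.
scores = [-0.7, -0.2, 0.1, 0.2, 0.3, 0.5, 0.6, 0.8]
labels = [0, 0, 0, 0, 1, 0, 1, 1]
points = roc_points(scores, labels, thresholds=[-0.5, 0.0, 0.25, 0.55])
print(best_threshold(points))
```

In this toy data, mildly positive scores are mostly noise, so a cutoff above 0.0 separates the two classes better than the "natural" 0.0.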
I wanted to showcase some of the challenges our quants and data scientists face daily in the course of our research. Having a set of Python libraries and running code doesn’t always get you to the Promised Land. You need to continuously validate hypotheses and innovate.
Our platform, DAS, is designed to take the past eight years of best practices and pack them into reusable code. DAS not only statistically validates data, but also constructs investment models and empirically validates them through backtesting and paper-trading simulations. Our goal is always to scientifically validate investments before risking capital.
Below, please find both a backtest and a perpetually traded model of Benzinga’s news sentiment.
Benzinga news feed sentiment live: View live: QuantDesk platform
Image 5: Benzinga news feed sentiment is a long only model predicated on aggregate positive sentiment per stock within the S&P 500 & Russell 1000. We started to track this portfolio “live” via paper trading simulation on 10/2/2020.
- Past performance is not indicative of future returns.
Benzinga news feed sentiment backtest: View backtest
Image 6: Benzinga news feed sentiment long only backtest. Out-of-sample period started in May of 2019.
- Past performance is not indicative of future returns.
As you can see, both the backtest and the paper-traded portfolios beat their respective benchmarks, with higher absolute return, lower volatility, and a higher Sharpe ratio.
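For readers less familiar with these three measures, the sketch below shows one common way to compute them from a series of daily returns. The 252-day annualization factor, the zero default risk-free rate, and the sample returns are assumptions for illustration; they are not figures from the portfolios above, and the exact conventions used on the platform may differ.

```python
from math import sqrt
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed number of trading days per year

def total_return(daily_returns):
    """Absolute (compounded) return over the whole period."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    return growth - 1.0

def annualized_volatility(daily_returns):
    """Sample standard deviation of daily returns, scaled to one year."""
    return stdev(daily_returns) * sqrt(TRADING_DAYS)

def sharpe_ratio(daily_returns, daily_risk_free=0.0):
    """Annualized mean excess return per unit of volatility."""
    excess = [r - daily_risk_free for r in daily_returns]
    return mean(excess) / stdev(excess) * sqrt(TRADING_DAYS)

# Hypothetical daily returns, for illustration only.
sample = [0.004, -0.002, 0.003, 0.001, -0.001]
print(round(total_return(sample), 5))
print(round(annualized_volatility(sample), 4))
print(round(sharpe_ratio(sample), 2))
```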
Next week, I will be covering another exciting set of model portfolios that are based on a novel approach to measuring earnings data. More specifically, we will learn about leveraging cutting-edge machine learning technology to extract a company’s true earnings power, and how this approach compares to legacy metrics from traditional data firms, reported GAAP values, as well as consensus.
I will also be introducing our data partner, New Constructs, whose expertise is in extracting earnings distortions from footnotes and complex disclosures in corporate filings.
If you are a data provider with unique data that could be useful for investment, we want to talk to you. In addition, if you are an investment professional looking for winning investment portfolios, feel free to reach out to us. We’re happy to grant you trial access to a model portfolio that suits your investment style and mandate.
Have a great week!
Erez M. Katz