Finding Treasure in ESG Data

  • By Erez Katz
  • November, 22

An Algorithmic Approach to Impact Investing

Erez Katz, CEO and Co-Founder of Neuravest Research Inc

Environmental, social and governance (ESG) investing has been the talk of the town for some time, and the boom is only going to continue. Over $100 trillion of assets under management are today committed to the Principles for Responsible Investment (PRI), while over a fifth of the world’s 2,000 largest public companies have pledged to meet net-zero targets, as sustainability shapes global markets in the 21st century. Naturally, astute asset managers have been quick to adapt and channel market interest towards new investment offerings such as ETFs, indexes, and thematic portfolios. Unfortunately, many have been so wrapped up in the “Investment for Good” paradigm that they have forgotten the conventional foundation of a sound investment: generating profit!

In today’s article, I’d like to demonstrate that investing for good and investing for profit need not be incompatible.

Why aren’t there more AI / ESG based portfolios?

There are two main obstacles to building profitable investment vehicles that are predicated on ESG:

1) ESG compliance is an expensive business, and in many cases the compliance expense “eats” into the bottom line and reduces the company’s profitability.

2) Reliable ESG data sources are scarce, sparse, and mostly cover recent history. One of the predominant challenges is that ESG scores are frequently built on backward-looking information and biased, or greenwashed, datasets. The resulting incoherence between raters has been widely criticized. Such data is certainly not adequate for quantitative research, not to mention deep learning models.

The good news is that with advances in artificial intelligence (AI) technology and data science, portfolio managers are now able to collect and combine traditional corporate earnings data with ESG information and construct portfolios that are both sustainable and generate profits. New technology is accelerating smarter analysis of data, and digitalization is enabling investors to have greater flexibility and customize data at lower cost.

Constructing a Thematic Portfolio Workflow

Like most businesses, we’ve converged on a scientific approach to bringing a thematic portfolio to market.

Neuravest approach to systematic portfolio construction

Image 1: The birth of a new thematic portfolio workflow

    1) Idea generation: It starts with an idea formed by a subject matter expert’s intuition.

    2) Data Identification and validation: Once the idea has been formally documented and accepted, we move swiftly into identifying existing and new datasets that can be useful. Our DAS (Data Analysis Suite) platform supports a comprehensive workflow by which we determine if a dataset is indeed predictive and conducive to the strategy’s objectives.

    3) Model Building and Backtesting: With well-formed data and engineered features, we can now construct models and backtest them in-sample, out-of-sample, and perpetually via paper trading simulation.

    4) Go To Market: Only after we’ve gone through a thorough iterative process of validation and adjustment are we ready to deploy capital (initially small) and increase exposure with success.
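The in-sample / out-of-sample distinction in step 3 can be sketched as a simple date-based split. This is only an illustration, not a description of the DAS platform: the function name `split_by_date`, the cutoff date, and the monthly return figures are all hypothetical.

```python
from datetime import date

def split_by_date(observations, cutoff):
    """Split (date, value) rows so that out-of-sample data is strictly later.

    Rows dated before `cutoff` are used to fit the model (in-sample);
    rows on or after `cutoff` are held back to test it (out-of-sample).
    """
    in_sample = [row for row in observations if row[0] < cutoff]
    out_of_sample = [row for row in observations if row[0] >= cutoff]
    return in_sample, out_of_sample

# Hypothetical monthly strategy returns, for illustration only.
monthly_returns = [
    (date(2020, 1, 31), 0.012),
    (date(2020, 2, 29), -0.034),
    (date(2020, 3, 31), -0.081),
    (date(2020, 4, 30), 0.065),
    (date(2020, 5, 31), 0.027),
    (date(2020, 6, 30), 0.009),
]

in_sample, out_of_sample = split_by_date(monthly_returns, cutoff=date(2020, 5, 1))
print(len(in_sample), "in-sample rows;", len(out_of_sample), "out-of-sample rows")
```

Splitting by date rather than at random keeps future information out of the fitting window, which is the point of holding data back.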

Our Hypothesis: 

Identify the top ESG compliant companies in the S&P 500 ordered by market cap, and further overlay a classification engine (AI based) to identify the top performers based on their fundamental features. 

 One approach to identifying “winners” within similarly behaving stocks is an unsupervised learning technique called Clustering.

Unsupervised learning clustering technique.

Image 3: Clustering used to group similarly behaving stocks based on predetermined factors. In the case of ESG we could use, for example, market cap and seasonally adjusted carbon emission readings.

The idea is to identify clusters of ESG compliant factors and ultimately overlay fundamental models on each group to spot the “winners”. In other words, forecast which business is set to outperform within each compliance cluster.
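As a rough sketch of the clustering step, the snippet below groups stocks with a plain k-means (Lloyd's algorithm) on two standardized factors. It is an assumption-laden illustration, not our production model: the tickers, the factor values, the choice of k-means, and k=2 are all hypothetical.

```python
import random
from statistics import mean, pstdev

def standardize(rows):
    """Scale each column to zero mean and unit variance so no factor dominates."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]
    return [tuple((v - m) / s for v, (m, s) in zip(row, stats)) for row in rows]

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(mean(dim) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels

# Hypothetical data: ticker -> (market cap in $bn, carbon emission reading).
stocks = {
    "AAA": (2500.0, 12.0),
    "BBB": (2300.0, 15.0),
    "CCC": (2100.0, 10.0),
    "DDD": (40.0, 480.0),
    "EEE": (55.0, 510.0),
    "FFF": (35.0, 450.0),
}

features = standardize(list(stocks.values()))
labels = kmeans(features, k=2)

clusters = {}
for ticker, label in zip(stocks, labels):
    clusters.setdefault(label, []).append(ticker)
print(clusters)
```

Each resulting group would then get its own fundamental model to rank the names inside it. Standardizing first matters because market cap and emissions sit on very different scales.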

The Data:

ESG data remains a challenge for asset managers as it is typically unstructured, unstandardized and unaligned. To address this, most large asset managers use more than one dataset for their ESG needs. According to Ernst and Young, 62 of the largest asset managers use between 2 and 5 different providers and some even use up to 10 different third-party vendors to cover their ESG data needs. 

We believe many asset management firms, particularly those without a multi-million dollar data and quantitative analytics budget, can do better by outsourcing the construction of ESG-optimized portfolios to specialists such as Neuravest. Neuravest has partnered with two validated ESG data providers, each with distinct advantages: Arabesque and OWL Analytics. For each dataset, there is a clearly defined and distinct business rationale. As we will showcase below, Neuravest has extracted engineered features from each dataset that are used to construct two differentiated portfolios.

 In addition to unique features, both firms have the following attributes that may appeal to prospective clients:

     – Mission-Driven: Arabesque and OWL Analytics are both native ESG and climate-focused firms. Both have roots going back many years with a sole focus on ESG, Sustainability and Impact. This contrasts with many ESG data vendors who exist as branches of larger data organizations.

     – Quantitative: Both firms employ a quantitative and data driven approach to ensure more capital is allocated to companies at the forefront of the movement. 

     – Depth and Breadth: Each firm has broad coverage and frequently updated data. 

Let’s take a closer look at each dataset and the features extracted. 

Arabesque S-Ray is one of the world’s largest independent ESG data and technology providers. Some of the largest global institutions and investors use S-Ray’s ESG metrics and raw emissions data, which cover companies across the world’s major stock indices and include business involvement filters for over 25,000 companies. Using big data and a quantitative, algorithmic approach, Arabesque’s capabilities draw on more than four million ESG data points daily from over 30,000 sources for performance measurements on sustainability, including corporate net-zero alignment.

The Arabesque-Based Models 

To construct a portfolio, Neuravest selected the Arabesque S-Ray Temperature Score, which measures the extent to which corporations are contributing to the rise in global temperature. Each company is given a score of 1.5°C, 2°C, 2.7°C or >2.7°C, which represents the increase in global temperature that would result if every other company behaved the same way. The scores are based on each company’s emissions intensity ratio, calculated as the greenhouse gas emissions the company emits per dollar of revenue.
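The emissions intensity ratio itself is simple arithmetic, sketched below with made-up figures. The tickers, emission totals (assumed to be in tonnes of CO2-equivalent), and revenues are hypothetical. How Arabesque maps intensity to a temperature band is not described here, so that step is deliberately left out.

```python
def emissions_intensity(emissions_tco2e, revenue_usd):
    """Greenhouse gas emissions per dollar of revenue."""
    if revenue_usd <= 0:
        raise ValueError("revenue must be positive")
    return emissions_tco2e / revenue_usd

# Hypothetical companies, for illustration only.
companies = {
    "AAA": {"emissions_tco2e": 50_000.0, "revenue_usd": 10_000_000_000.0},
    "BBB": {"emissions_tco2e": 2_000_000.0, "revenue_usd": 5_000_000_000.0},
}

intensity = {ticker: emissions_intensity(**data) for ticker, data in companies.items()}

# Lowest intensity first: fewer emissions for each dollar earned.
ranked = sorted(intensity, key=intensity.get)
for ticker in ranked:
    # Report in tonnes per $1m of revenue, a more readable unit.
    print(ticker, round(intensity[ticker] * 1_000_000, 2), "tCO2e per $1m revenue")
```

Dividing by revenue lets a large company and a small one be compared on the same footing, which raw emission totals would not allow.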

Backtest – Arabesque  

Arabesque ESG Portfolio Backtest

Note: Performance excludes AI Classification overlay as well as transaction costs and performance fees. Backtest simulation. Past performance is not indicative of future returns.

OWL Analytics is an ESG data company that provides ESG ratings that employ a “wisdom of the crowd” approach to reduce the well-known inherent subjectivity of single viewpoint ESG ratings. To do this, OWL aggregates hundreds of sources of ESG data, research, and ratings with the goal of identifying which ESG metrics each source deems relevant for each industry. OWL then rates each company in an industry across a number of high-level metrics (KPIs) based on that industry’s consensus ESG model. We focus on the high level E, S and G scores in an industry neutral fashion and we then create a blended ESG strategy.

 The OWL-Based Models 

We designed signals that combine the absolute E, S and G scores with the relative change in these scores from a year earlier. This way we select companies with the best sustainability profiles that have also improved their scores the most in the last 12 months. The AI classification engine further narrows the selection of stocks that enter the portfolio. The portfolio construction technique then optimizes portfolio turnover so as to include the stocks with the highest expected return while targeting a turnover level that most investors would find reasonable.
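One way to picture a "level plus improvement" signal is sketched below. It is not the actual OWL-based model: the equal weighting, the z-score blend, the swap-capped rebalance (a simple heuristic standing in for a real turnover optimizer), and all tickers and scores are assumptions for illustration.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize a list so that level and change are on a comparable scale."""
    m, s = mean(values), pstdev(values) or 1.0
    return [(v - m) / s for v in values]

def blended_signal(current, year_ago, level_weight=0.5):
    """Blend today's average E/S/G score with its relative change over 12 months.

    current, year_ago: dict mapping ticker -> (E, S, G) scores.
    """
    tickers = sorted(current)
    level = [mean(current[t]) for t in tickers]
    change = [mean(current[t]) / mean(year_ago[t]) - 1.0 for t in tickers]
    z_level, z_change = zscores(level), zscores(change)
    return {
        t: level_weight * zl + (1.0 - level_weight) * zc
        for t, zl, zc in zip(tickers, z_level, z_change)
    }

def select_top(signal, n):
    """Names with the n highest signal values, strongest first."""
    return sorted(signal, key=signal.get, reverse=True)[:n]

def rebalance(holdings, signal, n, max_swaps):
    """Move toward the top-n names while replacing at most max_swaps holdings.

    Assumes `holdings` already contains n names that all appear in `signal`.
    """
    target = select_top(signal, n)
    keep = list(holdings)
    leavers = sorted((t for t in keep if t not in target), key=signal.get)  # weakest first
    joiners = [t for t in target if t not in keep]  # strongest first
    for out, new in list(zip(leavers, joiners))[:max_swaps]:
        keep[keep.index(out)] = new
    return keep

# Hypothetical scores, for illustration only.
current = {"AAA": (80, 75, 85), "BBB": (60, 55, 65), "CCC": (78, 80, 76), "DDD": (40, 45, 35)}
year_ago = {"AAA": (70, 65, 75), "BBB": (60, 56, 64), "CCC": (79, 80, 75), "DDD": (40, 44, 36)}

signal = blended_signal(current, year_ago)
top_two = select_top(signal, 2)
next_holdings = rebalance(["BBB", "DDD"], signal, n=2, max_swaps=1)
print(top_two, next_holdings)
```

Capping the number of swaps per rebalance trades some signal strength for lower turnover, which is the tension the portfolio construction step has to manage.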

Backtest – OWL

OWL Analytics ESG based Model Portfolio

Note: Performance excludes AI Classification overlay as well as transaction costs and performance fees. Backtest simulation. Past performance is not indicative of future returns.

Model Results

The results above are backtest simulations harnessing specific features from the Arabesque and OWL Analytics datasets. We employ our proprietary modeling engine, Neuravest’s DAS (Data Analytics Services) platform, to generate the results. We were able to empirically validate how actionable ESG data is for turnover-constrained, high Sharpe ratio investment strategies. To validate our models further, we will keep trading them perpetually going forward to ensure there is no look-ahead bias.

Through our strategic data partnerships with OWL Analytics and Arabesque, we are able to work closely with institutional asset managers to build an investment portfolio unique to their firm, focused on specific dynamics such as carbon emissions and alignment with temperature goals, employee satisfaction, workplace safety, board diversity, human rights, or all of the above.


The simplistic ESG 1.0 generation is coming to an end; ESG 2.0 will be transparent and data-driven. To gain an edge, smart investors will look to technology-based and transparent models with ownership of ESG raw data as differentiators, delivered through SaaS solutions.

We are now seeing new data and AI technologies emerge that not only extract and dissect ESG compliance data but also help to reinforce it. Ultimately, the ability to pull intelligence from various sources of structured and unstructured data can sharpen the transparency and accuracy of corporate ESG performance.

With Arabesque and OWL Analytics, Neuravest has two ESG data partners in the unprecedented race towards sustainability, driven by investor commitments, regulation, and real economy changes.

To learn more about Neuravest ESG Climate Pledge portfolios visit our website here.

Have a Happy Thanksgiving!

Erez M. Katz
