Buyer beware: How and when to use Scope 3 emissions data

Part 3 of RI's Scope 3 Series: Machine learning and disclosures of prediction error for estimates could help improve Scope 3 data usability for investors.

This article is the third in a five-part series on Scope 3 by Responsible Investor. The first article looked into how asset owners grapple with the topic, and the second covered corporate efforts. Look out for upcoming deep dives on how regulators and assurance providers are tackling the push to address indirect emissions across value chains.

Poor corporate disclosures have forced investors to turn to commercial providers for data on Scope 3 emissions.

But the same conditions have also proved a struggle for data professionals. A paper published in November aiming to quantify the differences in opinion between major data providers warns of “considerable divergence” in the Scope 3 values provided by ISS, compared with Refinitiv and Bloomberg.

ISS uses a proprietary estimation model, while Refinitiv and Bloomberg integrate company-reported data.

The paper found that data points provided by ISS would coincide with those from Refinitiv and Bloomberg only 22 percent of the time when ranked in order. Small but unanticipated variations were also observed between Scope 3 values from Refinitiv and Bloomberg, which were ranked identically in 82 percent of instances.

Company rankings can be instrumental for investors or service providers to determine which top emitters to divest, to tilt portfolios, or to select the constituents of a class-leading portfolio or index, says Griffith University academic Ivan Diaz-Rainey, one of the researchers behind the paper.

“Imagine you wanted to find the tallest kids in a classroom. Their specific heights are not of as much interest as compared to their relative position within the class – this is the same for investors looking to find leaders and laggards,” he says. “Using the same analogy, a league table on the height of schoolchildren produced by ISS would have the same ranking for only 22 percent of the positions if compared to Bloomberg and Refinitiv.”

“Overall, our findings imply that investors should be wary of the potential prediction errors when using Scope 3 emissions obtained from third parties,” he adds.
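The ranking comparison described above can be illustrated with a short sketch. It takes one plausible reading of "ranked identically", namely the share of companies that occupy the same position in both providers' league tables; the paper's exact metric is not set out in this article, so treat that definition as an assumption. The function names and all emissions figures are hypothetical and invented purely for illustration.

```python
def rank_positions(emissions):
    """Map each company to its rank, where 1 is the highest emitter."""
    ordered = sorted(emissions, key=emissions.get, reverse=True)
    return {company: position for position, company in enumerate(ordered, start=1)}


def rank_agreement(provider_a, provider_b):
    """Share of companies covered by both providers that hold the same rank.

    Assumption: ties are not handled specially; the example data has none.
    """
    common = set(provider_a) & set(provider_b)
    if not common:
        raise ValueError("the two providers share no companies")
    ranks_a = rank_positions({c: provider_a[c] for c in common})
    ranks_b = rank_positions({c: provider_b[c] for c in common})
    matches = sum(1 for c in common if ranks_a[c] == ranks_b[c])
    return matches / len(common)


# Hypothetical Scope 3 values in tonnes CO2e, not taken from any provider.
reported = {"A": 900.0, "B": 700.0, "C": 500.0, "D": 300.0, "E": 100.0}
modelled = {"A": 950.0, "B": 400.0, "C": 650.0, "D": 120.0, "E": 310.0}

agreement = rank_agreement(reported, modelled)
print(f"Identical rank positions: {agreement:.0%}")
```

In this toy data only company A keeps its position, so the two league tables agree on one rank in five even though both broadly identify the larger emitters.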

While unexpected, the disparity between Refinitiv and Bloomberg is indicative of another quirk of company reporting: even when companies choose to disclose their emissions, the sum of reported breakdowns does not always add up to the total reported carbon footprint, and figures can be revised in later years.

The emissions disclosure platform CDP has estimated this to be the case for around 30 percent of companies.
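A data user could screen for this kind of mismatch with a simple check along the following lines. This is a minimal sketch, not a description of how CDP or any provider tests disclosures: the function names, the one percent tolerance and the example figures are all assumptions chosen for illustration.

```python
def breakdown_gap(total, categories):
    """Relative gap between the sum of category values and the reported total."""
    if total <= 0:
        raise ValueError("reported total must be positive")
    return (sum(categories.values()) - total) / total


def is_consistent(total, categories, tolerance=0.01):
    """True if the categories add up to the total within the tolerance.

    Assumption: a 1 percent tolerance, picked arbitrarily for this example.
    """
    return abs(breakdown_gap(total, categories)) <= tolerance


# Hypothetical disclosure in tonnes CO2e.
reported_total = 1000.0
reported_categories = {
    "business_travel": 50.0,
    "purchased_goods_and_services": 600.0,
    "use_of_sold_products": 300.0,
}

gap = breakdown_gap(reported_total, reported_categories)
print(f"Gap versus reported total: {gap:.1%}")
print("Consistent:", is_consistent(reported_total, reported_categories))
```

Here the categories sum to 950 against a stated total of 1,000, so the disclosure would be flagged for follow-up rather than used as is.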

The research was undertaken by Diaz-Rainey and a collaborative Australia-New Zealand team, consisting of Griffith University, UNSW and University of Otago academics and analysts from climate analytics firm EMMI.

In response to the paper, ISS ESG head Till Jung says the decision to use an estimation model was made because “self-reported Scope 3 was so incomplete that it was widely unusable for investors across all sectors” prior to 2019. ISS ESG began including company-reported data from FY 2020 as a rising number of corporates started to report the information, although Jung acknowledges that the quality of the disclosures still leaves much to be desired.

According to Jung, clients appreciated the reliability of reported Scope 3 which had been checked by ISS ESG, coupled with “high-quality modelled emissions where reported data does not meet the required standard”.

“Modelling will always only provide a feel for the order of magnitude, not exact data. However, with new disclosure requirements, we will likely see improved reported data in the future which in turn should improve models,” says Jung.

ISS ESG discarded around 70 percent of the reported Scope 3 emissions for FY 2020 and FY 2021 for data quality reasons, the most common of which is the failure to include relevant Scope 3 categories.


Companies are increasingly disclosing their Scope 3 emissions but this information has not necessarily been useful for investors, says FTSE Russell investment research head Jaakko Kooroshy.

The problem is that companies persistently choose to report a certain activity not because it is material or relevant but because the data is easier to collect than data for other activities. This has led to a trade-off between user relevance and what researchers call completeness, or the ease of collecting information. There are a total of 15 activity categories for Scope 3 emissions, covering upstream and downstream activities.

“Only about one in five companies in the FTSE AllWorld disclose their most material Scope 3 emissions according to our data,” says Kooroshy.

Griffith University’s Diaz-Rainey and his team found that business travel was the most reported category, with 84 percent of companies offering up the information despite it accounting for less than one percent of total Scope 3 emissions. This is based on Bloomberg data covering 2010 to 2019; Bloomberg was the only provider of the three in the study to provide Scope 3 category breakdowns.

In contrast, a highly material category such as use of sold products, which makes up 66 percent of overall Scope 3 emissions, was reported by only 18 percent of companies. Using carbon emissions data from industry peer groups as a proxy for unreported data, the researchers estimated that total indirect emissions for each company could be as much as 44 percent higher than reported.

This is not necessarily a ploy to underreport, notes Kooroshy. “Companies are often faced with lots of complexity when trying to assess Scope 3 emissions because they are taking place outside of your organisational boundary so they cannot be measured directly.

“Secondly, there is not a straightforward, highly standardised way to calculate Scope 3 emissions, like for Scope 2, so two organisations could give very different emissions data for the same activity.”
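The idea of using industry peers as a proxy for unreported categories, mentioned above, can be sketched as follows. The article does not describe the researchers' exact procedure, so this example swaps in one simple approach as an assumption: each missing category is filled with the peer group's median emissions intensity (tonnes CO2e per unit of revenue) multiplied by the company's revenue. All names and numbers are hypothetical.

```python
from statistics import median


def fill_with_peer_proxy(reported, revenue, peer_intensities):
    """Fill categories the company did not report using peer medians.

    reported: category -> tonnes CO2e disclosed by the company
    revenue: company revenue, in the unit used for the peer intensities
    peer_intensities: category -> list of peer tonnes CO2e per unit of revenue
    """
    filled = dict(reported)  # reported values are always kept as disclosed
    for category, intensities in peer_intensities.items():
        if category not in filled and intensities:
            filled[category] = median(intensities) * revenue
    return filled


def uplift(reported, filled):
    """Relative increase of the gap-filled total over the reported total."""
    reported_total = sum(reported.values())
    if reported_total <= 0:
        raise ValueError("reported total must be positive")
    return (sum(filled.values()) - reported_total) / reported_total


# Hypothetical company that reports two categories but not use of sold products.
company_reported = {"business_travel": 10.0, "purchased_goods_and_services": 490.0}
company_revenue = 100.0
peers = {
    "business_travel": [0.1, 0.2],
    "use_of_sold_products": [1.0, 2.0, 3.0],
}

company_filled = fill_with_peer_proxy(company_reported, company_revenue, peers)
increase = uplift(company_reported, company_filled)
print(f"Gap-filled total is {increase:.0%} higher than reported")
```

In this made-up case the unreported category adds 200 tonnes to a reported 500, which shows how omitting a single material category can leave the disclosed total well short of the likely footprint.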

Bring in AI

A potential workaround using machine learning to enhance the accuracy of Scope 3 predictions has been proposed by the same research project involving Diaz-Rainey. “Normally finance researchers use regressions but there has been little interest in machine learning despite the clear potential for better predictions,” he says.

Its 2022 paper, “Scope 3 Emissions: Data Quality and Machine Learning Prediction Accuracy”, has been “incredibly impactful”, he says. The research won a prize and Diaz-Rainey says “we got word” that a European banking giant would “send people over to New Zealand to learn more about what we had done. Covid hit and that never happened but they did eventually manage to replicate our results”.

In broad terms, there are two widely used estimation approaches to fill in missing Scope 3 values, but each has shortcomings. The first uses financial flow data to reconstruct supply chains and subsequently industry and company carbon intensities. This method is only used for upstream activities. The second method applies activity-based analysis for sectors with widely reported production data by calculating the emissions associated with combusting a barrel of oil or a tonne of coal, for example, but is of limited use in other sectors.

In contrast, machine learning can be used for both upstream and downstream activities, and across the whole economy. It is also able to consider a wider range of different data points to establish the best predictors for a company’s emissions compared to human-led analysis, the paper says.

The team claims that machine learning algorithms can improve the prediction accuracy of overall Scope 3 emissions by up to 25 percent when each of the 15 emissions categories is estimated individually and then aggregated. However, absolute prediction performance is low even with the best models, with the accuracy of estimates constrained by the small number of observations in specific Scope 3 categories.

Looking ahead

Given the disclosure difficulties, companies should be focusing on the most material categories first, says FTSE’s Kooroshy. He also calls for sector guidance on the issue.

“We think there should be some kind of consensus on what a telecoms company should focus on reporting, compared to an O&G company for example.”

This is because the fine print matters. Kooroshy refers to the example of copper, which can result in very different emissions profiles for carmakers when sourced from different producing countries such as Chile or Mongolia.

“Finally, we need to have the grown-up conversation about what we do with the Scope 3 data we do have. On the one hand, it’s too important to ignore, and on the other hand, it’s still too messy for portfolio attribution or for us to stick it into an investment product.”

It is not just companies that need to raise their game – data providers have a lot more to do on transparency. At best, data providers disclose an indicative confidence ranking associated with their Scope 3 estimates rather than the absolute magnitude of their prediction error, a practice Diaz-Rainey’s research paper describes as “problematic”.

ESG data and machine learning firm ClarityAI wants to change this. Its chief sustainability officer Lorenzo Saa tells RI that in the next few months it will become the first data provider to give clients access to information on the degree of uncertainty of Scope 3 estimates and other indicators of reliability, such as whether a company has reported different values for the same data point.

The company already provides the information for Scope 1 and 2 emissions estimates.

“If you are given a level of confidence of estimations, you can then make informed decisions. If the confidence level of your data is extremely low, then maybe you will choose not to act on the data, but if it’s moderate or high, I think making decisions can be justified,” says Saa. “Let’s recognise that in financial accounting, estimates are used quite widely and no one seems to question it.”

“The bigger question is how close are we to real world values? And I think there is some reluctance to explore this in the sustainable finance industry.”