Scope 3 Is the Hardest Part of Carbon Reporting — AI Makes It Easy

A senior sustainability manager at a €2B industrial firm told me last month that her Scope 1 and 2 inventory takes three weeks. Her Scope 3 inventory takes seven months. The Scope 3 number is also roughly twelve times larger. She is not an outlier. For most companies that actually measure, Scope 3 accounts for somewhere between 60% and 90% of total emissions, and yet it consumes the overwhelming majority of the reporting effort — usually for a number that everyone on the team privately suspects is wrong by a factor of two.

That mismatch is the entire story of Scope 3. The biggest part of your carbon footprint is the part you don't own, can't see, and have to triangulate from invoices, supplier emails, and industry averages that were last updated when George W. Bush was president. Everything else flows from that.

What Scope 3 actually is, in two sentences

Scope 3 is every greenhouse gas emission that occurs in your value chain but outside your direct operational control — the steel you bought, the trucks your distributor hired, the electricity your customer used to charge the product you sold them. The GHG Protocol Scope 3 Standard splits it into 15 categories, 8 upstream and 7 downstream, and tells you to report any category that is material.

The word "material" is doing an enormous amount of work in that sentence, which we'll come back to.

Why it's disproportionately painful

Scope 1 is fuel you burned. You have a fuel invoice. Scope 2 is electricity you bought. You have a utility bill. Both live inside your own accounting system, in units you understand, from counterparties who send you monthly statements because they want to get paid.

Scope 3 is different. The emissions happen at a supplier's factory in Gujarat, or in a container ship somewhere between Ningbo and Rotterdam, or in the dryer of a consumer who bought a T-shirt from your brand in Düsseldorf. You have no meter. You have no bill. You have, at best, a procurement line item that says "€142,000 of packaging film" and a vague memory that the vendor is in Italy.

To turn that into a kgCO2e number, you need one of three things: actual emissions data from the supplier, activity data (tonnes, kilowatt-hours, kilometers) from the supplier, or an emission factor per dollar spent. The first is rare, the second is partial, the third is a polite fiction — and we'll get to why.

The 15 categories, honestly

Not all categories are created equal. Some are arithmetic. Some are research projects. Here's the unvarnished breakdown of the upstream eight:

1. Purchased goods and services. The monster. Usually the single largest line for manufacturers, retailers, and anyone who buys physical stuff. Requires per-product or per-spend emissions data for thousands of SKUs.
2. Capital goods. Same math as Category 1, but for your CapEx line: buildings, machinery, IT hardware. Painful but one-off per asset.
3. Fuel- and energy-related activities not in Scope 1/2. Upstream emissions from extracting and transporting the fuels you already counted. Relatively easy: grid-level factors, a spreadsheet, done.
4. Upstream transportation and distribution. Freight you paid for, plus inbound transport of purchased goods from your tier 1 suppliers in vehicles you don't own. Medium difficulty: your logistics team has most of the data, but fuel types and transport modes need reconstructing.
5. Waste generated in operations. Usually small. Waste hauler invoices plus disposal-method factors. A quiet win.
6. Business travel. Expense reports, corporate travel tool APIs. Tractable if your finance team uses one booking platform, hellish if they use six.
7. Employee commuting. Surveys and national average factors. Nobody pretends the number is precise.
8. Upstream leased assets. Edge case for most companies. If you rent warehouses or vehicles and the landlord pays the energy bill, this belongs here.

And the downstream seven:

9. Downstream transportation and distribution. Freight your customer paid for. Often similar in size to upstream, rarely measured with the same rigor.
10. Processing of sold products. Relevant if you sell intermediate goods: chemicals, steel coils, yarn. Requires knowing what your customer does next.
11. Use of sold products. The monster's downstream cousin. For anyone who sells cars, appliances, HVAC, or electronics, this category dwarfs everything else. Multiply energy use per unit per year by expected lifetime, by units sold, and by the emission factor of the energy consumed. The assumptions you make about lifetime and usage pattern can swing the result by a factor of three.
12. End-of-life treatment of sold products. Waste category, but at massive scale. National recycling and landfill rates. Estimation all the way down.
13. Downstream leased assets. Edge case: the mirror of Category 8.
14. Franchises. Edge case unless you're McDonald's.
15. Investments. The monster for banks, asset managers, and insurers. PCAF methodology, financed emissions, whole separate universe.
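To make the "use of sold products" arithmetic concrete, here is a minimal sketch of the multiplication described above. Every input (units sold, annual consumption, grid factor, the two lifetimes) is an illustrative assumption, not data from a real product. The point is only that the lifetime assumption scales the result linearly.

```python
def use_phase_emissions_kg(units_sold, kwh_per_unit_year, lifetime_years, grid_kg_per_kwh):
    """Use-phase estimate: units x annual energy x lifetime x grid factor (kgCO2e)."""
    return units_sold * kwh_per_unit_year * lifetime_years * grid_kg_per_kwh

# Hypothetical appliance. All values are assumptions for illustration.
base = dict(units_sold=100_000, kwh_per_unit_year=300.0, grid_kg_per_kwh=0.4)

short_life = use_phase_emissions_kg(lifetime_years=5, **base)
long_life = use_phase_emissions_kg(lifetime_years=15, **base)

print(f"5-year lifetime:  {short_life / 1000:,.0f} tCO2e")
print(f"15-year lifetime: {long_life / 1000:,.0f} tCO2e")
print(f"ratio: {long_life / short_life:.1f}x")
```

Nothing about the product changed between the two runs. Only the assumed lifetime did.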

For most industrial and retail companies, Categories 1, 4, 9, and 11 are where 80% of the footprint lives. Category 1 is where 80% of the work lives.

Category 1 is the one that eats the year

Say you're a mid-size beverage company. You buy aluminum cans, cardboard trays, shrink film, sugar, flavor concentrate, glass bottles, CO2, pallets, cleaning chemicals, lubricants, and about 4,000 other line items from 600 suppliers across 30 countries. Your procurement export from SAP has 140,000 rows for the year.

To compute Scope 3 Category 1 honestly, you need an emission factor — in kgCO2e per physical unit, ideally — for every one of those line items. You have three realistic methods.

Spend-based

Take each procurement line item, classify it by industry code (NAICS, CPA, ISIC, or the EEIO sector equivalent), and multiply euros spent by an environmentally-extended input-output factor (kgCO2e per euro of output from that sector). The US EPA's USEEIO database, Exiobase, and the Quantis Scope 3 Evaluator all work this way.

Pros: you can do it in a week. Every company with a procurement ledger can produce a number. Cons: the factor is an industry-wide average that assumes your aluminum was produced like everyone else's aluminum. For a supplier using hydroelectric smelting in Iceland versus coal smelting in Shandong, the real difference is roughly 10x. The spend-based factor gives you one number for both. More on this in a minute.
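As a rough sketch, the whole spend-based method fits in a few lines. The sector names and factor values below are made up for illustration; real factors would come from a database such as USEEIO or Exiobase, and the sector assignment is the analyst's judgment call.

```python
from collections import defaultdict

# Hypothetical EEIO factors in kgCO2e per euro of sector output.
EEIO_FACTORS = {
    "aluminium_products": 1.10,
    "paper_packaging": 0.60,
    "plastic_film": 0.85,
}

# (description, sector assigned by the analyst, spend in euros)
ledger = [
    ("Aluminium cans", "aluminium_products", 2_400_000),
    ("Cardboard trays", "paper_packaging", 310_000),
    ("Packaging film", "plastic_film", 142_000),
]

def spend_based(ledger, factors):
    """Return kgCO2e per sector: spend multiplied by the sector-average factor."""
    totals = defaultdict(float)
    for _desc, sector, spend_eur in ledger:
        totals[sector] += spend_eur * factors[sector]
    return dict(totals)

result = spend_based(ledger, EEIO_FACTORS)
for sector, kg in result.items():
    print(f"{sector:20s} {kg / 1000:10.1f} tCO2e")
print(f"{'total':20s} {sum(result.values()) / 1000:10.1f} tCO2e")
```

Note what the function never sees: the supplier, the country, or the production route. That is the limitation in code form.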

Activity-based

Get physical quantity data — tonnes of aluminum, litres of concentrate, square meters of film — and multiply by a per-unit emission factor from Ecoinvent, GaBi, or a product-specific LCA database. This is what the GHG Protocol calls "average data" if the factor is industry-typical, or "supplier-specific" if the factor came from the supplier itself.

Pros: far more accurate for categories where material type dominates (metals, plastics, chemicals). Cons: your procurement system doesn't store tonnes — it stores euros and part numbers. Translating 140,000 line items into physical units requires either a master data project or a lot of phoning around.
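Here is a small sketch of the difference between "average data" and "supplier-specific" factors on the same tonnage. The per-tonne factors are placeholders chosen to mirror the roughly 10x spread mentioned earlier. They are not values from Ecoinvent, GaBi, or any real supplier.

```python
# Hypothetical factors in kgCO2e per tonne of aluminium (placeholders only).
FACTORS_KG_PER_T = {
    "industry_average": 10_000.0,
    "supplier_hydro": 2_000.0,
    "supplier_coal": 20_000.0,
}

def activity_based(tonnes, factor_kg_per_t):
    """Physical quantity multiplied by a per-unit emission factor."""
    return tonnes * factor_kg_per_t

# (supplier key, tonnes purchased): assumed volumes
purchases = [("supplier_hydro", 500.0), ("supplier_coal", 500.0)]

average_data = sum(
    activity_based(t, FACTORS_KG_PER_T["industry_average"]) for _, t in purchases
)
supplier_specific = sum(
    activity_based(t, FACTORS_KG_PER_T[s]) for s, t in purchases
)

print(f"average data:      {average_data / 1000:,.0f} tCO2e")
print(f"supplier-specific: {supplier_specific / 1000:,.0f} tCO2e")
```

With an even split the two totals land close together. Shift all 1,000 tonnes to the hydro supplier and the supplier-specific total drops to a fifth of the average-data figure, while the average-data figure does not move at all.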

Hybrid

Use activity-based for your top 20 or 30 commodities by spend, spend-based for the long tail. This is what most serious companies actually do. It's also what the GHG Protocol implicitly recommends: focus data quality effort where it moves the number.

The hybrid approach is defensible. The problem is that "top 20 by spend" often does not equal "top 20 by emissions," and figuring out which categories are emissions-heavy requires — yes — a first pass that uses the same spend-based factors you were trying to avoid.
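A toy example shows how the two rankings diverge. The categories, spend figures, and screening factors are all invented for illustration. The screening pass is exactly the crude spend-based estimate described above, used only to decide where better data is worth chasing.

```python
# (category, spend in euros, hypothetical screening factor in kgCO2e per euro)
categories = [
    ("Flavor concentrate", 5_000_000, 0.20),
    ("Aluminium cans", 2_400_000, 1.10),
    ("Marketing services", 3_000_000, 0.10),
    ("CO2 gas", 400_000, 2.00),
    ("Glass bottles", 900_000, 0.90),
]

def top_n(rows, key, n):
    """Names of the n rows with the largest key value."""
    return [name for name, *_ in sorted(rows, key=key, reverse=True)[:n]]

by_spend = top_n(categories, key=lambda r: r[1], n=3)
by_emissions = top_n(categories, key=lambda r: r[1] * r[2], n=3)

print("top 3 by spend:    ", by_spend)
print("top 3 by emissions:", by_emissions)
# Categories that deserve activity data but would be missed by a spend ranking
print("missed by spend:   ", [c for c in by_emissions if c not in by_spend])
```

In this made-up ledger, marketing services make the spend list and glass bottles do not, even though glass carries far more emissions.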

The part nobody tells you

Here is where the polite industry literature gets quiet and I won't.

First, spend-based emission factors are often off by 30 to 50 percent, and sometimes by a factor of two. The underlying input-output tables are built on national statistical surveys from 2012, 2016, or 2018 depending on the region, then inflated to the current year with a GDP deflator. They assume an average production mix that does not exist at any specific supplier. For a commodity like steel, where the range between best-in-class (EAF with renewable power) and worst-in-class (blast furnace with coal power) is roughly 10x in emissions intensity, an "industry average" number is a mathematical object that corresponds to no real factory. It gives you an answer, which looks like progress. It is not the same thing as knowing.
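The base-year problem is easy to see in a few lines. The sketch below takes the equivalent route of deflating current spend back to the factor's base year before multiplying. The price index values, the factor, and the years are assumptions for illustration, and a single economy-wide index is itself a simplification.

```python
# Hypothetical price index, with the factor table's base year set to 100.
PRICE_INDEX = {2016: 100.0, 2024: 125.0}

# Hypothetical factor in kgCO2e per 2016 euro of sector output.
FACTOR_2016 = 0.85

def deflate(spend, spend_year, base_year, index=PRICE_INDEX):
    """Express spend from spend_year in base_year prices."""
    return spend * index[base_year] / index[spend_year]

spend_2024 = 142_000  # euros, at 2024 prices

naive = spend_2024 * FACTOR_2016
adjusted = deflate(spend_2024, 2024, 2016) * FACTOR_2016

print(f"naive:    {naive / 1000:.1f} tCO2e")
print(f"adjusted: {adjusted / 1000:.1f} tCO2e")
```

Skipping the adjustment inflates the estimate by exactly the price change, before any question about whether the underlying factor fits your supplier.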

Second, supplier-specific data is nominally better, but suppliers hand you wildly inconsistent formats. Ask twelve suppliers for the embedded emissions of the steel coils they sell you and you'll get: one EPD (environmental product declaration) in a crisp PDF following EN 15804, one LCA report in Italian with a scanned signature page, one email that says "approximately 1.8 tCO2e per tonne, please confirm receipt," one Excel file with six sheets and a pivot table where the unit is mysteriously "kg CO2 / USD," four spreadsheets that report only Scope 1 and 2 emissions per company revenue, and four that never reply. Two of the suppliers will send numbers that are 40% apart for physically identical products. One will confuse biogenic and fossil CO2. One will quote GWP-20 instead of GWP-100. The "primary data" you collected is a crate of puzzle pieces from twelve different puzzles.
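A minimal sketch of the normalization step, assuming the numbers have already been pulled out of the documents. The `Submission` structure, the unit tables, and the sample values are hypothetical. The design choice worth copying is that anything not comparable gets flagged and returns no value, instead of being silently converted.

```python
from dataclasses import dataclass

EMISSION_TO_KG = {"kgCO2e": 1.0, "tCO2e": 1000.0}  # numerator to kgCO2e
MASS_TO_TONNES = {"t": 1.0, "kg": 0.001}           # denominator to tonnes

@dataclass
class Submission:
    supplier: str
    value: float
    emission_unit: str   # unit of the numerator, e.g. "tCO2e"
    per_unit: str        # unit of the denominator, e.g. "t", "kg", "USD"
    gwp_basis: str = "GWP-100"

def normalize(sub):
    """Return (kgCO2e per tonne, issues). The value is None if not comparable."""
    issues = []
    if sub.gwp_basis != "GWP-100":
        issues.append(f"GWP basis is {sub.gwp_basis}, not GWP-100")
    if sub.emission_unit not in EMISSION_TO_KG:
        issues.append(f"unknown emission unit '{sub.emission_unit}'")
    if sub.per_unit not in MASS_TO_TONNES:
        issues.append(f"denominator '{sub.per_unit}' is not a mass unit")
    if issues:
        return None, issues
    kg = sub.value * EMISSION_TO_KG[sub.emission_unit]
    return kg / MASS_TO_TONNES[sub.per_unit], issues

# Hypothetical submissions for the same product
submissions = [
    Submission("A (email)", 1.8, "tCO2e", "t"),
    Submission("B (EPD)", 2.1, "kgCO2e", "kg"),
    Submission("C (Excel)", 0.9, "kgCO2e", "USD"),
    Submission("D (LCA)", 2.6, "tCO2e", "t", gwp_basis="GWP-20"),
]

for sub in submissions:
    factor, issues = normalize(sub)
    if factor is None:
        print(f"{sub.supplier:10s} needs review: {'; '.join(issues)}")
    else:
        print(f"{sub.supplier:10s} {factor:8.0f} kgCO2e/t")
```

This sketch does not handle system boundaries or the biogenic versus fossil split, which need the same treatment.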

Third, most companies quietly use industry averages and call it primary data. I have read sustainability reports from household-name multinationals where the Scope 3 Category 1 methodology says "supplier-specific data where available; secondary data otherwise." Dig into the footnote and "where available" means about 8% of spend. The other 92% is Ecoinvent or USEEIO. Nothing about the methodology is technically wrong, but the language in the executive summary — "we engaged with our suppliers to collect primary data" — is doing a lot of heavy lifting. CDP scores this as if it were primary data. Auditors rarely push back. Everyone moves on.

None of this means Scope 3 is fake. It means Scope 3 is a measurement with honest uncertainty that the industry has learned to dress up as a precise number. The solution isn't to stop measuring. The solution is to get much better at extracting, normalizing, and tracking the actual data suppliers send you — so that when you say "primary data, 40% coverage" you mean it.

Timeline and why this is getting sharper, not softer

Scope 3 used to be optional. It isn't anymore.

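CSRD / ESRS E1 requires Scope 3 disclosure for every one of the 15 categories that is material, with limited assurance from day one. Wave 1 companies filed their first reports in 2025 for FY2024. Wave 2, originally due to file in 2026 for FY2025, was postponed by two years under the EU's "stop-the-clock" directive.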
SBTi requires a Scope 3 target if Scope 3 is 40% or more of total Scope 1, 2, and 3 emissions, which it is for almost everyone.
CDP's 2026 questionnaire introduced scoring penalties for Scope 3 methodologies that rely on spend-based factors above a threshold share.
The SEC's climate rule dropped Scope 3 at the federal level, but California SB 253 fills the gap: it requires Scope 3 disclosure starting in 2027, covering the prior fiscal year, for companies doing business in California with revenue over $1B.
ISSB's S2 treats Scope 3 as core, not supplementary.
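The SBTi threshold in the list above is a one-line check. Here is a sketch using a hypothetical inventory shaped like the one in the opening anecdote, where Scope 3 is twelve times Scope 1 and 2 combined. The tonnages are invented, and treating exactly 40% as triggering the requirement is my assumption.

```python
SBTI_SCOPE3_THRESHOLD = 0.40  # share of total Scope 1 + 2 + 3 emissions

def scope3_share(scope1_t, scope2_t, scope3_t):
    """Scope 3 as a fraction of total Scope 1 + 2 + 3 emissions."""
    return scope3_t / (scope1_t + scope2_t + scope3_t)

def needs_scope3_target(scope1_t, scope2_t, scope3_t):
    # Boundary handling (>= rather than >) is an assumption of this sketch.
    return scope3_share(scope1_t, scope2_t, scope3_t) >= SBTI_SCOPE3_THRESHOLD

# Hypothetical inventory in tCO2e: Scope 3 is 12x Scope 1 + 2
share = scope3_share(6_000, 4_000, 120_000)
print(f"Scope 3 share: {share:.1%}")
print("target required:", needs_scope3_target(6_000, 4_000, 120_000))
```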

The combined effect is that the next two reporting cycles will push roughly 80,000 companies globally into producing Scope 3 numbers that an auditor will look at in detail, probably for the first time. The casual methodology era is closing.

How Formist helps

Formist is an AI-powered compliance platform built by WeCarbon. It works like a knowledgeable colleague who has read the GHG Protocol Scope 3 Standard, every EPD format in circulation, and a lot of scanned supplier PDFs — and who doesn't mind being asked to do it again tomorrow.

For Scope 3 specifically, what Formist actually does is the messy middle. You upload the supplier document (an EPD in English, an LCA report in Italian, a test certificate in Chinese, an email thread with embedded screenshots) and the Formist AI agent extracts per-unit emission factors, identifies the GWP basis and boundary (cradle-to-gate, cradle-to-grave, A1–A3), normalizes units, and files the number against the right Scope 3 category and the right procurement line. When the document is ambiguous, it flags the ambiguity instead of guessing silently. When a supplier's number looks implausible against the industry average, it says so.
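To give a feel for what a plausibility check can look like, here is a deliberately simple sketch. It is not Formist's implementation. The function name, the band of 0.5x to 2x, and the sample values are all assumptions chosen for illustration.

```python
def plausibility_flag(supplier_kg_per_t, average_kg_per_t, low=0.5, high=2.0):
    """Compare a supplier factor to an industry-average factor for the same product."""
    ratio = supplier_kg_per_t / average_kg_per_t
    if ratio < low:
        return "suspiciously low"
    if ratio > high:
        return "suspiciously high"
    return "plausible"

AVERAGE = 2_000.0  # hypothetical industry-average factor, kgCO2e per tonne

for supplier, factor in [("A", 1_800.0), ("B", 600.0), ("C", 5_000.0)]:
    print(supplier, plausibility_flag(factor, AVERAGE))
```

A flag is a prompt to ask the supplier a question, not a verdict. A genuinely low-carbon supplier will trip the lower bound too.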

For your Category 1 long tail, Formist applies spend-based factors with the methodology labelled honestly, so your report distinguishes 32% primary-data coverage from 68% secondary, instead of laundering the distinction. The same card-based data model feeds your CBAM filings, your CDP response, your ESRS E1 disclosure, and your SBTi target boundary without re-entry. You stop typing the same tonne of aluminum into four different forms.
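Labelling coverage honestly is mostly bookkeeping: tag every line with the method that produced its number, then report shares by method. The sketch below is illustrative only. The rows, method labels, and totals are invented, and coverage is measured by emissions here, though measuring by spend is an equally reasonable choice.

```python
from collections import defaultdict

# Which method labels count as primary data (an assumption of this sketch)
PRIMARY_METHODS = {"supplier_specific"}

# (line item, kgCO2e, method used): hypothetical inventory
rows = [
    ("Aluminium cans", 3_200_000.0, "supplier_specific"),
    ("Glass bottles", 2_800_000.0, "average_data"),
    ("Long tail", 4_000_000.0, "spend_based"),
]

def coverage(rows):
    """Share of total emissions attributed to each calculation method."""
    by_method = defaultdict(float)
    for _line, kg, method in rows:
        by_method[method] += kg
    total = sum(by_method.values())
    return {method: kg / total for method, kg in by_method.items()}

shares = coverage(rows)
primary = sum(s for m, s in shares.items() if m in PRIMARY_METHODS)

for method, share in shares.items():
    print(f"{method:18s} {share:.0%}")
print(f"primary: {primary:.0%}, secondary: {1 - primary:.0%}")
```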

It doesn't build your supplier engagement strategy. It doesn't pick your materiality threshold. Those are still yours. What it replaces is the three months of analyst time spent reading PDFs, copying numbers into spreadsheets, and arguing about whether a cradle-to-grave factor is comparable to a cradle-to-gate one. That work is mechanical. It shouldn't be your calendar.

The Scope 3 number you report next year will be scrutinized by auditors who now have a methodology to compare it against. The companies that spent this year wrestling their supplier data into a clean, source-cited, category-mapped dataset will have an easy time. The companies that used industry averages and called it primary data will have an interesting conversation.

Formist is built by WeCarbon, a climate-tech company with offices in Paris, Shanghai, and Dubai. It supports the GHG Protocol Scope 3 Standard, CSRD/ESRS, CBAM, EU Taxonomy, CDP, ISSB, SBTi, and 15+ other sustainability frameworks.