
Why leaflet data matters

Every week, retailers print and publish what is, in effect, their pricing strategy: which products are being pushed, at what discount, for how long, in which categories, with what visual prominence. Most of that data sits trapped in PDFs, JPGs, and HTML pages on aggregator sites. The companies that systematically extract and analyze it gain a structural advantage on pricing, trade promotion ROI, and category management. This page collects the evidence — and the use cases — that explain why.


The economics of trade promotion

Trade promotion — the discounts, displays, features, and circulars retailers run with funding from manufacturers — is not a marketing line item. It’s typically the second-largest expense on a CPG company’s P&L, behind only cost of goods sold.

11–27%
of CPG company revenue spent on trade promotions, ranging across categories.
Promotion Optimization Institute
~70%
of those investments don't return their cost — and most companies can't tell which ones.
POI / industry consensus
$8B
spent on grocery feature ads annually (US), roughly equal to grocery retailers' net profit.
Stanford GSB
$19.8B
size of the broader US promotional products industry, growing at ~2.3% CAGR.
IBISWorld 2025

The basic problem is that trade-promotion decisions are still made on weekly or monthly POS extracts that are days or weeks old by the time they reach a revenue manager’s desk. Competitor moves are reverse-engineered from sales declines. By the time you know your rival ran a 30% milk promotion across 180 stores, the round is already over.

Leaflet data fixes the lag: the leaflet is published before the promotion runs. A pipeline that captures and structures it gives revenue, brand, and category teams a forward-looking signal — what’s coming, where, at what price — instead of a backward-looking one.


What’s actually in a leaflet

A typical retail leaflet — supermarket, hypermarket, pharmacy, electronics — contains, on every page:

That’s a structured dataset hiding inside a JPEG. Manually transcribing it is where every team trying to do this in-house gets stuck. AI vision extraction turns each page into rows in seconds — 8–24 promotions per page, multiplied across hundreds of pages per retailer per week, multiplied across every retailer in the category.
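As a rough illustration of what one such row could look like, the sketch below models a single extracted promotion as a Python dataclass. The class name and every field (`retailer`, `product_name`, `promo_price`, `regular_price`, `valid_from`, `valid_to`, `category`, `page`) are assumptions made for this example, not the actual py-leaflets schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PromotionRow:
    """One promotion extracted from a leaflet page (hypothetical schema)."""
    retailer: str
    product_name: str
    promo_price: float
    regular_price: Optional[float]  # not always printed on the leaflet
    valid_from: date
    valid_to: date
    category: str
    page: int  # 1 = front page, a crude proxy for visual prominence

    def discount_pct(self) -> Optional[float]:
        """Discount depth in percent, or None when no regular price is known."""
        if self.regular_price is None or self.regular_price <= 0:
            return None
        return round(100 * (1 - self.promo_price / self.regular_price), 1)

    def duration_days(self) -> int:
        """Inclusive length of the promotion window in days."""
        return (self.valid_to - self.valid_from).days + 1


# Made-up example values, for illustration only.
row = PromotionRow(
    retailer="ExampleMart",
    product_name="Fresh milk 1L",
    promo_price=7.0,
    regular_price=10.0,
    valid_from=date(2025, 3, 3),
    valid_to=date(2025, 3, 9),
    category="dairy",
    page=1,
)
```

Keeping derived values such as discount depth and duration as methods rather than stored fields means the row holds only what was printed on the page.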


Six stakeholder use cases

USE CASE 1

CPG manufacturers — promotion effectiveness

Front-page leaflet ads are a core input to predicting promotion effectiveness. AI-powered planning can run scenarios for the whole year, but those scenarios are only as good as the competitive promotional data fed in. Manufacturers compare their own promotional pressure against competitors' by SKU, region, and season, and reallocate trade spend toward the windows that move volume.

USE CASE 2

Distributors — pricing & replenishment

Distributors track what their retailer customers are promoting (volume signal for replenishment) and what those retailers' competitors are promoting (signal for category-wide demand shifts). The data flows directly into pricing algorithms, demand forecasts, and customer conversations about trade terms.

USE CASE 3

Brand managers — compliance audit

When a brand pays a retailer for promotional placement, the brand wants to know it actually happened — at the agreed price, on the agreed page, in the agreed weeks. Leaflet data is the primary auditable record. Compare contracted promotions against what's actually printed.
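A minimal sketch of that comparison, assuming both the contract and the extracted leaflet rows can be keyed by SKU and week. The `Promo` record, the `audit` function, and the sample values are hypothetical and only illustrate the idea.

```python
from typing import NamedTuple


class Promo(NamedTuple):
    sku: str
    week: str    # ISO week label, e.g. "2025-W10"
    price: float
    page: int


def audit(contracted: list[Promo], printed: list[Promo]) -> list[tuple[str, str, str]]:
    """Return (sku, week, issue) for each contracted promo that is missing or deviates."""
    seen = {(p.sku, p.week): p for p in printed}
    issues = []
    for c in contracted:
        p = seen.get((c.sku, c.week))
        if p is None:
            issues.append((c.sku, c.week, "missing from leaflet"))
            continue
        if abs(p.price - c.price) > 0.005:
            issues.append((c.sku, c.week, f"price {p.price:.2f} != agreed {c.price:.2f}"))
        if p.page != c.page:
            issues.append((c.sku, c.week, f"page {p.page} != agreed {c.page}"))
    return issues


contracted = [
    Promo("SKU-001", "2025-W10", 7.00, 1),
    Promo("SKU-002", "2025-W10", 12.50, 3),
    Promo("SKU-003", "2025-W11", 4.25, 1),
]
printed = [
    Promo("SKU-001", "2025-W10", 7.00, 1),   # matches the contract
    Promo("SKU-002", "2025-W10", 13.00, 5),  # wrong price and wrong page
]
issues = audit(contracted, printed)
```

Here `issues` flags the price and page deviations for SKU-002 and reports SKU-003 as missing. In practice the hard part is the SKU match itself, which this sketch assumes is already solved.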

USE CASE 4

Category managers — assortment benchmarking

How does our category footprint compare to a rival chain's? Are they pushing private label harder this quarter? Is the entire category resetting on a new pack size? Leaflet archives — historical, structured, queryable — answer questions a one-off market study can't.

USE CASE 5

Pricing & revenue teams — model features

Promotion data is one of the strongest exogenous features in retail demand forecasting. Lag your sales by a week, line up the leaflet that ran in the same window, and you have a high-signal feature for both your own volume and your cannibalization model. Most teams build forecasts without it because the data is too painful to assemble.
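The alignment described above can be sketched in plain Python, under the assumption that sales and leaflet rows share an ISO-week key. The sales figures, retailer labels, and feature names below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical weekly unit sales for one SKU, keyed by ISO week.
weekly_sales = {"2025-W09": 1200, "2025-W10": 1850, "2025-W11": 900, "2025-W12": 1150}

# Hypothetical leaflet rows: (week, retailer, discount_pct). "own" marks our banner.
leaflet_rows = [
    ("2025-W10", "own", 30.0),
    ("2025-W11", "rival_a", 25.0),
    ("2025-W11", "rival_b", 20.0),
]

# Aggregate leaflet rows to one value per week.
own_depth = defaultdict(float)
rival_count = defaultdict(int)
for week, retailer, discount in leaflet_rows:
    if retailer == "own":
        own_depth[week] = max(own_depth[week], discount)
    else:
        rival_count[week] += 1

# Zero-padded ISO week labels within one year sort chronologically as strings.
weeks = sorted(weekly_sales)
features = []
for prev_week, week in zip(weeks, weeks[1:]):
    features.append({
        "week": week,
        "units": weekly_sales[week],
        "units_lag1": weekly_sales[prev_week],            # last week's sales
        "own_discount_pct": own_depth.get(week, 0.0),     # own promo depth this week
        "rival_promos": rival_count.get(week, 0),         # competitor promos this week
    })
```

Each entry in `features` is one training row: the lagged sales carry the baseline, and the two leaflet-derived columns carry the promotional signal for own volume and for cannibalization.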

USE CASE 6

Trade marketing — ROI attribution

Reallocating spend from the worst-performing 20% of promotions to the best-performing 20% can drive 1–2% of revenue straight to the bottom line. You can only do that if you can measure each promotion's contribution — which requires knowing every promotion that ran, not just the ones you funded.
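The arithmetic behind that reallocation can be sketched as follows. The promotion list and its numbers are invented, and the calculation rests on a strong simplifying assumption: that the best-performing bucket keeps its average return on the extra spend.

```python
import math

# Hypothetical promotions: (name, spend, incremental_margin).
promos = [
    ("A", 100.0, 180.0),
    ("B", 100.0, 150.0),
    ("C", 100.0, 110.0),
    ("D", 100.0, 95.0),
    ("E", 100.0, 90.0),
    ("F", 100.0, 80.0),
    ("G", 100.0, 70.0),
    ("H", 100.0, 60.0),
    ("I", 100.0, 40.0),
    ("J", 100.0, 20.0),
]

# Rank by return per unit of spend, best first.
ranked = sorted(promos, key=lambda p: p[2] / p[1], reverse=True)
k = max(1, math.floor(0.2 * len(ranked)))  # size of each 20% bucket
best, worst = ranked[:k], ranked[-k:]

freed_spend = sum(spend for _, spend, _ in worst)
lost_margin = sum(margin for _, _, margin in worst)

# Assumption: extra spend in the best bucket earns that bucket's average return.
best_roi = sum(m for _, _, m in best) / sum(s for _, s, _ in best)
gained_margin = freed_spend * best_roi
net_gain = gained_margin - lost_margin
```

Real promotions show diminishing returns, so `net_gain` is an upper bound under this assumption rather than a forecast.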


Where the value comes from

Industry research consistently lands on a similar pattern: the upside isn’t from running more promotions, it’s from running fewer bad ones.

Reported revenue uplift from promotional intelligence
Across multiple industry sources (POI, McKinsey, NVIDIA 2026, Lingaro/RGM case study)

  Reallocate failed promotion spend: +1–2% revenue (POI)
  Advanced analytics & RGM: +3–5% gross profit (McKinsey)
  RGM analytics rollout (case study): +7% YoY revenue (Lingaro)
  AI in retail/CPG (any uplift): 89% saw revenue lift (NVIDIA 2026)
  AI in retail/CPG (>10% lift): 30% saw >10% revenue lift (NVIDIA 2026)

These are different studies measuring different things, but they all point in the same direction: structured, current, competitive promotional data is one of the highest-leverage inputs a CPG or distribution business can add to its analytics stack — and most companies still don’t have it in usable form.


The maturity curve

Where companies typically sit when they start engaging with promotional intelligence:

Stage 0 — Anecdotal

Sales reps email screenshots. Marketing keeps a SharePoint of competitor PDFs. No structured archive, no time-series, no queryability.

Stage 1 — Periodic audit

An agency or analyst pulls competitor leaflets monthly and writes a deck. Insightful when fresh; stale within two weeks; not addressable as data.

Stage 2 — Subscription tools

Buy a feed from Datasembly, Circana, NielsenIQ, or a regional aggregator. Coverage varies by region; per-row costs add up; data isn't always queryable in your stack.

Stage 3 — Owned pipeline

Build or operate a continuous capture + extraction pipeline. Full control over sources, schema, retention, and downstream integrations. py-leaflets sits here.

The right answer depends on geography (how well-served your region is by existing data vendors), category (whether your SKUs are reliably tagged in third-party feeds), and how many of your downstream use cases need the raw data vs aggregated charts. For most distribution companies and mid-size CPG manufacturers operating outside the US grocery mainstream, Stage 3 is materially cheaper than Stage 2 within 12 months — and unlocks use cases (custom features, compliance audit, model integration) that subscription tools don’t address.


Regional context — MENA / UAE

py-leaflets’s first source covers the UAE. The regional data points are worth flagging:

Why MENA is structurally hard for traditional retail data. Over 90% of UAE food is imported, the seven emirates have separate administrative rules, and the retail landscape ranges from international hypermarket chains to single-emirate retailers. Most subscription tools cover the chains and miss the rest. A pluggable scraper architecture matches the fragmentation better than a one-vendor data subscription.

Channel-specific dynamics:

For a distribution company operating in this market, no single subscription covers all of those without compromise — which is exactly the gap a pluggable, source-agnostic pipeline fills.


What still needs to be true for value to land

Honest framing: leaflet data is necessary but not sufficient. To turn it into P&L impact a business needs three other ingredients:

  1. A clean SKU master. Promotions match against products. If your internal product catalog isn’t reliable, no amount of competitor data will drive better decisions.
  2. A decision owner. Someone — pricing manager, trade marketing lead, category manager — needs to be empowered to act on the signal. Data without an owner is a dashboard nobody opens.
  3. A measurement loop. Whatever you change because of the data, measure the result. The 1–2% revenue gain only materializes if you actually rebalance, not just observe.

py-leaflets handles the first ingredient on the data side. The second and third are organizational — but they’re where the ROI lives.



Ready to see this in your category?

Pilot a source, integrate with your stack, or read the technical implementation.

Technical Overview →