Generate hundreds of thousands of rows of realistic, multi-year retail data, across nine fully-relational tables, in seconds. AdventureWorks-style schema, scientifically grounded customer behavior, fully reproducible.



Six product decisions that turn synthetic data from "dummy CSVs" into a production-grade dataset you can train models on, ship demos with, or teach from.
Generate hundreds of thousands of sales lines across nine relational tables in seconds. Streaming output keeps memory flat: push to millions of rows on a laptop without breaking a sweat.
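To make "streaming output keeps memory flat" concrete, here is a minimal sketch of the general technique, not erp-synth's actual implementation. The function names (`sales_lines`, `write_sales_csv`) and the four columns are hypothetical, chosen only for illustration. Rows are produced by a generator and written one at a time, so memory use does not grow with the row count.

```python
import csv
import io
import random
from typing import Iterator, TextIO, Tuple


def sales_lines(n_rows: int, seed: int) -> Iterator[Tuple[int, int, int, float]]:
    """Yield one hypothetical sales line at a time instead of building a list."""
    rng = random.Random(seed)  # local RNG, no global state
    for line_id in range(1, n_rows + 1):
        customer_id = rng.randint(1, 1_000)
        quantity = rng.randint(1, 5)
        unit_price = round(rng.uniform(1.0, 200.0), 2)
        yield line_id, customer_id, quantity, unit_price


def write_sales_csv(out: TextIO, n_rows: int, seed: int = 42) -> int:
    """Stream rows to an open text file; only one row is held in memory at a time."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["line_id", "customer_id", "quantity", "unit_price"])
    written = 0
    for row in sales_lines(n_rows, seed):
        writer.writerow(row)
        written += 1
    return written


# Usage: an in-memory buffer stands in for a real file here.
demo_buffer = io.StringIO()
demo_count = write_sales_csv(demo_buffer, n_rows=1_000)
```

With a real file handle in place of the buffer, the same loop scales to millions of rows because nothing accumulates between iterations.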
Cohort behavior anchored in buy-till-you-die customer-base models from the marketing-science literature. Discrete choice from McFadden's Nobel-winning framework. Schema follows Microsoft AdventureWorks. Ten cited papers.
Twenty-plus controls over market, seasonality, inflation rate, returns probability, promotion frequency, customer field completeness, basket composition. Every realism axis has a knob.
Plain CSV. Works with pandas, Excel, Power BI, Tableau, dbt, DuckDB, Postgres, Snowflake, or anything else that reads a delimited file. No vendor lock-in. No row caps. No throttling.
US, GCC, and EU presets out of the box. Locale, currency, VAT, weekend (Sat/Sun vs Fri/Sat), payment methods, plus market-specific holiday calendars (Black Friday, Christmas, Ramadan, Eid, Boxing Week), all swappable.
One seed value drives every random number in the system. Run it twice and get identical CSVs, byte for byte. Continuous integration verifies this on every push. Your demos won't drift.
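One way to picture a swappable market preset is as a small immutable record. The sketch below is an assumption about shape only: the class `MarketPreset`, its field names, and every value in `PRESETS` (locales, currencies, VAT rates, holiday assignments) are illustrative placeholders, not erp-synth's real configuration. Only the Sat/Sun versus Fri/Sat weekend split comes from the description above.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class MarketPreset:
    """Hypothetical preset record; field names are illustrative."""
    locale: str
    currency: str
    vat_rate: float
    weekend: FrozenSet[int]      # date.weekday() values, Monday=0 ... Sunday=6
    holidays: Tuple[str, ...]


# Placeholder values for illustration only.
PRESETS = {
    "US": MarketPreset("en_US", "USD", 0.00, frozenset({5, 6}), ("Black Friday", "Christmas")),
    "GCC": MarketPreset("ar_AE", "AED", 0.05, frozenset({4, 5}), ("Ramadan", "Eid")),
    "EU": MarketPreset("de_DE", "EUR", 0.20, frozenset({5, 6}), ("Christmas", "Boxing Week")),
}


def is_weekend(day: date, preset: MarketPreset) -> bool:
    """True if the given day falls on the preset's weekend."""
    return day.weekday() in preset.weekend


# Swapping a single field yields a new preset and leaves the original untouched.
custom_eu = replace(PRESETS["EU"], currency="SEK")
```

Under these assumptions, Friday 5 January 2024 counts as a weekend day for the GCC preset but as a working day for the US preset.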
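The byte-for-byte claim can be checked with a hash comparison. The following is a minimal sketch of that kind of check, assuming a toy generator; `generate_csv_bytes` and its two columns are hypothetical stand-ins, not the project's API or its CI test. The key idea is that a single seeded `random.Random` instance is the only source of randomness.

```python
import csv
import hashlib
import io
import random


def generate_csv_bytes(seed: int, n_rows: int = 100) -> bytes:
    """Toy generator: every random value comes from one seeded RNG instance."""
    rng = random.Random(seed)
    buf = io.StringIO()
    # A fixed line terminator keeps the bytes identical across platforms.
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["order_id", "amount"])
    for order_id in range(1, n_rows + 1):
        writer.writerow([order_id, f"{rng.uniform(5, 500):.2f}"])
    return buf.getvalue().encode("utf-8")


def digest(data: bytes) -> str:
    """SHA-256 hex digest, convenient for comparing two runs."""
    return hashlib.sha256(data).hexdigest()


# Two runs with the same seed should hash identically.
first_run = digest(generate_csv_bytes(seed=7))
second_run = digest(generate_csv_bytes(seed=7))
```

A CI job could run the real generator twice with the same seed and fail the build if the digests differ.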
No accounts, no API keys, no cloud setup, no NDAs. Pure Python, pure CSV out, pure determinism.
Pick a market: US, GCC, or EU. Set a date range. Decide how many customers, products, stores, and promotions you want. Every realism axis comes with a sensible default and an override.
Realistic, multi-year retail data fans out across nine fully-relational tables in seconds. Streaming output keeps memory flat, so generating millions of rows on a laptop is routine.
Plug straight into pandas, Power BI, Tableau, dbt, DuckDB, your warehouse, or anything else that reads CSV. No vendor lock-in, no row caps. The data is yours, forever, byte-reproducible.
Whether you're prototyping a churn model, demoing a dashboard, teaching SQL, or staging a data pipeline, erp-synth gives you data that looks and behaves like the real thing, without an NDA.
One workflow, a few seconds, hundreds of thousands of rows. Or browse the technical deep-dive for the schema, cohort math, and bibliography.