Lesson 5 — Text analytics & shipping the final report
Time: ~30 min. You’ll be able to:
- Use Claude Code for the analytics task that pre-AI tools couldn’t do well: extracting themes from thousands of free-text reviews
- Iterate a classification taxonomy when the first pass produces a giant “other” bucket
- Assemble a stakeholder-ready report that combines numbers from Days 1–4 with qualitative themes from today
- Keep the business interpretation as yours — don’t let Claude write the recommendations
This is the lesson that justifies AI’s place in the analytics stack. Everything before today, a Python script could have done. The text analysis below could not.
Why text is genuinely AI-shaped work
The Olist reviews include review_comment_message: free-text Portuguese comments from customers. Roughly 40% of reviews have a comment. Among 1- and 2-star reviews, that’s roughly ten thousand short complaints.
You cannot, with Excel / SQL / vanilla pandas:
- Read 30,000 Portuguese comments and find the patterns
- Distinguish “the delivery was late” from “the product was wrong”
- Translate, classify, and summarise at scale
You can with a 2018 setup involving NLP libraries (NLTK, spaCy, custom training data) — but that’s a week of setup for a one-day analysis. Claude Code makes it 30 minutes.
The theme-extraction workflow
Six steps. Time-box each.
1. Filter to unhappy reviews with text (2 min)
INPUT: data/olist/olist_order_reviews_dataset.csv
TASK: Filter to rows where review_score IN (1, 2) AND review_comment_message
is non-empty. Tell me how many rows you have.
OUTPUT: the filtered count, and a sample of 5 comments to confirm
they're Portuguese and they're complaints.
You should see ~9–12K rows. Skim the 5 samples to confirm scope.
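If you want an independent check on the count Claude reports, the same filter is a few lines of pandas. This is a minimal sketch rather than the course's reference solution: the function name `filter_unhappy_with_text` is hypothetical, the column names are the ones used in the prompt above, and the four-row demo frame stands in for the real CSV so the block runs on its own.

```python
import pandas as pd


def filter_unhappy_with_text(reviews: pd.DataFrame) -> pd.DataFrame:
    """Keep 1- and 2-star reviews whose comment is non-empty after trimming."""
    comment = reviews["review_comment_message"].fillna("").astype(str).str.strip()
    mask = reviews["review_score"].isin([1, 2]) & comment.ne("")
    return reviews.loc[mask].copy()


# Tiny stand-in for pd.read_csv("data/olist/olist_order_reviews_dataset.csv").
demo = pd.DataFrame({
    "order_id": ["a", "b", "c", "d"],
    "review_score": [1, 5, 2, 1],
    "review_comment_message": ["Produto veio errado", "Ótimo", "   ", None],
})

unhappy = filter_unhappy_with_text(demo)
print(len(unhappy))  # 1: only order "a" is low-scored and has real text
```

Treating whitespace-only comments as empty is a choice made here, not something the dataset dictates. If your count differs from Claude's, that is the first thing to compare.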
2. Propose an initial taxonomy (5 min)
TASK: Without classifying yet, sample 50 of the filtered comments and
propose 5–7 mutually exclusive theme labels that would cover them.
For each, give a name (short, snake_case) and a one-sentence definition.
OUTPUT: a Markdown table: name | definition | example_comment_in_portuguese.
You’re letting Claude do the cognitive work of seeing patterns. The output is a proposed taxonomy — not the final one.
Read the proposed themes. Push back if any feel redundant or vague. A starter set for Olist:
| theme | meaning |
|---|---|
| delivery_late | Slow shipping, took too long, didn’t arrive on time |
| no_show | Order never arrived at all |
| wrong_product | Wrong item / different from description |
| damaged | Item arrived broken or damaged |
| quality | Item works but is low quality / not as described |
| other | Doesn’t fit the above |
The other bucket is the relief valve. If it’s <15% in the final pass, the taxonomy is good. If it’s 30%+, the taxonomy needs more buckets.
3. Classify all comments (5 min)
INPUT: the filtered review comments from step 1.
TASK: Classify each comment into one of the themes from step 2.
The comments are in Portuguese — handle that.
Reviews can have multiple themes; pick the dominant one (the primary
complaint).
OUTPUT: a CSV at capstone/day5_ai/review_themes.csv with columns:
order_id, review_score, review_comment_message, theme.
Plus a summary table: theme | count | pct_of_total.
Claude runs through all rows. This takes a minute or two for ~10K rows.
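The summary table is easy to recompute from the theme column, which is a cheap way to confirm that Claude's counts match the CSV it wrote. The sketch below assumes only a `theme` column. The helper names `theme_summary` and `other_share` are hypothetical, and the ten labels are made-up demo data.

```python
import pandas as pd


def theme_summary(themes: pd.Series) -> pd.DataFrame:
    """Build the theme | count | pct_of_total table, largest theme first."""
    summary = themes.value_counts().rename_axis("theme").reset_index(name="count")
    summary["pct_of_total"] = (100 * summary["count"] / summary["count"].sum()).round(1)
    return summary


def other_share(themes: pd.Series) -> float:
    """Fraction of comments that landed in the 'other' bucket."""
    return float((themes == "other").mean())


# Made-up labels standing in for the theme column of review_themes.csv.
labels = pd.Series(["delivery_late"] * 5 + ["quality"] * 2 + ["other"] * 3)

print(theme_summary(labels))
print(f"other share: {other_share(labels):.0%}")  # other share: 30%
```

On the real file you would pass `pd.read_csv("capstone/day5_ai/review_themes.csv")["theme"]` and compare the `other` share against the thresholds this lesson uses.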
4. Spot-check the classification (5 min)
The verification habit from Lesson 3, applied harder:
TASK: Pick 5 reviews at random from each theme. Show me:
- the original Portuguese comment
- the Claude-assigned theme
- a 1-line English summary of what the comment actually says
For each, ask yourself: does the theme match the comment? You may not speak Portuguese. That’s fine — Claude’s English summary is what you’re checking. If the summary doesn’t fit the theme, the classification is wrong; investigate.
Patterns to watch for:
- Generic comments classified specifically. “Terrible service” → assigned to delivery_late when it could be anything. Ask Claude to mark vague comments as other.
- Mixed complaints picked arbitrarily. A comment about both “late and damaged” picked as just one. Decide whether to count once or to allow multi-theme.
- English mixed in. A few Olist comments are in English. Worth a separate spot-check.
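If you would rather draw the spot-check sample yourself than trust Claude to pick "at random", a stratified sample takes two lines of pandas. This is a sketch under assumptions: `sample_per_theme` is a hypothetical helper, the demo frame is invented, and shuffling followed by `groupby(...).head(n)` is just one way to take up to `n` rows per theme without failing on small themes.

```python
import pandas as pd


def sample_per_theme(classified: pd.DataFrame, n: int = 5, seed: int = 0) -> pd.DataFrame:
    """Return up to n randomly chosen rows for each theme."""
    shuffled = classified.sample(frac=1, random_state=seed)  # random row order
    return shuffled.groupby("theme").head(n).sort_values("theme")


# Invented stand-in for capstone/day5_ai/review_themes.csv.
demo = pd.DataFrame({
    "review_comment_message": [f"comentário {i}" for i in range(10)],
    "theme": ["delivery_late"] * 8 + ["damaged"] * 2,
})

picked = sample_per_theme(demo, n=5)
print(picked["theme"].value_counts().to_dict())
```

Fixing the seed makes the sample reproducible, so you can paste the same rows back to Claude and ask for the English summaries.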
5. Iterate the taxonomy (5 min)
If the other bucket is over 20% or the spot-check found classification errors, iterate:
The 'other' bucket has 28% of reviews — too high. Sample 20 reviews from
'other' and tell me what they're about. Then propose 1–2 new theme
categories I should add.
Apply the suggested additions. Re-classify. Re-spot-check. Two rounds is usually enough.
6. Cross with sellers/categories (5 min)
The real analytical payoff: which sellers or categories are dominated by which themes?
INPUT: capstone/day5_ai/review_themes.csv, plus data/olist/olist.db
(orders, items, sellers, products tables).
TASK: For each of the top 10 risky sellers from capstone/day2_sql/risky_sellers.csv,
compute the breakdown of their bad reviews by theme. Same for the
top 5 worst categories from capstone/day2_sql/worst_categories.csv.
OUTPUT: two tables — one per seller, one per category. Each row shows the
seller/category, total bad reviews, and the % breakdown per theme.
This is the answer to the business question. Now you know whether seller X’s problem is logistics (fix it via SLAs / carrier change) or product quality (fix it by dropping the seller or auditing inventory).
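The per-seller breakdown reduces to a join plus a row-normalised crosstab, which is worth knowing so you can verify the table Claude returns. The sketch below is illustrative only: `theme_breakdown` is a hypothetical helper, the frames are invented, and it assumes the items table exposes `order_id` and `seller_id` columns as in the public Olist schema.

```python
import pandas as pd


def theme_breakdown(themes: pd.DataFrame, items: pd.DataFrame, key: str = "seller_id") -> pd.DataFrame:
    """Percent of bad reviews per theme for each value of `key`."""
    # One row per (order, key) pair so multi-item orders are not double counted.
    pairs = items[["order_id", key]].drop_duplicates()
    joined = themes.merge(pairs, on="order_id", how="inner")
    pct = pd.crosstab(joined[key], joined["theme"], normalize="index").mul(100).round(1)
    pct.insert(0, "total_bad_reviews", joined.groupby(key).size())
    return pct


# Invented stand-ins for review_themes.csv and the order items table.
themes = pd.DataFrame({
    "order_id": ["o1", "o2", "o3", "o4"],
    "theme": ["delivery_late", "delivery_late", "damaged", "quality"],
})
items = pd.DataFrame({
    "order_id": ["o1", "o1", "o2", "o3", "o4"],
    "seller_id": ["s1", "s1", "s1", "s1", "s2"],
})

table = theme_breakdown(themes, items)
print(table)
```

An order fulfilled by several sellers counts once for each of them here, which may or may not be what you want. The same function would work for categories if you first join a category column onto the items table and pass its name as `key`.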
Assembling the final report
The capstone deliverable is a 1-page markdown file: capstone/day5_ai/final_report.md. Required sections:
| Section | What’s in it | Where the data comes from |
|---|---|---|
| TL;DR | 3 bullets — punchline, 30 seconds to read | Synthesised across all days |
| The data | One paragraph about Olist, the time range, the question | Generic intro |
| Finding 1 | Delivery time drives review score | Day 3 chart delivery_vs_review.png + Day 2 numbers |
| Finding 2 | A concentrated set of sellers causes disproportionate harm | Day 2 / 3 risky_sellers, Day 4 dashboard table |
| Finding 3 | A meaningful share of bad reviews are product / quality, not delivery | Today’s theme analysis |
| Dashboard | Embed dashboard.png from Day 4 | Day 4 |
| Recommendations | 3 specific actions for Olist’s ops team | Your judgment — not Claude’s |
Claude can draft the first six sections well. The seventh is where you stop and write it yourself.
Drafting with Claude
INPUT: All CSVs in capstone/ (day2_sql/, day3_python/, day5_ai/),
and the structure I'll paste below.
TASK: Draft a 1-page markdown report following the structure.
Use real numbers from the CSVs — don't invent any.
For each number, add a parenthetical citing the source file.
OUTPUT: capstone/day5_ai/final_report.md.
Structure: [paste the table above with your section names]
Read the draft. Trace every number to its source CSV. Fix anything that doesn’t reconcile.
Replace the recommendations yourself
For the recommendations section, close Claude Code and write it yourself. Five minutes. Three concrete actions for Olist’s marketplace ops team. Specific names of sellers, specific categories, specific interventions.
Example shape (don’t copy — write your own based on what you found):
## Recommended actions
1. **Logistics intervention for the 5 worst-delivery sellers**
(seller IDs: X, Y, Z, …). Their on-time orders rate as well as the
marketplace average — the issue is fulfillment, not the product.
Action: SLA enforcement or carrier change.
2. **Quality review for `furniture` and `electronics`.**
Even on-time orders in these two categories score 1.2 stars below average.
Action: spot-check seller listings; audit return rates per SKU.
3. **Make the delivery estimate visible at checkout.**
The data shows estimate accuracy correlates with satisfaction independently
of actual speed — customers tolerate a slow delivery they expected but not
a fast one that broke the estimate. Action: tighten estimator,
show it prominently.
This section is why analysts get hired. Don’t outsource it.
Sanity-check the final report
Before submitting:
- Every number is traceable to one of the day-2/3/4 CSVs or to today’s theme analysis
- The chart from Day 3 (delivery_vs_review.png) renders
- The dashboard PNG from Day 4 renders
- The recommendations are in your own words and feel specific
- TL;DR is exactly 3 bullets — not 2, not 5
- Total length: 1 page when rendered. Cut if longer.
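Two of these checks are mechanical enough to script: the TL;DR bullet count and whether the embedded images exist on disk. The sketch below uses only the standard library and rests on assumptions about your report: that the TL;DR sits under a Markdown heading containing "TL;DR", that bullets start with `- ` or `* `, and that images use the `![alt](path)` form. Both function names are hypothetical.

```python
import re
from pathlib import Path

IMAGE_REF = re.compile(r"!\[[^\]]*\]\(([^)\s]+)")


def tldr_bullets(markdown: str) -> int:
    """Count bullet lines under the first heading that mentions TL;DR."""
    count, inside = 0, False
    for line in markdown.splitlines():
        if line.startswith("#"):
            if inside:
                break  # the next heading ends the TL;DR section
            inside = "tl;dr" in line.lower()
        elif inside and line.lstrip().startswith(("- ", "* ")):
            count += 1
    return count


def missing_images(markdown: str, base_dir: Path) -> list[str]:
    """Image paths referenced in the report that do not exist under base_dir."""
    return [ref for ref in IMAGE_REF.findall(markdown) if not (base_dir / ref).exists()]


# Invented draft standing in for capstone/day5_ai/final_report.md.
draft = """# Olist report
## TL;DR
- first point
- second point
- third point
## The data
- not a TL;DR bullet
![Dashboard](dashboard.png)
"""

print(tldr_bullets(draft))  # 3
print(missing_images(draft, Path("no_such_directory")))  # ['dashboard.png']
```

This only confirms that the files exist. It does not replace opening the rendered report and looking at it.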
??? note "Try it yourself: propose a taxonomy"
Open Claude Code. Use the Lesson 2 INPUT/TASK/OUTPUT template to ask:
*"Sample 50 reviews from olist_order_reviews_dataset.csv where review_score is 1 or 2 and the comment is non-empty. Propose 5–7 mutually exclusive theme categories that cover them. Don't classify yet — just propose."*
Read Claude's proposed taxonomy. Spot-check 3 of the example comments against the proposed themes. Note any themes that overlap or feel vague.
Then iterate: point out the overlaps and ask Claude to propose a better taxonomy.
??? success "What good iteration looks like"
**First pass** might give you: `delivery`, `product_quality`, `customer_service`, `wrong_item`, `other`.
Problems you might spot:
- `delivery` is too broad — split into `delivery_late` and `no_show`?
- `customer_service` doesn't show up in the actual reviews; Claude inferred it generically.
**Second pass**, after pushback: `delivery_late`, `no_show`, `wrong_product`, `damaged`, `quality`, `other`. Tighter, and all five named themes (everything except `other`) are visible in the data.
Two rounds of iteration is normal. More than three suggests the data doesn't cluster cleanly — accept a bigger `other` bucket or pivot to a finer-grained taxonomy.
Common pitfalls
- Letting Claude write the recommendations. Generic recommendations are the giveaway that no human thought about the data. Always write yours.
- Trusting the first taxonomy. Always do at least one spot-check pass.
- Numbers in the report that don’t reconcile. Every TL;DR bullet needs a source in the Day 2/3/4 CSVs or in today’s theme analysis. If you can’t cite it, cut it.
- Embedding screenshots without testing the markdown. Open the report after writing; confirm both PNGs render.
- Submitting a report longer than one page. The constraint isn’t aesthetic — it’s the test of whether you’ve actually synthesised. Cut.
You’ve finished the lessons
Take the self-test — 12 questions covering the five lessons. Then ship: Day 5 capstone — final report + presentation.