Lesson 2 — Prompting that works
Time: ~25 min. You’ll be able to:
- Write prompts that consistently produce useful, verifiable output
- Use the INPUT / TASK / OUTPUT template — the most useful 30 seconds of the day
- Be specific about file paths, columns, scopes, and result shapes
- Recognise when a prompt is too vague and tighten it
- Iterate on prompts when the first answer isn’t right
If you take one thing from Day 5, take this lesson. The difference between a 10-second answer that’s correct and a 30-minute analysis that’s generic is mostly about the prompt.
The single most useful prompt template
INPUT: <which file(s), which columns, which subset>
TASK: <what to do in plain English>
OUTPUT: <what shape of result you want — a table, list, summary, chart>
Three lines. Used consistently, they produce a 5-10× quality lift over freeform prompting.
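If you find yourself retyping the template, a tiny helper can assemble it for you. This is only a sketch: `build_prompt` and its parameter names are hypothetical, not part of any Claude tooling, and the example values are placeholders you would swap for your own.

```python
def build_prompt(input_spec: str, task: str, output_spec: str) -> str:
    """Assemble a three-line INPUT / TASK / OUTPUT prompt (hypothetical helper)."""
    parts = {"INPUT": input_spec, "TASK": task, "OUTPUT": output_spec}
    # Refuse to build a prompt with a blank section: that is the vague case.
    missing = [name for name, text in parts.items() if not text.strip()]
    if missing:
        raise ValueError(f"Empty section(s): {', '.join(missing)}")
    return "\n".join(f"{name}: {text.strip()}" for name, text in parts.items())


prompt = build_prompt(
    input_spec="data/olist/olist_order_reviews_dataset.csv, column review_score",
    task="Count reviews per review_score.",
    output_spec="A Markdown table with columns review_score and count.",
)
print(prompt)
```

The only design choice here is the empty-section check: it forces you to fill in all three lines before you send anything.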
Compare two prompts
Too vague:
Analyse the reviews.
What you’ll get back: a generic 200-line essay with high-level observations, half of which don’t apply to Olist, and zero verifiable numbers.
Specific:
INPUT: `data/olist/olist_order_reviews_dataset.csv`, filtered to rows where `review_score <= 2` and `review_comment_message` is non-empty.
TASK: Sample 200 of those reviews. Classify each into one of six themes: `delivery_late`, `wrong_product`, `damaged`, `quality`, `no_show`, `other`. The comments are in Portuguese; handle that.
OUTPUT: a table with `theme | count | three_sample_comments`, plus the assumptions you made when classifying.
What you’ll get back: an actual table with numbers you can verify and decisions you can push back on.
The difference isn’t AI cleverness; it’s specificity.
Specificity, in detail
Be specific about inputs
| Vague | Specific |
|---|---|
| "the reviews" | "rows in olist_order_reviews_dataset.csv where review_score <= 2" |
| "the orders" | "the orders table in olist.db, joined with customers to get state" |
| "the data" | (don’t say this) |
File paths and column names beat noun phrases. Claude has no way to disambiguate “the data” without asking; pre-empt the question.
Be specific about scope
| Vague | Specific |
|---|---|
| "all customers" | "customers with at least 5 orders" |
| "recent orders" | "orders placed after 2018-01-01" |
| "bad reviews" | "reviews with score 1 or 2 and a non-empty comment" |
Edge cases (NULLs, empty strings, undelivered orders) are where vague prompts produce wrong numbers. Spell out the filter.
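To see how much the filter wording matters, here is a minimal sketch against an in-memory SQLite table. The five rows are made up for illustration, and treating whitespace-only comments as empty is an assumption on my part; only the column names come from the reviews file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (review_score INTEGER, review_comment_message TEXT)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [
        (1, "chegou atrasado"),  # bad review with a real comment
        (2, ""),                 # empty string, as a CSV load can produce
        (1, None),               # NULL
        (2, "   "),              # whitespace only (assumed to count as empty)
        (5, "ótimo"),            # good review, out of scope
    ],
)

# Vague reading of "bad reviews": score filter only.
vague = conn.execute(
    "SELECT COUNT(*) FROM reviews WHERE review_score <= 2"
).fetchone()[0]

# IS NOT NULL alone still lets empty strings through.
not_null_only = conn.execute(
    "SELECT COUNT(*) FROM reviews "
    "WHERE review_score <= 2 AND review_comment_message IS NOT NULL"
).fetchone()[0]

# Spelled-out filter: score 1 or 2 AND a comment with visible text.
specific = conn.execute(
    "SELECT COUNT(*) FROM reviews "
    "WHERE review_score IN (1, 2) "
    "AND TRIM(COALESCE(review_comment_message, '')) <> ''"
).fetchone()[0]

print(vague, not_null_only, specific)  # 4 3 1
```

Three readings of "bad reviews", three different counts. The prompt is where you pick one.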
Be specific about output shape
| Vague | Specific |
|---|---|
| "summarise" | "a 3-row Markdown table: column, count, average score" |
| "tell me what’s happening" | "TL;DR in 3 bullets; then 1 short paragraph per finding" |
| "make a chart" | "matplotlib bar chart, x = bucket, y = avg review, saved to out.png" |
Output shape is what makes the answer comparable to other answers — yours, your colleague’s, last week’s. Without it, you can’t verify; you can only nod.
A useful preamble for analytical tasks
Stick this at the top of your message when working with the Olist data:
Context: We're working with the Olist Brazilian e-commerce dataset.
- ~100K orders, 2017–2018.
- SQLite DB at data/olist/olist.db loaded from CSVs.
- Empty CSV cells become EMPTY STRINGS (''), not NULLs in this DB.
- Date columns in CSVs need parse_dates= when loaded via pandas.
- Product categories are in Portuguese — translate via the category_translation table.
You don’t need to repeat this in every prompt; Claude carries context within a conversation. But for the first prompt of a session, this preamble eliminates the most common silent mistakes.
Iterating when the first answer is off
The first response is rarely the final one. Three iteration patterns:
1. The targeted correction
Your query joined on `customer_id` but I want to count distinct customers across both years, so use `customer_unique_id`. Re-run and show the new numbers.
Don’t ask Claude to “try again” or “do it better.” Tell it specifically what to change.
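Why that particular correction changes the number: a sketch with made-up rows, assuming (as in the public Olist schema) that `customer_id` is issued per order while `customer_unique_id` identifies the person. The table layout is simplified for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id TEXT, customer_unique_id TEXT)")
conn.execute("CREATE TABLE orders (order_id TEXT, customer_id TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("c1", "u1"), ("c2", "u1"), ("c3", "u2")],  # person u1 appears under two customer_ids
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("o1", "c1"), ("o2", "c2"), ("o3", "c3")],
)

by_customer_id, by_unique_id = conn.execute(
    """
    SELECT COUNT(DISTINCT c.customer_id),
           COUNT(DISTINCT c.customer_unique_id)
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    """
).fetchone()

print(by_customer_id, by_unique_id)  # 3 2
```

Same join, same rows, different answer to "how many customers". Naming the column in your correction removes the ambiguity.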
2. The “show me first” follow-up
Before computing the average, show me the histogram of `delivery_days` so I can see whether outliers are skewing the mean.
When a number surprises you, demand the intermediate evidence before accepting it.
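Here is roughly what that intermediate evidence can look like. The `delivery_days` values below are invented, with one deliberate outlier, and `text_histogram` is a hypothetical helper; a real session would more likely produce a matplotlib chart.

```python
from collections import Counter
from statistics import mean, median

# Made-up delivery times in days; the 120 is a deliberate outlier.
delivery_days = [3, 4, 4, 5, 5, 5, 6, 6, 7, 120]


def text_histogram(values, bucket_size=10):
    """Render one line per non-empty bucket, one '#' per value."""
    buckets = Counter((v // bucket_size) * bucket_size for v in values)
    lines = []
    for start in sorted(buckets):
        label = f"{start}-{start + bucket_size - 1}"
        lines.append(f"{label:<8} {'#' * buckets[start]}")
    return "\n".join(lines)


print(text_histogram(delivery_days))
print(f"mean={mean(delivery_days):.1f}  median={median(delivery_days):.1f}")
```

Nine of the ten values sit in the first bucket, yet the mean lands at 16.5 while the median is 5.0. The histogram makes that gap obvious before you quote either number.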
3. The reconciliation
My Day 2 SQL gave `late = 2.3` and `on_time = 4.2`. Your pandas version gave 2.5 and 4.0. Find the discrepancy: is it the join, the filter, or the dedupe?
This is the move that turns Claude from “stochastic answer generator” into “debugging partner.”
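One of those three suspects, the join, can be sketched in a few lines. This does not reproduce the numbers in the prompt above; the rows are invented, and the example only shows how joining reviews to a table with several rows per order shifts an average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE reviews (order_id TEXT, review_score INTEGER);
    CREATE TABLE items (order_id TEXT, price REAL);
    INSERT INTO reviews VALUES ('o1', 5), ('o2', 1);
    -- o2 contains three items, o1 only one
    INSERT INTO items VALUES ('o1', 10.0), ('o2', 20.0), ('o2', 30.0), ('o2', 40.0);
    """
)

# Version A: one row per review.
avg_a, rows_a = conn.execute(
    "SELECT AVG(review_score), COUNT(*) FROM reviews"
).fetchone()

# Version B: join to items first, so o2's 1-star review is counted three times.
avg_b, rows_b = conn.execute(
    """
    SELECT AVG(r.review_score), COUNT(*)
    FROM reviews AS r
    JOIN items AS i ON i.order_id = r.order_id
    """
).fetchone()

print(f"A: avg={avg_a} over {rows_a} rows")  # A: avg=3.0 over 2 rows
print(f"B: avg={avg_b} over {rows_b} rows")  # B: avg=2.0 over 4 rows
```

Comparing row counts next to the averages is usually the quickest way to tell a join problem from a filter problem.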
What to avoid
| Don’t say | Why |
|---|---|
| "Be creative" | Creativity is what’s making it generic. |
| "Make assumptions if needed" | It will. Without telling you which. Better: "list any assumptions explicitly." |
| "Use best practices" | Means nothing. State the constraint you care about. |
| "Make it production-ready" | We’re doing analysis, not shipping software. |
| "Be concise" | Useless without a target length. Say "TL;DR in 3 bullets." |
The prompt I use most
A template I reach for constantly:
INPUT: <file path or table name>, columns <list>, filtered to <condition>
TASK: Compute <specific metric> grouped by <dimension>. Use <approach>
(SQL / pandas / etc).
OUTPUT: A table with columns <a, b, c>. Sort by <column> <direction>.
Top <N> rows. Show me the code you ran before showing the result.
Notice “show me the code before the result.” That single phrase shifts Claude from “produce an answer” to “produce a verifiable artifact.” You can now read the code, spot the mistake before trusting the number, and fix it specifically.
??? note "Try it yourself: rewrite a vague prompt"
Here’s a vague prompt:
> Find the worst sellers in the dataset.
Rewrite it in INPUT / TASK / OUTPUT form so it produces a specific, verifiable answer. Constraints to think about: what defines "worst"? Which seller table? How many to return? What columns?
Drop your rewritten prompt into Claude Code and see what it returns.
??? success "Reveal a good rewrite"
```
INPUT: `data/olist/olist.db` — tables `items`, `reviews`, `orders`,
`sellers`. Empty cells are empty strings, not NULL.
TASK: For each seller with at least 50 orders, compute:
- total revenue (SUM of items.price)
- average review score across orders containing the seller's items
- % of reviews that are 1 or 2 stars
Rank sellers by `revenue * (1 - avg_review_score / 5)` descending —
this approximates "financial impact of bad reviews."
OUTPUT: a Markdown table of the top 10, with seller_id, revenue,
avg_review_score, pct_bad, and rank_score. Show me the SQL
you ran before the table.
```
Compare that with the original "find the worst sellers", which tends to produce a generic top 10 by review count. Your rewrite produces the *capstone-ready* answer.
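For a sense of the code you would then be reviewing, here is a rough sketch of that ranking on toy data. Everything below is an assumption for illustration: the rows are invented, the `order_id` join key and the reduced two-table layout are guesses at the schema, and the threshold is lowered from 50 orders to 2 so the toy data has something to rank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE items (order_id TEXT, seller_id TEXT, price REAL);
    CREATE TABLE reviews (order_id TEXT, review_score INTEGER);
    INSERT INTO items VALUES
        ('o1', 's1', 100.0), ('o1', 's1', 50.0), ('o2', 's1', 50.0),
        ('o3', 's2', 300.0), ('o4', 's2', 100.0),
        ('o5', 's3', 999.0);
    INSERT INTO reviews VALUES ('o1', 1), ('o2', 3), ('o3', 5), ('o4', 4), ('o5', 1);
    """
)

sql = """
WITH seller_revenue AS (
    SELECT seller_id,
           COUNT(DISTINCT order_id) AS n_orders,
           SUM(price) AS revenue
    FROM items
    GROUP BY seller_id
),
seller_orders AS (
    -- one row per seller and order, so multi-item orders count once
    SELECT DISTINCT seller_id, order_id FROM items
),
seller_reviews AS (
    SELECT so.seller_id,
           AVG(r.review_score) AS avg_review_score,
           100.0 * SUM(r.review_score <= 2) / COUNT(*) AS pct_bad
    FROM seller_orders AS so
    JOIN reviews AS r ON r.order_id = so.order_id
    GROUP BY so.seller_id
)
SELECT sr.seller_id, sr.revenue, rv.avg_review_score, rv.pct_bad,
       sr.revenue * (1 - rv.avg_review_score / 5.0) AS rank_score
FROM seller_revenue AS sr
JOIN seller_reviews AS rv ON rv.seller_id = sr.seller_id
WHERE sr.n_orders >= :min_orders
ORDER BY rank_score DESC
LIMIT :top_n
"""

rows = conn.execute(sql, {"min_orders": 2, "top_n": 10}).fetchall()
for row in rows:
    print(row)
```

Revenue and review averages are computed in separate steps on purpose: joining items straight to reviews would count a review once per item in the order. Seller `s3` has the highest revenue and a 1-star review but only one order, so the threshold drops it.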
Common pitfalls
- “Help me with X.” Replace with “INPUT / TASK / OUTPUT for X.”
- Asking for everything at once. Five tasks in one prompt produces five mediocre answers. Chain them in separate prompts.
- Hidden assumptions about file content. If Claude needs to know that empty strings mean missing, tell it. Don’t expect it to remember from yesterday’s session.
- Confusing “make it look good” with “make it right.” Visual polish ≠ correctness. Demand the underlying numbers first.
- Skipping “show me the code.” Without that phrase, you’ll be trusting opaque output. With it, you’re an editor of code, which is what you actually want to be.
What’s next
Continue to Lesson 3 — Plan mode & verification — the structured way to catch misunderstandings before they become wrong work.