SelectFromData
AnalyticsLesson 10 of 13 · 15 min

Readable SQL: Style That Survives a Code Review

Two queries can return the same correct number, and one of them still gets you a comment in code review. Style in SQL isn't decoration. Every rule in this lesson exists because the sloppy version eventually produces a wrong number or an unreadable mess at three in the morning.

In one line

Write the query for the next person who reads it. That person is usually you, six weeks from now, with no memory of what b was.

Aliases that mean something

FROM orders a JOIN customers b JOIN products c works. It also forces every reader to keep a mental lookup table. Initials cost the same keystrokes and need no decoding: o for orders, cu for customers, p for products.

Qualify every column once you join

In a one-table query, channel is fine. In a three-table query, channel is a small bet that no other table will ever have a column with that name. The day one does, your query errors at best, or silently binds to the wrong table at worst. Write cu.channel and the bet is off — and the reader instantly knows where each value comes from.

Name your sort

ORDER BY 2 means "sort by whatever happens to be the second column right now." Add a column to your SELECT later and the sort silently changes meaning — no error, just a differently ordered report. ORDER BY revenue DESC survives any edit and reads like a sentence. The same goes for DESC itself: the default is ascending, so when the task says "highest first", say so explicitly.

Join only what you use

The subtlest one, and the most dangerous. Here's a clean revenue query — run it and note the total:

Clean: every joined table is used. Total = 340.
Editable · runs in your browser

Now the same query with one extra join that nothing uses — the kind that sneaks in when you copy a query and adapt it. The products table here happens to contain a duplicated product_id, which real dimension tables sometimes do after a bad load. Run it and watch the Distributor number:

One dead join, and Distributor revenue silently doubles: 220 becomes 440.
Editable · runs in your browser

Nothing in the SELECT mentions products. The query still "works", returns the same shape, and one number is now double. That's why a join you don't use isn't harmless clutter — it's risk with zero payoff. If no column from a table appears in your SELECT, WHERE, or GROUP BY, the join goes.

One more note on that revenue line: SUM(o.quantity * o.unit_price) multiplies two columns from the same row of the joined result. After a join they could just as well come from two different tables (o.quantity * p.price) — once the rows are joined, it is all plain row arithmetic.

Say INNER when you mean it

LEFT JOIN answers a specific question: "keep every left row, even without a match." If your report only makes sense for orders that HAVE a customer, plain JOIN (inner) states that intent. Using LEFT everywhere as a habit invites NULL groups into your results and tells the reader something you don't mean.

From practice

The five-second checklist before you submit any multi-table query: meaningful aliases, every column qualified, sort named (with DESC if you mean it), every join earning its place, and INNER unless you truly want unmatched rows kept.

Check yourself

A revenue query joins a products table but never uses any of its columns. What is the REAL risk?

Check yourself

Why is ORDER BY revenue DESC better than ORDER BY 2 DESC?


Next time you finish a case, reread your query with this checklist before submitting. The graders here check your numbers; a colleague checks your style — and this lesson is what they'd say.