Query Rewriting: Subqueries, CTEs, Joins, Window Functions

Learn when to use subqueries vs joins vs CTEs, effective CTE patterns, IN vs EXISTS, OR to UNION rewrites, and avoiding SELECT *.

published: March 26, 2026 reading time: 32 min read author: GeekWorkBench updated: June 17, 2026

Quick Summary

The same SQL result can be expressed many ways, but performance varies wildly depending on structure. Subqueries vs joins: use joins when you need columns from both tables. CTEs (Common Table Expressions) materialize intermediate results and make complex queries readable—handy when the same subquery appears multiple times. EXISTS outperforms IN for existence checks on large tables. OR conditions often benefit from UNION rewrites to enable better index usage. Window functions replace subqueries for running totals, rankings, and moving averages. Each rewrite has a specific condition that makes it faster, not just stylistically different—knowing when to apply which matters more than memorizing syntax.

Query Rewriting: Subqueries, CTEs, Joins, and Window Functions

The same data can be retrieved many ways in SQL, and sometimes the difference between a query that finishes in seconds versus minutes comes down to nothing more than how you structure it. Here are the rewriting techniques I reach for most often.

Introduction

The same result can be expressed many ways in SQL, but the performance differences between expressions can be dramatic. Query rewriting transforms slow queries into fast ones by changing the structure without changing the answer — leveraging indexes better, reducing the working set, or enabling different execution strategies.

This guide covers the rewrites I use most in practice: subqueries versus joins, CTEs versus derived tables, IN versus EXISTS, OR versus UNION, and when window functions replace both. Each rewrite has a specific condition that makes it faster, not just stylistically different.

Subquery or Join: The Decision Flow

flowchart TD
    Start["Query needing data from multiple tables"]
    Q1{"Need columns from<br/>both tables?"}
    Q2{"Same subquery used<br/>multiple times?"}
    Q3{"Hierarchical data<br/>like org charts?"}
    Q4{"Checking existence<br/>of related rows?"}
    Q5{"Aggregating before<br/>joining?"}
    Join["Use JOIN"]
    CTE["Use CTE<br/>(materialization)"]
    Recursive["Use RECURSIVE CTE"]
    Exists["Use EXISTS or<br/>NOT EXISTS"]
    SubAgg["Subquery in FROM<br/>then JOIN"]
    Q1 -->|Yes| Join
    Q1 -->|No| Q2
    Q2 -->|Yes| CTE
    Q2 -->|No| Q3
    Q3 -->|Yes| Recursive
    Q3 -->|No| Q4
    Q4 -->|Yes| Exists
    Q4 -->|No| Q5
    Q5 -->|Yes| SubAgg
    Q5 -->|No| Join

Each section below covers one pattern in depth.

Subqueries vs Joins

The old advice was “joins are always better than subqueries.” This is no longer universally true.

When Subqueries Work Fine

-- Simple correlated subquery - often optimized well
SELECT p.name,
       (SELECT SUM(o.amount) FROM orders o WHERE o.product_id = p.id) as total_sold
FROM products p;

-- Non-correlated subqueries are typically flattened by the optimizer
SELECT *
FROM orders
WHERE customer_id IN (SELECT id FROM customers WHERE region = 'NORTH');

PostgreSQL’s optimizer often transforms subqueries into joins internally. You’re usually not shooting yourself in the foot by writing a subquery.

When Joins Are Clearer

-- When you need columns from both tables, a join is more straightforward
SELECT o.id, c.name, o.total
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE c.region = 'NORTH';

If you find yourself using a subquery to fetch columns from the outer query, consider rewriting as a join. It’s usually more readable and performs similarly.

Common Table Expressions (CTEs)

CTEs (WITH clauses) make complex queries readable. They also enable recursive queries and materialization.

Basic CTE

WITH regional_customers AS (
    SELECT id, name, region
    FROM customers
    WHERE region = 'NORTH'
),
recent_orders AS (
    SELECT customer_id, SUM(total) as order_total
    FROM orders
    WHERE created_at > '2026-01-01'
    GROUP BY customer_id
)
SELECT c.name, COALESCE(r.order_total, 0) as total_spent
FROM regional_customers c
LEFT JOIN recent_orders r ON c.id = r.customer_id;

CTEs for Materialization

A useful feature of CTEs is materializing intermediate results. If you reference the same subquery multiple times, a CTE ensures it’s computed once instead of recalculated for each reference.

-- Without CTE: the subquery runs twice
SELECT * FROM orders WHERE customer_id IN (SELECT id FROM customers WHERE region = 'NORTH')
UNION
SELECT * FROM orders WHERE customer_id IN (SELECT id FROM customers WHERE region = 'NORTH') AND total > 100;

-- With CTE: computed once
WITH north_customers AS (
    SELECT id FROM customers WHERE region = 'NORTH'
)
SELECT * FROM orders WHERE customer_id IN (SELECT id FROM north_customers)
UNION
SELECT * FROM orders WHERE customer_id IN (SELECT id FROM north_customers) AND total > 100;

Recursive CTEs

For hierarchical data, recursive CTEs are indispensable:

-- Find all employees under a manager
WITH RECURSIVE subordinates AS (
    -- Base case: direct reports
    SELECT id, name, manager_id
    FROM employees
    WHERE manager_id = 42

    UNION ALL

    -- Recursive case
    SELECT e.id, e.name, e.manager_id
    FROM employees e
    JOIN subordinates s ON e.manager_id = s.id
)
SELECT * FROM subordinates;

When to Use CTEs vs Subqueries vs Window Functions

Pattern	Use when	Avoid when
Subquery (scalar)	Need a single computed value per row, no reuse needed	Same subquery appears multiple times in the query
Subquery (IN list)	Comparing against a small, static set of values	Subquery returns many rows — use a JOIN instead
CTE (non-materialized)	Improving readability of complex multi-step logic	Breaking a query into too many tiny CTEs hurts readability
CTE (materialized)	Same subquery used 3+ times in the query, or query references itself	The CTE result set is enormous — materialization is slow
Recursive CTE	Hierarchical traversal (org charts, tree structures, graph traversal)	Simple linear data — a regular query is faster
Window function	Running totals, rankings, LEAD/LAG comparisons, percentiles	You need to filter rows based on the window result — use a subquery instead

IN vs EXISTS

The IN vs EXISTS debate has been going on for decades. The answer depends on your data.

The General Rule

EXISTS and IN do different things at the engine level. EXISTS stops as soon as it finds a match — it never builds a full result set. IN collects all subquery results first and then checks each outer row against that collection.

Use EXISTS when you are checking whether related rows exist. This includes NOT EXISTS for anti-joins. It is the right call for correlated subqueries where you want a yes/no answer per outer row.

Use IN when you have a small, bounded list on hand — constants you typed out, a handful of known IDs, or a subquery that returns a manageable number of rows. Once that list grows large, IN’s hash set construction becomes expensive and EXISTS typically wins even with index coverage.

The decision tree is short: correlated existence check, use EXISTS. Fixed list or small result set, use IN. Large result set with good indexes, benchmark both.

Performance Comparison

-- EXISTS often stops at first match
SELECT * FROM orders o
WHERE EXISTS (
    SELECT 1 FROM customers c WHERE c.id = o.customer_id AND c.region = 'NORTH'
);

-- IN collects all values first
SELECT * FROM orders o
WHERE o.customer_id IN (
    SELECT c.id FROM customers c WHERE c.region = 'NORTH'
);

EXISTS can short-circuit once it finds a match. IN typically collects all values into a set first. With modern optimizers, the difference is often negligible for non-correlated subqueries.

What About NULLs?

-- IN with NULL behavior
SELECT * FROM orders WHERE customer_id IN (1, 2, NULL);
-- Returns rows where customer_id is 1 or 2, but NOT rows where customer_id is NULL

-- EXISTS with NULL behavior
SELECT * FROM orders o
WHERE EXISTS (SELECT 1 FROM customers c WHERE c.id = o.customer_id AND c.region = 'NORTH');
-- EXISTS returns TRUE or FALSE, NULL handling is cleaner

EXISTS handles NULLs more intuitively. IN with NULLs in the subquery can produce unexpected results.

IN vs EXISTS vs JOIN: When Each Wins

Scenario	Best choice	Why
Subquery returns few rows, outer query is large	`IN`	Small set is cached efficiently, hash join works well
Correlated subquery, existence check only	`EXISTS`	Stops at first match, no need to collect all rows
Need columns from both tables	`JOIN`	More readable, same performance in modern optimizers
Subquery returns many rows with good index	`EXISTS` or `IN` (similar)	Both use index efficiently
NULL values possible in subquery result	`EXISTS`	`IN` with NULLs returns unexpected empty sets
NOT IN with potential NULLs	`NOT EXISTS`	`NOT IN` with NULLs returns nothing

OR to UNION Conversion

Sometimes rewriting OR as UNION gets you better index usage.

-- Original: OR condition might prevent index usage
SELECT * FROM orders
WHERE customer_id = 42 OR status = 'pending';

-- Rewritten: each branch can use its own index
SELECT * FROM orders WHERE customer_id = 42
UNION
SELECT * FROM orders WHERE status = 'pending' AND customer_id <> 42;

The second branch excludes rows already matched by the first. UNION removes duplicates automatically.

When This Helps

The OR-to-UNION rewrite helps most when each branch can hit a different index. If your query has WHERE customer_id = 42 OR status = 'pending', most planners cannot use the customer_id index for the status branch — the OR forces either a full scan or a messy index union. Splitting into two branches lets each branch seek on its own index. The second branch needs AND customer_id <> 42 to avoid duplicates from the first.

The rewrite also helps when the combined result set is small enough that UNION’s sort-based deduplication is cheap relative to the index seek wins. A few thousand rows total? The sort overhead is negligible. Millions of rows? The deduplication cost can eat the savings.

Where it does not help: when one branch is so broad it subsumes the other. If status = 'pending' matches 80% of the table, pulling it out as a separate branch adds UNION overhead for nothing. Scan the table once and filter.

When to Avoid

Avoid it when result sets are large (UNION has overhead), when OR branches return overlapping rows that need deduplication, or when you need deduplication but UNION ALL doesn’t apply.

Avoiding SELECT *

SELECT * is convenient but causes problems in production systems. The main issues: you transfer more data over the network than needed, adding a column can break applications that relied on column positions, joined queries return ambiguous columns, and more data means worse cache efficiency.

The Fix

-- Bad: SELECT *
SELECT * FROM orders WHERE id = 42;

-- Good: explicit columns
SELECT id, customer_id, status, total, created_at FROM orders WHERE id = 42;

When SELECT * Is Fine

It’s fine for ad-hoc queries in your IDE, CTEs that are immediately consumed, and application code that dynamically handles column changes.

Optimizer Hints and Plan Guides

Sometimes the query optimizer makes poor choices. Modern PostgreSQL versions let you nudge it.

How to See What the Optimizer Decides

EXPLAIN ANALYZE is where you start. It runs the query and shows you actual row counts and timings, not just estimates.

EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT * FROM orders WHERE customer_id = 42;

ANALYZE collects real statistics. BUFFERS shows cache hits versus disk reads. VERBOSE adds column names so you can tell apart multiple joins on the same table.

Three things to look for: First, do actual row counts match estimates? Large gaps mean stale statistics and bad plan decisions. Second, are there sequential scans on large tables? Usually the bottleneck. Third, in correlated subquery plans, look for the same Subquery Scan node appearing under every outer row — that is the signature of a per-row subquery execution.

Setting Row Count Estimates

The planner uses column statistics to estimate how many rows a filter will return. When those statistics are stale or too coarse, the planner picks the wrong join type, allocates too little memory for a hash join, or chooses a nested loop when a hash join would be faster. You spot this by comparing the rows estimate in EXPLAIN output against the actual row counts from EXPLAIN ANALYZE. A gap of 10x or more in either direction is a red flag.

Columns that often need explicit statistics tuning are the ones involved in joins or filter conditions with skewed data — a status column where 95% of rows are 'active' and 5% are 'cancelled' needs a higher statistics target than a column with uniform distribution. The default statistics target of 100 is sometimes enough; for highly skewed columns, 200-500 gives the planner enough samples to understand the distribution withoutANALYZE taking too long.

This is what the rewrite does under the hood:

-- If you know the planner's estimate is way off
SET statistics
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
ANALYZE orders;

Setting statistics too high slows subsequent ANALYZE runs and can make the planner overweight the column in join planning. If you raise it and the plan gets worse, dial it back. This is a scalpel, not a sledgehammer.

Enabling/Disabling Scan Types

These flags are a last resort. The planner usually picks the right join type, but there are edge cases where it makes a consistently wrong call across many query shapes — and until you can fix the statistics or add an index, a hint keeps the lights on. The danger is that disabling a join type globally affects every query in the session, not just the one you are debugging, so this is never something you commit to application code.

The most common valid use case is disabling enable_hashjoin when you know the hash join will spill to disk on a large result set but a nested loop with an index on the inner side would be faster. You can verify this by running EXPLAIN (ANALYZE, BUFFERS) on the query and looking for Batchwise or Hash nodes with high spill numbers. Conversely, if a nested loop is being chosen for a large inner relation with no index, disabling enable_nestloop forces the planner toward hash or merge join.

How to use them safely:

-- Force a specific join type
SET enable_hashjoin = off;
SET enable_nestloop = on;

-- Only for debugging, not production

These are session-level settings. Set them at the top of your debugging session and reset them when you are done. The comment is not decorative — if someone copies this into application code and deploys it, every query in that connection pool loses the planner’s ability to choose the right join type going forward.

Better Approach: Fix the Statistics

Before using hints, make sure your statistics are accurate:

ANALYZE VERBOSE orders;

If statistics are current and the plan is still bad, then consider hints.

Practical Query Rewrites

Rewrite 1: NOT IN to NOT EXISTS

-- Slow: NOT IN with NULLs can bite you
SELECT * FROM customers
WHERE id NOT IN (SELECT customer_id FROM orders WHERE status = 'cancelled');

-- Better: NOT EXISTS handles NULLs properly
SELECT * FROM customers c
WHERE NOT EXISTS (
    SELECT 1 FROM orders o WHERE o.customer_id = c.id AND o.status = 'cancelled'
);

Rewrite 2: OR with Functions

When you wrap a column in a function, the planner can no longer use a standard B-tree index on that column. A B-tree index on email stores plain values in sorted order. If your query applies LOWER(email) to both the column and the search term, the planner must evaluate LOWER(email) for every row before comparing it — a full table scan. An expression index solves this by indexing the result of the function applied to the column value, so the planner can seek directly to the matching row.

This matters for any function that is not immutable — LOWER, UPPER, TRIM, SUBSTRING, date_trunc, and so on. As long as the function is used consistently in both the index definition and the query, the planner can use the index. You can verify the index is being used with EXPLAIN and looking for an Index Scan or Index Only Scan on the expression index rather than a Seq Scan.

The main alternative is a generated column:

-- Generated column approach (PostgreSQL 12+)
ALTER TABLE users ADD COLUMN email_lower TEXT GENERATED ALWAYS AS (LOWER(email)) STORED;
CREATE INDEX idx_users_email_lower ON users(email_lower);

A generated column materializes the expression at insert/update time, which costs write overhead but lets you index and query it like a regular column. Use an expression index when you only need the index for one specific query pattern; use a generated column when multiple queries or constraints benefit from the computed value.

-- Original: function on column prevents index use
SELECT * FROM users WHERE LOWER(email) = 'test@example.com';

-- Better: use expression index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

Rewrite 3: Multiple JOINs to CTE

-- Hard to read: deeply nested joins
SELECT ...
FROM orders o
JOIN customers c ON o.customer_id = c.id
JOIN products p ON o.product_id = p.id
JOIN categories cat ON p.category_id = cat.id
WHERE cat.name = 'Electronics';

-- Clearer: CTE breaks it down
WITH order_details AS (
    SELECT o.*, c.name as customer_name, p.title as product_title
    FROM orders o
    JOIN customers c ON o.customer_id = c.id
    JOIN products p ON o.product_id = p.id
)
SELECT * FROM order_details od
JOIN categories cat ON od.category_id = cat.id
WHERE cat.name = 'Electronics';

Real-World Case Study: Stripe’s Query Optimization

Stripe processes millions of transactions daily and their database team has published extensively about their query optimization work. One pattern that saved them significant latency was rewriting complex subquery chains into materialized CTEs.

Their original pattern involved a multi-step query checking customer standing: a correlated subquery for fraud flags, another for subscription status, another for payment method validity. Each subquery ran per row. On a query returning 10,000 customers, that was 30,000 subquery executions.

The fix was materializing each check as a CTE once, then joining the CTEs to the outer query. Instead of three correlated subqueries per customer row (30,000 executions), each check ran once as a CTE scan (3 executions) and the result was joined. Their p99 latency on the endpoint dropped from 1.2 seconds to 180ms.

The lesson: correlated subqueries look reasonable at small scale. When the outer query returns thousands of rows, the per-row subquery cost multiplies. Materializing with CTEs breaks this exponential cost. Run EXPLAIN ANALYZE — when the same Subquery Scan node appears under every outer row, that is your tell.

Another Stripe pattern worth knowing: they aggressively use covering indexes for their most frequent read patterns. Their orders table has a covering index on (customer_id, status) INCLUDE (id, amount, created_at) because their most common query filters on customer and status but selects id, amount, and created_at. The covering index eliminated heap fetches on 80% of their read traffic.

Common Production Failures

CTE materialization causing unexpected memory spikes: You write a CTE thinking it is just syntactic sugar, but PostgreSQL materializes it when referenced multiple times. A large intermediate result set gets written to disk. The query works fine on small data but OOMs on production volumes. Check with EXPLAIN ANALYZE and look for CTEs marked as “CTE” in the plan.

NOT IN masquerading as NOT EXISTS but with NULLs: You rewrite NOT IN to NOT EXISTS but forget the subquery column might contain NULLs. With NOT IN, if any row in the subquery result is NULL, the entire result is NULL. With NOT EXISTS, NULLs are handled correctly. Always use NOT EXISTS for safety.

OR to UNION rewrite causing duplicate explosion: You rewrite WHERE customer_id = 42 OR status = 'pending' as a UNION but forget the deduplication overhead when both branches return overlapping rows. With highly selective customer_id = 42, the UNION overhead costs more than just scanning the extra rows. Test the actual row counts before assuming the rewrite helps.

CTE named same as a real table: You write WITH orders AS (...) and then write SELECT * FROM orders. PostgreSQL resolves to your CTE, not the table. This is intentional but catches teams off guard. Always use names that won’t collide with existing tables.

Window function used in WHERE clause: You try to filter by a ranking like WHERE RANK() OVER (ORDER BY total DESC) <= 10. This is invalid — window functions are computed after WHERE. Wrap in a subquery: SELECT * FROM (SELECT *, RANK() OVER (...) as rnk) ranked WHERE rnk <= 10.

Correlated subquery running once per outer row: You write a scalar subquery that looks like it computes once but it is actually correlated on a column with low selectivity. 10,000 rows in the outer query means 10,000 subquery executions. Always check EXPLAIN for repeated subquery nodes in the plan.

Quick Recap Checklist

Use JOIN when you need columns from both tables; use subquery for pure filtering
CTEs are only materialized when referenced multiple times — single-use is just syntactic sugar
Recursive CTEs need depth limits and UNION ALL to avoid memory explosions
NOT IN with NULLs returns nothing; always use NOT EXISTS for safety
EXISTS handles NULLs correctly; IN collects all values first
OR conditions prevent index usage — rewrite as UNION so each branch can use an index
SELECT * has no place in production code
Window functions compute per-row aggregates without correlated subqueries
EXPLAIN ANALYZE shows when the same Subquery Scan node repeats per outer row

Interview Questions

1. A developer writes a query with a correlated subquery and it runs fine in testing but times out in production with 100x more rows. EXPLAIN shows the subquery node repeated 50,000 times. What is happening and how do you fix it?

The subquery is correlated on a column with low selectivity — it runs once per outer row. With 50,000 rows in the outer query, the subquery executes 50,000 times. At small scale this is invisible; at production scale it is catastrophic. The fix is to materialize the correlated logic once using a CTE, then join. For example, if you have SELECT ... FROM orders o WHERE EXISTS (SELECT 1 FROM customers c WHERE c.id = o.customer_id AND c.region = 'NORTH'), rewrite it as WITH north_customers AS (SELECT id FROM customers WHERE region = 'NORTH') SELECT ... FROM orders o WHERE EXISTS (SELECT 1 FROM north_customers nc WHERE nc.id = o.customer_id). The CTE is computed once, not per row. Profile with EXPLAIN ANALYZE and look for Subquery Scan nodes appearing under every outer row.

2. You need to find customers who have never placed an order. You write NOT IN (SELECT customer_id FROM orders) but it returns no results even though you know some customers have never ordered. What is wrong?

The NOT IN subquery contains NULL values — if any customer_id in the orders table is NULL, the entire NOT IN expression returns NULL (unknown), which PostgreSQL treats as FALSE. The correct rewrite is NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id). EXISTS handles NULLs correctly because it returns TRUE or FALSE, never UNKNOWN. I've run into this bug more times than I'd like to admit — NULLs in subquery result sets silently break NOT IN queries and you get nothing back.

3. When does rewriting OR conditions as UNION actually help performance?

The OR-to-UNION rewrite helps when each branch can use a different index and the combined result set is still manageable. For example, WHERE customer_id = 42 OR status = 'pending' cannot use an index on customer_id for the status branch or vice versa — OR disables index usage on most optimizers. Splitting into WHERE customer_id = 42 UNION SELECT ... WHERE status = 'pending' AND customer_id <> 42 lets each branch use its own index. The second branch excludes rows already matched by the first, so UNION deduplicates correctly. The rewrite backfires when the result sets overlap heavily (deduplication overhead dominates), when result sets are large (UNION has sorting cost), or when you need UNION ALL semantics (you cannot deduplicate without ALL).

4. A teammate insists CTEs are always better than subqueries because "CTEs are materialized." How do you respond?

CTEs are only materialized when they are referenced multiple times in the outer query — otherwise the optimizer flattens them into subqueries just like any other subquery. The "CTEs are always better" mental model is wrong. A non-materialized CTE is syntactic sugar, nothing more. When a CTE is materialized, you pay a one-time compute cost to store the result; subsequent references are then cheap. Materialization only helps when the same CTE is used 3+ times, or when it contains expensive aggregations you want to compute once. For single-use cases, the materialization cost is pure overhead.

5. You need to compute a running total alongside each row in a table with 10 million rows. What SQL feature do you use and why can't you use a subquery for this?

Window functions — specifically SUM(...) OVER (ORDER BY ...) — compute running totals across rows without needing a self-join or subquery. You cannot use a subquery because a subquery computes an aggregate for the entire result set or a group, not per-row cumulative values. A correlated subquery for a running total would run once per row (10 million times), which is catastrophic at scale. The window function scans the table once and maintains running state as it goes. The key is that the window is computed after the FROM/WHERE clause but before ORDER BY/LIMIT — so the aggregate runs on the already-filtered result set in a single pass.

6. You rewrite a query from using DISTINCT to using EXISTS for deduplication. The results differ — some rows that should appear are missing. What went wrong?

DISTINCT removes duplicate rows from the final result set based on all columns in the SELECT. EXISTS checks whether a related row exists and returns TRUE or FALSE — it does not deduplicate the outer rows themselves. If your query is SELECT DISTINCT name, email FROM orders, you want unique (name, email) pairs. If you rewrite to use EXISTS, you get one row per outer row that has any match — not deduplicated pairs. Also, EXISTS with a correlated subquery can return TRUE for multiple matching rows but stops at the first match, so it is semantically different. DISTINCT is for result deduplication; EXISTS is for existential checks. They are not interchangeable.

7. A query SELECT * FROM orders WHERE id IN (SELECT order_id FROM line_items WHERE quantity > 0) runs in 200ms on a small dataset but times out at 100x scale. You add an index on line_items.order_id. What else might be causing the slowdown?

If the index on line_items.order_id exists and the query is still slow, the subquery may be returning too many rows — the IN-list is large and scanning it for each outer row is still expensive. Also check whether the column is nullable: if line_items.order_id contains NULLs, the IN subquery returns no rows when any NULL is present (because NULL IN (list) evaluates to UNKNOWN). The fix is WHERE id IN (SELECT order_id FROM line_items WHERE quantity > 0 AND order_id IS NOT NULL). Also examine whether the line_items table has grown unevenly — if most rows have quantity > 0, the index is not selective and a hash join would be more efficient.

8. You have a recursive CTE for an org chart that runs correctly on small data but causes "out of memory" errors in production. The table has 50,000 employees. How do you fix it?

Recursive CTEs can explode in memory if the recursion graph has many levels or cycles. For org charts with 50,000 employees, the issue is often unbounded recursion — the CTE keeps drilling down without a termination condition or hits a cycle it cannot break. Add a depth limit: WHERE level < 20 in the recursive part. Also add UNION ALL (not just UNION) to avoid deduplication overhead at each level — deduplication is expensive in recursive traversals. If cycles are possible, add a visited set: track already-seen IDs in an array and skip them. For org charts this deep, consider whether the recursive CTE is the right tool — a flat adjacency list with repeated queries or a closure table might be more efficient.

9. A CTE named "users" shadows the actual "users" table. Your SELECT * FROM users returns CTE rows, not table rows. How do you detect and fix this?

This is PostgreSQL's intentional shadowing behavior — CTE names take precedence over table names in the same query. To detect: if you expected table data but got unexpected results, run EXPLAIN and look for a CTE Scan node instead of a Seq Scan or Index Scan on the table. To fix: rename the CTE to something that does not conflict, like filtered_users AS (...). To access the actual table when shadowed, use the schema prefix: SELECT * FROM public.users. Always audit CTE names against existing table names before writing complex queries.

10. What is the difference between LATERAL JOIN and a regular subquery in the FROM clause? When does LATERAL outperform non-LATERAL?

A regular subquery in FROM is computed once and then joined. A LATERAL subquery is computed for each row of the outer table, enabling per-row dependent computation. LATERAL outperforms non-LATERAL when the subquery needs to reference columns from the outer query — for example, SELECT * FROM orders CROSS JOIN LATERAL (SELECT SUM(amount) FROM line_items WHERE line_items.order_id = orders.id). The LATERAL approach is efficient when each outer row produces a small result from the subquery. Non-LATERAL would compute the subquery once over all orders, losing the per-order context.

11. A query uses a window function in a WHERE clause and fails. How do you fix it?

Window functions are computed after WHERE filtering — you cannot reference them in WHERE because they do not exist yet. Wrap the query in a subquery: SELECT * FROM (SELECT *, RANK() OVER (PARTITION BY category ORDER BY price DESC) as rnk FROM products) ranked WHERE rnk <= 5. The inner query computes the window function, the outer query filters on it. This is the standard pattern for filtering by window function results.

12. When would you prefer UNION ALL over UNION? Does UNION ALL have any sorting overhead?

Use UNION ALL when you know there are no duplicates between the result sets — it does not deduplicate and therefore does not need to sort. UNION incurs sorting overhead to remove duplicates. If your OR-to-UNION rewrite uses branches that return disjoint row sets (e.g., WHERE id = 1 UNION ALL WHERE id = 2 where id is a primary key), use UNION ALL. In the OR-to-UNION rewrite, use UNION (with deduplication) when branches might overlap and you need distinct results, or when you are not certain the branches are disjoint.

13. How does the optimizer decide between a hash join and a nested loop join? Under what conditions is nested loop always wrong?

The planner chooses hash join when both inputs are large and unsorted — it builds a hash table on the smaller input and probes with the larger. Nested loop is chosen when the inner (driven) table has an efficient index on the join column and the outer table is small, or when the join is highly selective. Nested loop is wrong when the inner table is large and has no index — each outer row triggers a full scan of the inner table, making the join O(n*m). If you see nested loop with estimated rows in the millions on the inner side, the planner may have stale statistics or a missing index.

14. What is the difference between a scalar subquery and a correlated subquery? Which one is usually more expensive?

A scalar subquery returns a single value and is independent of the outer query — it runs once. A correlated subquery references columns from the outer query and runs once per outer row. Correlated subqueries are usually more expensive because they repeat execution per row. A SELECT (SELECT MAX(created_at) FROM orders WHERE customer_id = c.id) from customers is correlated — it runs for every customer. If you have 100,000 customers, the subquery runs 100,000 times. Rewrite correlated subqueries using JOINs or CTEs for materialization if the subquery is expensive.

15. A query with GROUP BY on a column that has only 3 distinct values produces a HashAggregate node instead of using an index. Is this a problem?

Not necessarily — HashAggregate is correct for grouping when the number of groups is small relative to the row count. An index scan with grouped aggregation (Index Only Scan + Partial GroupAggregate) only helps when the data is already sorted by the GROUP BY column and the index covers the query. For 3 distinct values across 10 million rows, HashAggregate is likely more efficient than trying to use an index. The planner is correctly choosing sort-based grouping. If you want to verify, compare SET enable_hashagg = off — but do not disable it in production.

16. What is a "function sequence error" and when does it occur with CTEs?

A function sequence error occurs when you try to use a set-returning function (like generate_series) in an invalid position within a CTE. It also occurs when a data-modifying CTE (INSERT/UPDATE/DELETE) is followed by another CTE in the same WITH clause and the order is incorrect. For example, WITH cte AS (INSERT ... RETURNING ...) SELECT ... FROM cte JOIN cte AS c2 where cte is referenced twice — PostgreSQL may not allow reuse of a data-modifying CTE. The fix is to materialize the CTE result first or restructure the query to avoid reuse of data-modifying CTEs.

17. A query uses DISTINCT ON and runs fine in PostgreSQL but fails in another database. What does DISTINCT ON do and what is the portable alternative?

SELECT DISTINCT ON (col) ... returns one row per distinct value of col. It is a PostgreSQL extension. The portable alternative uses window functions: SELECT * FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY col ORDER BY created_at) as rn FROM table) t WHERE rn = 1. This works across databases with window function support. Note that DISTINCT ON requires the DISTINCT ON column to match the leftmost columns in ORDER BY — this constraint ensures deterministic results.

18. When you have a WITH RECURSIVE query that hangs indefinitely, what is the most common cause and how do you diagnose it?

The most common cause is an infinite recursion cycle — the termination condition is never met and the CTE keeps recursing. For org charts or tree structures, this happens when a node references itself (employee is their own manager) or cycles exist in the data. To diagnose: add a level counter and a WHERE level < 100 limit to the recursive part. If the query returns 100 rows before hanging, you have a cycle. Use UNION ALL (not UNION) to avoid deduplication overhead. Add cycle detection: track visited IDs in an array and check WHERE NOT (id = ANY(visited)).

19. What is the performance difference between a subquery in a FROM clause vs a CTE that is referenced once?

For single reference, a CTE and a subquery in FROM are essentially equivalent — the optimizer flattens the CTE and treats it like a subquery. The CTE is syntactic sugar in this case. The difference is only in readability and scoping. If the CTE is referenced multiple times, the CTE is materialized once and reused; the subquery in FROM would be computed separately for each reference. Use a CTE when the same subquery appears multiple times and materialization helps.

20. A query uses ORDER BY with NULLS LAST. Does an index on that column automatically provide the NULLS LAST ordering?

B-tree indexes store NULLs at the end (for ASC) or beginning (for DESC) by default. An index on (col ASC NULLS LAST) respects that ordering. But a plain index on (col) does not guarantee NULLS LAST — it only guarantees ascending order with NULLs at the end for ASC. If you need explicit NULLS LAST semantics, create the index with that clause: CREATE INDEX idx ON orders(created_at ASC NULLS LAST). Without the explicit clause in the index, PostgreSQL may still use the index but the NULL ordering in the plan depends on the query's NULLS LAST specification.

Conclusion

Writing good SQL is about matching the structure to the problem. Joins and subqueries each have their place, and the optimizer handles most of the old debates about which is faster. CTEs excel at making complex logic readable and enabling recursive patterns. EXISTS handles NULLs more safely than IN. UNION rewrites can unlock indexes that OR conditions hide behind. And SELECT * has no place in production application code.

For more on making the database execute your queries efficiently, see Query Execution Plans and Joins and Relationships.