Skip to content

Optimizer, Indexes, EXPLAIN

The optimizer translates declarative SQL into a physical plan. Your job is to make the intent clear, provide useful statistics and access paths, and verify the chosen plan.

flowchart LR
	Query[SQL text] --> Parse[Parse and validate]
	Parse --> Logical[Logical algebra]
	Logical --> Rewrite[Semantic rewrites]
	Rewrite --> Cost[Cost-based search]
	Cost --> Plan[Physical plan]
	Stats[Stats and constraints] --> Cost
	Indexes[Indexes / partitions / files] --> Cost
	Plan --> Execute[Execution]
EstimateWhy it matters
table cardinalityfull scan cost and join order
predicate selectivitywhether an index/filter is useful
join cardinalitymemory, spills, join order, broadcast decisions
row widthnetwork, memory, disk I/O
orderingwhether sorts can be avoided
distributionskew and partitioning in distributed engines

Bad estimates produce bad plans.

Common physical access paths:

  • full table scan
  • partition-pruned scan
  • index seek/range scan
  • index-only scan
  • bitmap index scan
  • file metadata skip / data skipping

Generic principle:

The engine can only avoid work when the storage layout, index, partitioning, or file metadata aligns with the predicate.

A predicate is sargable when the engine can use an access path efficiently.

Less friendly:

WHERE DATE(order_ts) = DATE '2026-01-01'

Better:

WHERE order_ts >= TIMESTAMP '2026-01-01 00:00:00'
	AND order_ts < TIMESTAMP '2026-01-02 00:00:00'

Avoid wrapping indexed/filter columns in functions when a range predicate expresses the same logic.

For a conceptual index on (customer_id, order_ts), this is useful:

WHERE customer_id = 42
	AND order_ts >= TIMESTAMP '2026-01-01 00:00:00'

This may be less useful:

WHERE order_ts >= TIMESTAMP '2026-01-01 00:00:00'

Leading columns matter in many row-store indexes. Column stores, data skipping, clustered files, and distributed systems have different mechanics, but the same idea remains: layout must match filters.

JoinGood whenRisk
nested loopouter side small, inner side indexedcatastrophic if both sides large
hash joinequality join, enough memoryspills if build side too large
merge joinboth sides sortedsort cost if not already ordered
broadcast joinone side small in distributed executionexecutor memory blowup if estimate wrong
shuffle hash/sort joinlarge distributed inputsnetwork and skew dominate

When you see a plan, ask:

  1. Which table is read first?
  2. Are filters pushed down?
  3. Are partitions/files pruned?
  4. What join order did the optimizer choose?
  5. Which join algorithm is used?
  6. Are row estimates close to actuals?
  7. Are there sorts, exchanges, repartitions, or spills?
  8. Is the output cardinality plausible at every step?

Generic pseudo-plan:

HashAggregate group by customer_id
  HashJoin orders.customer_id = customers.customer_id
    Filter orders.status = 'paid'
      Scan orders partitions=2026-01
    Scan customers

Strong read:

The paid filter is applied before the join, which is good. I would check whether the order-date predicate prunes partitions, whether customer scan is dimension-sized, and whether the post-join cardinality matches expected paid order rows.

Optimizers need facts.

  • row counts
  • distinct counts
  • histograms or frequency stats
  • NULL fraction
  • min/max metadata
  • primary/foreign key constraints
  • partition/file stats

If estimates are wrong, fixes may include updated stats, better predicates, materialized intermediates, constraints, data layout changes, or query rewrites.

Indexes speed reads but cost writes and storage.

Good index candidates:

  • high-selectivity filters
  • frequent joins
  • stable dimensions
  • common order-by/limit access paths
  • uniqueness constraints

Bad index candidates:

  • low-selectivity booleans alone
  • write-heavy ephemeral columns
  • columns rarely filtered or joined
  • every column “just in case”

In columnar analytics engines, performance often comes from:

  • reading only needed columns
  • partition pruning
  • file-level min/max skipping
  • clustering/sorting/Z-order-like layouts
  • compression
  • vectorized execution
  • avoiding tiny files

The equivalent of an “index conversation” may be a partitioning, clustering, file-size, and compaction conversation.

1. A query filters WHERE DATE(order_ts) = DATE '2026-06-01' and scans the whole table. What rewrite do you propose?

Use a half-open timestamp range.

WHERE order_ts >= TIMESTAMP '2026-06-01 00:00:00'
	AND order_ts < TIMESTAMP '2026-06-02 00:00:00'

This can use indexes, partition pruning, or file metadata on order_ts more effectively than wrapping the column in a function.

2. EXPLAIN shows estimated rows 1,000 but actual rows 200,000,000 after a join. What do you suspect?

Stale or missing stats, correlated predicates, many-to-many join amplification, missing uniqueness constraints, data skew, NULL-heavy join keys, or predicates that the optimizer cannot estimate well. Validate join cardinality and update stats before forcing hints.

3. Why can an index make a query slower?

If the predicate is not selective, many random index lookups can be slower than a sequential scan. Indexes also add write overhead and may mislead the optimizer when stats are bad.

4. What is the staff answer to "add an index"?

First define the query pattern, selectivity, write cost, storage cost, freshness of stats, and whether the engine is row-store, column-store, or distributed. Then choose an access path or layout that improves the dominant workload, and verify with EXPLAIN plus runtime metrics.