
BigQuery Adaptive Repartitioning

How BigQuery Adaptive Repartitioning Works

Admin
5 min read · Feb 28, 2026

The Core Problem

During shuffle:

 
Hash(key) → Partition → Slot
 

If key distribution is uneven:

  • Some partitions are huge

  • Some are tiny

  • Slowest partition determines stage runtime

This causes:

  • Memory pressure

  • Spill

  • Stragglers

  • Stage reattempts
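To make the skew concrete, here is a minimal Python sketch of hash partitioning over an uneven key distribution. It is an illustration only: `zlib.crc32` stands in for whatever hash the engine really uses, and the workload (one hot customer plus 1,000 small ones) and the 8-partition count are made-up assumptions.

```python
import zlib
from collections import Counter

def partition_sizes(keys, num_partitions):
    """Count rows per partition under hash(key) % num_partitions."""
    sizes = Counter()
    for key in keys:
        # crc32 is a stable stand-in for the engine's internal hash (assumption)
        sizes[zlib.crc32(key.encode()) % num_partitions] += 1
    return [sizes.get(p, 0) for p in range(num_partitions)]

# Hypothetical workload: one hot key with 90,000 rows, 1,000 keys with 10 rows each
keys = ["hot_customer"] * 90_000
keys += [f"customer_{i}" for i in range(1_000) for _ in range(10)]

sizes = partition_sizes(keys, num_partitions=8)
mean_size = sum(sizes) / len(sizes)
skew_ratio = max(sizes) / mean_size  # how much larger the worst partition is than average

print("rows per partition:", sizes)
print("skew ratio (max / mean):", round(skew_ratio, 2))
```

Every row of the hot key hashes to the same partition, so one slot ends up with at least 90% of the rows while the others finish almost immediately.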


Adaptive Repartitioning (Conceptually)

BigQuery repartitions data dynamically during execution when:

  • A partition grows too large

  • A worker becomes a straggler

  • Memory pressure crosses threshold

What Happens Internally

  1. Runtime detects skew

  2. Heavy partition is split into sub-partitions

  3. Work is redistributed across additional slots

  4. Slow stage rebalances

This is sometimes called dynamic fan-out.
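The four steps above can be sketched as a small Python function. This is a conceptual model, not BigQuery's actual algorithm: the `rebalance` helper, the `max_rows` threshold, and the row counts are all hypothetical, and the sketch ignores how the engine keeps results correct when one key's rows are spread over several sub-partitions.

```python
def rebalance(partition_sizes, max_rows):
    """Split every partition above max_rows into near-equal sub-partitions."""
    result = []
    for size in partition_sizes:
        if size <= max_rows:
            result.append(size)  # small enough, leave as is
            continue
        fan_out = -(-size // max_rows)  # ceiling division: sub-partitions needed
        base, extra = divmod(size, fan_out)
        # Spread the rows as evenly as possible across the sub-partitions
        result.extend(base + (1 if i < extra else 0) for i in range(fan_out))
    return result

# Hypothetical row counts for 8 partitions, one of them heavily skewed
before = [91_200, 1_300, 1_250, 1_240, 1_260, 1_250, 1_230, 1_270]
after = rebalance(before, max_rows=12_500)

print("partitions before:", len(before), "largest:", max(before))
print("partitions after: ", len(after), "largest:", max(after))
```

The total row count is unchanged. Only the largest unit of work shrinks, at the price of more partitions and therefore more slots.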


When It Triggers

  • Large GROUP BY cardinality

  • Hot join keys

  • Window functions on skewed keys

  • Large DISTINCT
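All of these cases come down to a few keys owning a large share of the rows. A quick way to check a sample of a key column for that pattern is sketched below in Python. The `hot_keys` helper, the 5% threshold, and the sample data are illustrative assumptions, not thresholds that BigQuery documents.

```python
from collections import Counter

def hot_keys(keys, share_threshold=0.05):
    """Return {key: share_of_rows} for keys at or above the threshold, largest first."""
    counts = Counter(keys)
    total = sum(counts.values())
    return {
        key: count / total
        for key, count in counts.most_common()
        if count / total >= share_threshold
    }

# Hypothetical sample of a join or GROUP BY key column
sample = ["US"] * 700 + ["DE"] * 150 + ["FR"] * 100 + [f"C{i}" for i in range(50)]
flagged = hot_keys(sample)

for key, share in flagged.items():
    print(f"{key}: {share:.1%} of rows")
```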


Tradeoffs

Adaptive repartitioning:

✅ Reduces worst-case skew
❌ Increases shuffle traffic
❌ Consumes more slots
❌ Increases slot-ms cost

So even when “fixed,” skew is still expensive.
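A toy cost model shows why. The Python sketch below is back-of-the-envelope arithmetic under stated assumptions: the processing rate, the reshuffle rate, and the row counts are invented for illustration, and real slot-ms accounting in BigQuery is more involved.

```python
def stage_cost(partition_rows, reshuffled_rows=0,
               rows_per_ms=100, shuffle_rows_per_ms=200):
    """Toy cost model; both rates are made-up illustration values."""
    return {
        # The slowest partition bounds how long the stage takes
        "longest_task_ms": max(partition_rows) / rows_per_ms,
        # Total work: processing every row plus moving the reshuffled rows
        "slot_ms": sum(partition_rows) / rows_per_ms
                   + reshuffled_rows / shuffle_rows_per_ms,
        # Rows sent over the network: initial shuffle plus the reshuffle
        "shuffled_rows": sum(partition_rows) + reshuffled_rows,
        "slots_used": len(partition_rows),
    }

skewed = [91_200, 1_300, 1_250, 1_240, 1_260, 1_250, 1_230, 1_270]
# Same data after the heavy partition is split into 8 sub-partitions
rebalanced = [11_400] * 8 + skewed[1:]

cost_skewed = stage_cost(skewed)
cost_rebalanced = stage_cost(rebalanced, reshuffled_rows=91_200)

print("skewed:    ", cost_skewed)
print("rebalanced:", cost_rebalanced)
```

In this model the longest task drops from 912 ms to 114 ms, but slot-ms rises from 1,000 to 1,456, shuffled rows nearly double, and the stage uses 15 slots instead of 8.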


Important Insight

At PB scale:

As a rule of thumb, preventing skew is roughly 10x cheaper than letting adaptive repartitioning fix it, because repartitioning multiplies network I/O.
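One common way to prevent skew up front is key salting with a two-stage aggregation. It is a general technique and not something this article attributes to BigQuery's internals. The Python sketch below assumes a simple SUM per key; the `salted_sum` helper and the salt count of 16 are hypothetical choices.

```python
import random
from collections import defaultdict

def salted_sum(rows, num_salts, seed=0):
    """Two-stage SUM: aggregate by (key, salt) first, then merge by key."""
    rng = random.Random(seed)

    # Stage 1: a random salt spreads each key over up to num_salts groups
    partial = defaultdict(int)
    for key, value in rows:
        partial[(key, rng.randrange(num_salts))] += value

    # Stage 2: drop the salt and combine the much smaller partial results
    final = defaultdict(int)
    for (key, _salt), subtotal in partial.items():
        final[key] += subtotal
    return dict(final), dict(partial)

# Hypothetical input: one hot key and one cold key, each row with value 1
rows = [("hot", 1)] * 10_000 + [("cold", 1)] * 10
totals, partials = salted_sum(rows, num_salts=16)

print("final totals:", totals)
print("largest stage-1 group:", max(partials.values()), "rows")
```

The final totals match a plain GROUP BY, but no single stage-1 group has to hold all 10,000 rows of the hot key. This works because SUM can be merged from partial results; aggregates that cannot be combined this way need a different approach.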

© Copyright 2024. All Rights Reserved by Learningdhara Community LLP