BigQuery columnar storage

How BigQuery Internally Stores Column Chunks

Admin · 5 min read · Feb 28, 2026

BigQuery stores table data in a columnar format (Capacitor) that builds on the column-striped storage described in Google's Dremel paper.


Storage Model

Each table is:

  • Column-oriented

  • Split into storage blocks

  • Compressed

  • Distributed
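The four properties above can be sketched in a few lines of Python. This is only a toy model, not BigQuery's actual format: the field names, the `BLOCK_ROWS` value, and the use of `zlib` are all assumptions chosen for illustration.

```python
import zlib

# Toy rows; the field names are hypothetical.
rows = [
    {"user_id": i, "event_type": "click" if i % 3 else "view", "country": "IN"}
    for i in range(1000)
]

BLOCK_ROWS = 250  # illustrative block size, not a BigQuery setting


def to_column_blocks(rows, block_rows):
    """Pivot rows into per-column value lists, then split each into blocks."""
    columns = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        columns[name] = [
            values[start:start + block_rows]
            for start in range(0, len(values), block_rows)
        ]
    return columns


def compress_block(values):
    """Serialize one block of a single column and compress it on its own."""
    raw = "\n".join(str(v) for v in values).encode("utf-8")
    return raw, zlib.compress(raw)


column_blocks = to_column_blocks(rows, BLOCK_ROWS)
for name, blocks in column_blocks.items():
    raw_size = sum(len(compress_block(b)[0]) for b in blocks)
    packed_size = sum(len(compress_block(b)[1]) for b in blocks)
    print(f"{name}: {len(blocks)} blocks, {raw_size} -> {packed_size} bytes")
```

Because each block holds values of a single column, similar values sit next to each other and compress well. The low-cardinality `country` column shrinks the most in this sketch.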


Column Chunks

Each column:

  • Stored separately

  • Broken into chunks

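  • Described by metadata that tracks the min/max values in each chunk

This enables:

  • Column pruning: a query reads only the columns it references

  • Predicate pushdown: filters are evaluated during the storage scan

  • Block skipping: chunks whose min/max range cannot satisfy a filter are never read (partition pruning applies the same idea at the partition level, but requires a partitioned table)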
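To illustrate how min/max metadata lets a scan skip chunks, here is a minimal sketch. The `Chunk` class, `build_chunks`, and `scan_greater_than` are hypothetical names invented for this example and do not correspond to any BigQuery API.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    values: list
    min_value: int
    max_value: int


def build_chunks(values, chunk_size):
    """Split a column into chunks and record min/max metadata for each."""
    chunks = []
    for start in range(0, len(values), chunk_size):
        part = values[start:start + chunk_size]
        chunks.append(Chunk(part, min(part), max(part)))
    return chunks


def scan_greater_than(chunks, threshold):
    """Return matching values and the number of chunks actually read."""
    matches, chunks_read = [], 0
    for chunk in chunks:
        if chunk.max_value <= threshold:
            continue  # metadata proves nothing here can match: skip the chunk
        chunks_read += 1
        matches.extend(v for v in chunk.values if v > threshold)
    return matches, chunks_read


# Sorted data gives tight, non-overlapping min/max ranges per chunk.
timestamps = list(range(1000))
chunks = build_chunks(timestamps, chunk_size=100)
matches, chunks_read = scan_greater_than(chunks, threshold=899)
print(len(matches), "matches from", chunks_read, "of", len(chunks), "chunks")
```

With sorted input, the filter `> 899` touches only one of the ten chunks. If the same values were shuffled, every chunk's range would overlap the filter and nothing could be skipped.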


Why SELECT * Is Expensive

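Because:

  • BigQuery reads every column of the table

  • Including columns the query never uses

  • This increases I/O and, under on-demand pricing, the bytes billed

Better: select only the columns you need: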

 
SELECT user_id, event_type
FROM mydataset.events  -- hypothetical table name; substitute your own
 
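A rough way to see the difference is to add up per-column sizes. The sizes below are made-up numbers for a hypothetical table, and `bytes_scanned` is an illustrative helper, not a BigQuery function.

```python
# Hypothetical per-column sizes in bytes for one table.
column_bytes = {
    "user_id": 8_000_000,
    "event_type": 12_000_000,
    "payload": 900_000_000,
    "user_agent": 150_000_000,
}


def bytes_scanned(selected, column_bytes):
    """Sum the sizes of the referenced columns; '*' means every column."""
    if selected == "*":
        selected = list(column_bytes)
    return sum(column_bytes[name] for name in selected)


select_star = bytes_scanned("*", column_bytes)
select_two = bytes_scanned(["user_id", "event_type"], column_bytes)
print(f"SELECT *: {select_star:,} bytes")
print(f"SELECT user_id, event_type: {select_two:,} bytes")
print(f"reduction: {1 - select_two / select_star:.1%}")
```

In a columnar layout the cost of a query follows the width of the columns it names, so dropping one large column such as `payload` matters far more than dropping several small ones.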

Nested & Repeated Fields

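BigQuery stores nested and repeated fields in a flattened columnar structure: each leaf field becomes its own column, and repetition/definition levels (the encoding described in the Dremel paper) record where each value sits in the nested record so the original structure can be rebuilt.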

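Benefits:

  • Avoids joins: child records are stored inside the parent row

  • Reduces shuffle: without a join, less data moves between workers

  • Improves performance for queries that read parent and child data together

This is why denormalizing into nested and repeated fields is encouraged in BigQuery.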
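The repetition/definition levels mentioned above can be sketched for the simplest case, a single top-level repeated field. This follows the scheme from the Dremel paper rather than BigQuery's exact on-disk layout, and `encode_repeated` and `decode_repeated` are hypothetical helper names.

```python
def encode_repeated(records):
    """Stripe one top-level repeated field into (value, rep, def) triples."""
    column = []
    for values in records:
        if not values:
            column.append((None, 0, 0))  # empty list: placeholder, def level 0
            continue
        for index, value in enumerate(values):
            rep = 0 if index == 0 else 1  # 0 starts a record, 1 continues it
            column.append((value, rep, 1))
    return column


def decode_repeated(column):
    """Rebuild the original lists from the striped column."""
    records = []
    for value, rep, definition in column:
        if rep == 0:
            records.append([])  # repetition level 0 marks a new record
        if definition == 1:
            records[-1].append(value)  # def level 1 means a real value exists
    return records


records = [["a", "b"], [], ["c"]]
column = encode_repeated(records)
print(column)
assert decode_repeated(column) == records
```

The column is a flat sequence of values plus two small integers per entry, yet it round-trips back to the nested records. Deeper nesting uses higher level numbers but the same idea.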


© Copyright 2024. All Rights Reserved by Learningdhara Community LLP