BQ Event Architecture

Designing a 1PB Event Architecture

Admin

5 min • Feb 28, 2026

Assumptions

1PB raw events
Billions of events/day
Real-time + batch analytics
100+ concurrent users

Layered Architecture

Layer 1: Raw Events

Table:

events_raw

Design:

Partition by event_date
Cluster by user_id
Nested schema
No joins required

Keep raw immutable.
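
The design above can be sketched as a small Python helper that renders BigQuery DDL. Only the table name, the partition column, and the clustering column come from the list above; the dataset name `analytics`, the column list, and the helper `build_raw_events_ddl` are illustrative assumptions.

```python
def build_raw_events_ddl(dataset: str = "analytics") -> str:
    """Render CREATE TABLE DDL for the raw layer (hypothetical columns)."""
    columns = [
        "event_id STRING NOT NULL",
        "event_date DATE NOT NULL",
        "event_ts TIMESTAMP NOT NULL",
        "user_id STRING",
        # Nested record: device attributes travel with the event, so no join.
        "device STRUCT<os STRING, model STRING>",
        # Repeated record: zero or more items per event.
        "items ARRAY<STRUCT<product_id STRING, quantity INT64>>",
    ]
    column_sql = ",\n  ".join(columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {dataset}.events_raw (\n"
        f"  {column_sql}\n"
        ")\n"
        "PARTITION BY event_date\n"
        "CLUSTER BY user_id"
    )


if __name__ == "__main__":
    print(build_raw_events_ddl())
```

Because the raw layer is meant to stay immutable, a pipeline built on this sketch would only append rows to the table and never update them in place.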

Layer 2: Enriched Events

Denormalize here.

Instead of joining at query time:

JOIN users
JOIN products
JOIN campaigns

Flatten those dimensions into each event row during ingestion.

This reduces shuffle at query time.
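
A minimal sketch of flattening at ingestion time, assuming the dimension data fits in memory as dictionaries. The lookup tables `USERS`, `PRODUCTS`, `CAMPAIGNS`, their attributes, and the function `enrich_event` are hypothetical names chosen for illustration.

```python
from typing import Any, Dict

# Hypothetical in-memory dimension lookups; in practice these would be
# loaded from the dimension tables before the ingestion job runs.
USERS = {"u1": {"country": "DE", "plan": "pro"}}
PRODUCTS = {"p9": {"category": "books"}}
CAMPAIGNS = {"c3": {"channel": "email"}}


def enrich_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the event with dimension attributes nested inside it."""
    enriched = dict(event)  # shallow copy, so the raw event is left untouched
    # A missing key yields an empty record instead of dropping the event.
    enriched["user"] = USERS.get(event.get("user_id"), {})
    enriched["product"] = PRODUCTS.get(event.get("product_id"), {})
    enriched["campaign"] = CAMPAIGNS.get(event.get("campaign_id"), {})
    return enriched


raw = {"event_id": "e1", "user_id": "u1", "product_id": "p9", "campaign_id": "c3"}
print(enrich_event(raw))
```

Each enriched row carries its own user, product, and campaign attributes, so a downstream query can filter on them without a join.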

Layer 3: Aggregation Tables

Create:

Daily user metrics
Session-level aggregates
Campaign performance rollups

BI tools query these tables and never hit the raw tables.
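
As an illustration of the first rollup, here is a sketch of a daily user metrics aggregation in plain Python. The field names `event_date`, `user_id`, and `revenue`, and the two output metrics, are assumptions; at this scale the same logic would run as a scheduled query rather than in application code.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def daily_user_metrics(events: Iterable[dict]) -> Dict[Tuple[str, str], dict]:
    """Roll events up to one row per (event_date, user_id)."""
    rollup: Dict[Tuple[str, str], dict] = defaultdict(
        lambda: {"event_count": 0, "revenue": 0.0}
    )
    for event in events:
        key = (event["event_date"], event["user_id"])
        rollup[key]["event_count"] += 1
        # Events without a revenue field contribute nothing to the total.
        rollup[key]["revenue"] += event.get("revenue", 0.0)
    return dict(rollup)


events = [
    {"event_date": "2026-02-27", "user_id": "u1", "revenue": 5.0},
    {"event_date": "2026-02-27", "user_id": "u1"},
    {"event_date": "2026-02-28", "user_id": "u2", "revenue": 2.5},
]
metrics = daily_user_metrics(events)
print(metrics)
```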

Storage Strategy

At 1PB:

Partition by date (mandatory)
Consider multi-column clustering (BigQuery allows up to four clustering columns)
Monitor partition size (avoid tiny partitions)
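
Monitoring partition size can be sketched as a check over per-partition byte counts. The function name `find_small_partitions` and the 1 GiB threshold are assumptions for illustration, not recommended values; the byte counts themselves would have to come from table metadata, for example BigQuery's `INFORMATION_SCHEMA.PARTITIONS` view.

```python
from typing import Dict, List

ONE_GIB = 1024**3


def find_small_partitions(
    partition_bytes: Dict[str, int], min_bytes: int = ONE_GIB
) -> List[str]:
    """Return the IDs of partitions smaller than min_bytes, sorted by ID."""
    return sorted(
        partition_id
        for partition_id, size in partition_bytes.items()
        if size < min_bytes
    )


# Hypothetical sizes keyed by partition ID (YYYYMMDD).
sizes = {
    "20260226": 40 * ONE_GIB,
    "20260227": 200 * 1024**2,  # about 200 MiB, below the threshold
    "20260228": 35 * ONE_GIB,
}
print(find_small_partitions(sizes))
```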

Streaming vs Batch

Streaming:

Higher cost
Lower latency
Rows land in a write-optimized streaming buffer before conversion to columnar storage

Batch load:

Cheaper
Better compression

For PB-scale:
→ Prefer batch ingestion where possible.
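
One way to move from per-event writes toward batch loads is to buffer events and hand them off in groups. The sketch below assumes a caller-supplied `flush` callback that performs the actual load; the class `BatchBuffer` and the `max_events` threshold are hypothetical.

```python
from typing import Callable, List


class BatchBuffer:
    """Collect events and pass them to a load callback in fixed-size batches."""

    def __init__(self, flush: Callable[[List[dict]], None], max_events: int = 1000):
        self._flush = flush
        self._max_events = max_events
        self._pending: List[dict] = []

    def add(self, event: dict) -> None:
        self._pending.append(event)
        if len(self._pending) >= self._max_events:
            self.flush()

    def flush(self) -> None:
        """Send whatever is pending; does nothing when the buffer is empty."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._flush(batch)


loaded: List[List[dict]] = []  # stands in for a real batch load job
buffer = BatchBuffer(flush=loaded.append, max_events=3)
for i in range(7):
    buffer.add({"event_id": i})
buffer.flush()  # push the final partial batch
print([len(batch) for batch in loaded])
```

A larger `max_events` means fewer, bigger loads at the price of higher latency, which is the trade-off listed above.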

Schema Design Principles

Use:

Nested/repeated fields

Avoid:

Snowflake schemas
Joins against small dimension tables

Denormalization removes query-time joins, and with them much of the shuffle.

Biggest Cost Driver at 1PB

Not storage.

It’s shuffle-heavy ad-hoc joins on raw data.
