
Event-Driven Near-Real-Time Global Lakehouse


Admin, Feb 28, 2026

Now we combine streaming, the lakehouse, and the warehouse into one event-driven design.


🔹 Global Streaming Layer

Each region:

  • Real-time ingestion (Pub/Sub, Kafka, or an equivalent)

  • Micro-batch landing tables

  • Partition by event_time
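Partitioning by event_time means a late-arriving event still lands in the partition for when it happened, not when it arrived. A minimal sketch, assuming hourly partitions and events represented as plain dicts (the `partition_key` and `land` helpers are hypothetical names, not part of any specific platform):

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(event_time: datetime) -> str:
    """Hourly partition key derived from the event's own timestamp (UTC)."""
    return event_time.astimezone(timezone.utc).strftime("%Y-%m-%d-%H")


def land(events):
    """Group events into partitions by event_time, not by arrival order."""
    partitions = defaultdict(list)
    for event in events:
        partitions[partition_key(event["event_time"])].append(event)
    return dict(partitions)


events = [
    {"id": 1, "event_time": datetime(2026, 2, 28, 10, 59, tzinfo=timezone.utc)},
    {"id": 2, "event_time": datetime(2026, 2, 28, 11, 1, tzinfo=timezone.utc)},
    # Late arrival: processed last, but still belongs to the 10:00 partition.
    {"id": 3, "event_time": datetime(2026, 2, 28, 10, 30, tzinfo=timezone.utc)},
]
landed = land(events)
```

In a real landing table the partition column would usually be declared in the table definition; the grouping logic is the same.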


🔹 Micro-Batch Strategy

Avoid pure row-by-row streaming at 100PB scale.

Instead:

  • 1–5 minute micro-batches

  • Merge into partitioned tables

  • Periodic compaction

This reduces storage fragmentation: the table ends up with fewer, larger files instead of many small ones.
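The three steps can be sketched in memory. This is an illustration under stated assumptions, not a storage engine: a table is modeled as a dict of partition to a list of "files", each file is a dict of key to row, and the 2-minute window is an assumed value inside the 1–5 minute range. All function names are hypothetical.

```python
from collections import defaultdict

BATCH_SECONDS = 120  # assumed 2-minute window


def batch_id(epoch_seconds: int) -> int:
    """Start of the micro-batch window that an event timestamp falls into."""
    return epoch_seconds - epoch_seconds % BATCH_SECONDS


def merge_batch(table, batch):
    """Write one new file per touched partition; rows are keyed for upsert."""
    files = defaultdict(dict)
    for row in batch:
        files[row["partition"]][row["key"]] = row
    for partition, rows in files.items():
        table[partition].append(rows)


def compact(table, partition):
    """Rewrite a partition's small files as one file; newest row per key wins."""
    merged = {}
    for small_file in table[partition]:
        merged.update(small_file)
    table[partition] = [merged]


table = defaultdict(list)
merge_batch(table, [{"partition": "2026-02-28", "key": "a", "value": 1}])
merge_batch(table, [{"partition": "2026-02-28", "key": "a", "value": 2},
                    {"partition": "2026-02-28", "key": "b", "value": 3}])
files_before = len(table["2026-02-28"])  # one small file per micro-batch
compact(table, "2026-02-28")
```

Each micro-batch adds a file, so file counts grow with batch frequency; compaction is what keeps the number of files per partition bounded.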


🔹 Near-Real-Time Serving Pattern

  1. Raw stream → landing table

  2. Transform → enriched table

  3. Update aggregates incrementally

  4. BI queries read the incremental aggregates
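The key property of this flow is that each micro-batch is folded into running totals rather than triggering a full recompute. A rough sketch, assuming events are dicts with `user` and `amount` fields and that enrichment is a simple lookup (all names here are hypothetical):

```python
from collections import Counter


def enrich(raw_event, country_by_user):
    """Step 2: attach a dimension (here, country) to the raw event."""
    country = country_by_user.get(raw_event["user"], "unknown")
    return {**raw_event, "country": country}


def update_aggregates(aggregates, enriched_batch):
    """Step 3: fold only the new batch into the running totals."""
    for event in enriched_batch:
        aggregates[event["country"]] += event["amount"]
    return aggregates


country_by_user = {"u1": "IN", "u2": "DE"}
revenue_by_country = Counter()  # step 4: the table that BI reads

batch_1 = [{"user": "u1", "amount": 10}, {"user": "u2", "amount": 5}]
batch_2 = [{"user": "u1", "amount": 7}, {"user": "u3", "amount": 1}]
for batch in (batch_1, batch_2):  # step 1: batches arriving from landing
    enriched = [enrich(event, country_by_user) for event in batch]
    update_aggregates(revenue_by_country, enriched)
```

The cost of each update depends on the batch size, not on the total history, which is what makes minute-level refresh practical for additive metrics such as sums and counts.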

Target SLA:

  • 2–10 minute freshness

  • Not sub-second
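A freshness target like this is easy to monitor. The sketch below assumes the pipeline records a last-refresh timestamp per aggregate table and uses the 10-minute upper bound as the threshold; `is_fresh` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

FRESHNESS_SLA = timedelta(minutes=10)  # upper bound of the 2–10 minute target


def is_fresh(last_refresh: datetime, now: datetime) -> bool:
    """True when the aggregate was refreshed within the SLA window."""
    return now - last_refresh <= FRESHNESS_SLA


now = datetime(2026, 2, 28, 12, 0, tzinfo=timezone.utc)
on_time = is_fresh(now - timedelta(minutes=4), now)
stale = is_fresh(now - timedelta(minutes=15), now)
```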


🔥 Global Coordination Pattern

Each region publishes:

  • Aggregated hourly metrics

  • Feature deltas

  • Business KPIs

Global reporting aggregates regional outputs.

Never stream raw events across regions; only the aggregated outputs leave a region.
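As an illustration of this pattern, each region can reduce its raw events to hourly totals locally, and the global layer only ever sees those totals. This is a sketch with hypothetical names and an assumed event shape of `hour`, `metric`, and `value`:

```python
from collections import Counter


def regional_hourly_metrics(raw_events):
    """Runs inside a region: reduce raw events to (hour, metric) totals."""
    totals = Counter()
    for event in raw_events:
        totals[(event["hour"], event["metric"])] += event["value"]
    return totals


def global_report(regional_outputs):
    """Runs centrally: combines already-aggregated regional outputs only."""
    combined = Counter()
    for totals in regional_outputs.values():
        combined.update(totals)  # Counter.update adds counts together
    return combined


published = {
    "us": regional_hourly_metrics([
        {"hour": "2026-02-28T10", "metric": "orders", "value": 3},
        {"hour": "2026-02-28T10", "metric": "orders", "value": 2},
    ]),
    "eu": regional_hourly_metrics([
        {"hour": "2026-02-28T10", "metric": "orders", "value": 4},
    ]),
}
report = global_report(published)
```

Here three raw events become two published rows, and the cross-region payload stays proportional to the number of hours and metrics rather than the number of events. This works directly for additive metrics; non-additive ones such as distinct counts would need a mergeable representation.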


© Copyright 2024. All Rights Reserved by Learningdhara Community LLP