perf: streaming reads for large event streams (PostgreSQL/SQLite) #364

Open
opened 2026-04-12 11:56:25 -07:00 by jwilger-ai-bot · 0 comments (Migrated from github.com)

Problem

Both the PostgreSQL (fetch_all()) and SQLite backends collect the entire event history into a Vec before returning from read_stream() (a sketch of this pattern follows the list below). For large streams (10K+ events), this means:

  • Peak memory = entire stream deserialized in memory at once
  • All deserialization must complete before the first event can be processed
  • No opportunity for the consumer (e.g., state reconstruction fold) to process events incrementally
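For illustration, here is a minimal sketch of the collect-everything shape described above, assuming a sqlx-backed PostgreSQL store; the table layout, column names, and return type are placeholders rather than the actual eventcore internals:

```rust
// Illustrative only: hypothetical names, not the real eventcore backend code.
use sqlx::{PgPool, Row};

async fn read_stream_collected(
    pool: &PgPool,
    stream_id: &str,
) -> Result<Vec<serde_json::Value>, sqlx::Error> {
    // fetch_all() buffers every matching row before returning, so peak memory
    // is the whole stream and nothing can be processed until this completes.
    let rows = sqlx::query(
        "SELECT payload FROM events WHERE stream_id = $1 ORDER BY version",
    )
    .bind(stream_id)
    .fetch_all(pool)
    .await?;

    // All deserialization also happens up front, before the caller sees
    // the first event.
    rows.into_iter()
        .map(|row| row.try_get::<serde_json::Value, _>("payload"))
        .collect()
}
```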

Proposed Solution

Return a streaming iterator or async stream from read_stream() that deserializes and yields events one at a time (or in small batches). The EventStreamReader type could wrap a stream internally while maintaining the current API for consumers that want to collect.

This would require changes to the EventStore trait's read_stream return type.
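One possible shape for that change, not the actual eventcore API: the placeholder types below (StreamId, StoredEvent, StoreError) stand in for whatever eventcore-types already defines, and the return type uses futures::stream::BoxStream so each event is yielded (and can fail) individually:

```rust
// Sketch only: StreamId, StoredEvent, and StoreError are placeholders, not
// the real eventcore-types definitions.
use futures::stream::BoxStream;

pub struct StreamId(pub String);

pub struct StoredEvent {
    pub version: u64,
    pub payload: serde_json::Value,
}

#[derive(Debug)]
pub enum StoreError {
    Io(String),
    Deserialize(String),
}

#[async_trait::async_trait]
pub trait EventStore {
    /// Yields events lazily instead of returning a fully collected Vec.
    async fn read_stream(
        &self,
        stream: &StreamId,
    ) -> Result<BoxStream<'static, Result<StoredEvent, StoreError>>, StoreError>;
}
```

On the consumer side this lets a state-reconstruction fold process events as they arrive, which is where the overlap between deserialization and processing comes from:

```rust
use futures::TryStreamExt;

// Fold over the stream one event at a time; only one event (plus the
// backend's own row buffer) is held in memory at any point.
async fn reconstruct_state(
    store: &impl EventStore,
    stream: &StreamId,
) -> Result<u64, StoreError> {
    let mut events = store.read_stream(stream).await?;
    let mut last_version = 0;
    while let Some(event) = events.try_next().await? {
        last_version = event.version; // real code would apply the event here
    }
    Ok(last_version)
}
```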

Expected Impact

  • Reduced peak memory for large streams
  • Same or slightly better latency (overlapping deserialization with processing)
  • Most impactful for projection catch-up on streams with thousands of events

Caveats

This is an API-level change to the EventStore trait that would affect all backend implementations. The current EventStreamReader with into_iter() may need to become a Stream or provide both collected and streaming modes.
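If the dual-mode route is taken, one possible sketch (reusing the placeholder StoredEvent and StoreError types from the sketch above) is an EventStreamReader that wraps the stream and offers both a lazy and a collected consumption path; the method names here are illustrative, not the existing API:

```rust
use futures::stream::BoxStream;
use futures::TryStreamExt;

// Illustrative dual-mode reader; relies on the placeholder StoredEvent and
// StoreError types defined in the earlier sketch.
pub struct EventStreamReader {
    inner: BoxStream<'static, Result<StoredEvent, StoreError>>,
}

impl EventStreamReader {
    /// Streaming mode: hand the underlying stream to incremental consumers.
    pub fn into_stream(self) -> BoxStream<'static, Result<StoredEvent, StoreError>> {
        self.inner
    }

    /// Collected mode: buffer everything, matching today's Vec-based
    /// behaviour for callers that still iterate eagerly.
    pub async fn collect_all(self) -> Result<Vec<StoredEvent>, StoreError> {
        self.inner.try_collect().await
    }
}
```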

Location

  • eventcore-types/src/store.rs: EventStore::read_stream return type, EventStreamReader
  • All backend implementations

Benchmark Baseline

Run cargo bench -p eventcore-bench --bench store_operations -- 'store/read_stream' to measure before/after.
