Snapshot Support for Performance #248

Open

opened 2025-12-28 20:00:19 -08:00 by jwilger · 1 comment

jwilger commented

2025-12-28 20:00:19 -08:00

(Migrated from github.com)

Overview

Optimize state reconstruction for long-lived streams by periodically saving snapshots and starting reconstruction from the latest snapshot instead of from version 0.

Depends on: #247 (Performance Benchmarking Suite) - need performance data to determine if snapshots are necessary and what snapshot frequency makes sense.

Design

SnapshotStore Trait

save_snapshot(stream_id, version, state) method
load_snapshot(stream_id) returns (version, state)
Snapshots stored alongside events
Automatic snapshot creation at configurable intervals
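
As a rough illustration of the trait described above, here is a minimal sketch. Only the method names `save_snapshot` and `load_snapshot` and the `(version, state)` return shape come from this issue; the generic parameter, the `&str` stream id, the `InMemorySnapshotStore` type, and the `should_snapshot` interval helper are all hypothetical and do not reflect any existing EventCore API.

```rust
use std::collections::HashMap;

/// Hypothetical trait shape; signatures are illustrative, not EventCore's API.
pub trait SnapshotStore<S> {
    /// Persist `state` as it was after applying events up to `version`.
    fn save_snapshot(&mut self, stream_id: &str, version: u64, state: S);
    /// Return the newest `(version, state)` for the stream, if any.
    fn load_snapshot(&self, stream_id: &str) -> Option<(u64, S)>;
}

/// Assumed in-memory implementation that keeps one snapshot per stream.
pub struct InMemorySnapshotStore<S> {
    snapshots: HashMap<String, (u64, S)>,
}

impl<S> InMemorySnapshotStore<S> {
    pub fn new() -> Self {
        Self { snapshots: HashMap::new() }
    }
}

impl<S: Clone> SnapshotStore<S> for InMemorySnapshotStore<S> {
    fn save_snapshot(&mut self, stream_id: &str, version: u64, state: S) {
        // Ignore snapshots that are not newer than the one already held.
        let is_stale = self
            .snapshots
            .get(stream_id)
            .map_or(false, |(held, _)| *held >= version);
        if !is_stale {
            self.snapshots.insert(stream_id.to_string(), (version, state));
        }
    }

    fn load_snapshot(&self, stream_id: &str) -> Option<(u64, S)> {
        self.snapshots.get(stream_id).cloned()
    }
}

/// Assumed interval policy: snapshot whenever the stream version is a
/// non-zero multiple of `interval`; an interval of 0 disables snapshots.
pub fn should_snapshot(version: u64, interval: u64) -> bool {
    interval > 0 && version > 0 && version % interval == 0
}
```

A real implementation would also need to serialize the state and store it alongside the events, which this sketch leaves out.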

Executor Integration

Check for snapshot before reading events
If snapshot exists, start from snapshot version
Apply only events after snapshot
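
The three steps above could look roughly like the following. This is a sketch under stated assumptions: `CounterEvent`, `apply`, and `reconstruct` are hypothetical names, the state is a plain `i64`, and a slice stands in for the event store (where a real executor would issue a read-after-version query instead of skipping in memory).

```rust
/// Hypothetical event type for a simple counter.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CounterEvent {
    Incremented(i64),
    Reset,
}

/// Fold one event into the current state.
pub fn apply(state: i64, event: &CounterEvent) -> i64 {
    match event {
        CounterEvent::Incremented(n) => state + n,
        CounterEvent::Reset => 0,
    }
}

/// Rebuild a stream's state and return `(version, state)`.
///
/// `events[i]` is the event that moved the stream from version `i` to
/// version `i + 1`. With a snapshot, only events after the snapshot
/// version are applied; without one, the fold starts from version 0.
pub fn reconstruct(events: &[CounterEvent], snapshot: Option<(u64, i64)>) -> (u64, i64) {
    // Steps 1 and 2: start from the snapshot if one exists, else from scratch.
    let (start_version, start_state) = snapshot.unwrap_or((0, 0));
    // Step 3: skip everything the snapshot already accounts for.
    let skip = (start_version as usize).min(events.len());
    let state = events[skip..].iter().fold(start_state, apply);
    (start_version.max(events.len() as u64), state)
}
```

Replaying from a snapshot should give the same result as a full replay, provided the snapshot was taken from the same stream at the stated version.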

Benchmark-Driven

Use benchmark data to determine optimal snapshot frequency

Acceptance Criteria

Benchmark data reviewed to determine if snapshots needed
SnapshotStore trait defined
Snapshot save/load implemented
Executor loads from snapshot when available
Configurable snapshot frequency
Benchmark documents performance improvement

Migrated from beads issue: eventcore-012

jwilger commented

2026-06-13 10:58:06 -07:00

Owner

Benchmark-gated decision: defer to 1.1.0.

This issue's first acceptance criterion is to review benchmark data to determine whether snapshots are needed. Running `cargo bench -p eventcore-bench --bench execute -- execute/single_stream/warm` (command execution with state reconstruction, in-memory store, on streams pre-seeded with N events) gives:

| Stream length | Execute time (incl. reconstruction) |
| --- | --- |
| 10 events | ~2.0 ms |
| 100 events | ~1.8 ms |
| 1000 events | ~5.0 ms |

Reconstruction is dominated by fixed per-execute overhead through ~1000 events; the marginal per-event cost is only ~3-5 µs. State reconstruction therefore only becomes a bottleneck at tens of thousands of events in a single stream.
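
For reference, the marginal-cost figure can be reproduced from the measurements above with a back-of-the-envelope calculation. The helpers below are hypothetical, and the projection assumes cost keeps growing linearly past 1000 events on the in-memory store, which the benchmark does not verify.

```rust
/// Benchmark points from this comment: (stream length in events, execute time in ms).
pub const MEASURED: [(f64, f64); 3] = [(10.0, 2.0), (100.0, 1.8), (1000.0, 5.0)];

/// Marginal per-event cost in microseconds between two measured points.
pub fn marginal_cost_us(a: (f64, f64), b: (f64, f64)) -> f64 {
    (b.1 - a.1) * 1000.0 / (b.0 - a.0)
}

/// Linear projection (an assumption, not a measurement) of execute time in ms
/// for a stream of `events` events, anchored at the measured point `base`.
pub fn projected_ms(base: (f64, f64), per_event_us: f64, events: f64) -> f64 {
    base.1 + (events - base.0) * per_event_us / 1000.0
}

/// Slope between the 100-event and 1000-event points, projected to 10k events.
pub fn summary() -> (f64, f64) {
    let per_event = marginal_cost_us(MEASURED[1], MEASURED[2]); // about 3.56 µs
    let at_10k = projected_ms(MEASURED[1], per_event, 10_000.0); // about 37 ms
    (per_event, at_10k)
}
```

Under that linear assumption, a 10k-event stream would cost on the order of tens of milliseconds per execute.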

EventCore's multi-stream / dynamic-consistency-boundary design discourages giant single-aggregate streams, and the DB-read / peak-memory dimensions are already addressed by batch INSERTs (#360/#362), single-serialization (#361), and streaming reads (#364, ADR-0049). Snapshots would add substantial complexity — a SnapshotStore trait, per-command-state serialization, read-after-version support on EventStore, and opt-in executor wiring — for a cost that is not yet a demonstrated bottleneck.

Decision: defer to the 1.1.0 milestone. Revisit when real-world profiling identifies long-lived streams (10k+ events) as an actual performance problem; at that point the read-after-version capability would build naturally on the streaming-read API introduced in #364.

jwilger modified the milestone from 1.0.0 to 1.1.0

2026-06-13 10:58:06 -07:00