Architecture
For the design rationale behind kassette, see Why kassette.
1. The model
Each run has one journal, and that journal is the only source of truth.
Every invocation starts by reading the journal from the beginning to rebuild run state before running the workflow from the top. If a step is already recorded, kassette returns the recorded result instead of running it again. Execution goes live at the first unrecorded step, and new completed work is appended.
2. Concepts
A run is one execution of a task, identified by a runId. It owns exactly one journal, and that journal is isolated from other runs at the storage level. A run either reaches a terminal state (completed, failed, cancelled) or remains open so the caller can invoke it again with the same runId to pick up where it left off.
A session is one continuous execution attempt within a run, identified by a monotonically increasing session number. Every new session writes a session number greater than every number already in the run’s journal. An initial invocation, a retry after an interruption, and a resume after a suspend all open new sessions.
A step is one unit of recorded work inside a session. Each record() call is a step. The first time it runs, kassette runs the function and writes the result to the journal. In later sessions, kassette returns the journaled result instead, so the function does not run again.
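As a rough illustration of how runs, sessions, and steps might map onto data, the sketch below models a journal as a list of entries and derives the next session number from it. The entry shape, the field names (kind, session, stepId, result), and the helper nextSessionNumber are assumptions made for this example, not kassette's actual schema or API.

```typescript
// Hypothetical entry shape: field names are illustrative, not kassette's schema.
type Entry =
  | { kind: "start"; session: number }
  | { kind: "step"; session: number; stepId: string; result: unknown };

// A new session must use a number greater than every number already journaled.
function nextSessionNumber(journal: readonly Entry[]): number {
  let highest = 0;
  for (const entry of journal) {
    if (entry.session > highest) highest = entry.session;
  }
  return highest + 1;
}

const journal: Entry[] = [
  { kind: "start", session: 1 },
  { kind: "step", session: 1, stepId: "fetch", result: { ok: true } },
];
const next = nextSessionNumber(journal); // a retry of this run would open session 2
```

Deriving the number from the journal itself, rather than from a separate counter, keeps the journal the only source of truth.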
3. Lifecycle
Run "task-abc"
  Session 1: step, step, step → crash
  Session 2: replay(3), step, step → suspend(waiting on CI pipeline)
    ↳ process exits, releases all resources
    ↳ caller re-invokes with the resume payload
  Session 3: replay(5), step → complete
A crash appends nothing. If the journal has no terminal entry, the run is still open and may be continued by a later session.
The state machine is derivable from the journal alone.
                               ┌──step──┐
                               │        │
                               ▼        │
┌─────┐   start   ┌────────────────┐────┘  complete   ╔═══════════════╗
│ new │──────────▶│   unsettled    │────────────────▶ ║   completed   ║
└─────┘     ┌────▶└─┬───────────┬──┘                  ╚═══════════════╝
            │       │           │
            │       │ suspend   │ fail                ╔═══════════════╗
            │       │           └───────────────────▶ ║    failed     ║
     resume │       │                                 ╚═══════════════╝
            │       ▼
            │   ┌────────────────┐
            └───│   suspended    │
                └───────┬────────┘
                        │ timeout                     ╔═══════════════╗
                        └───────────────────────────▶ ║   cancelled   ║
                                                      ╚═══════════════╝
Read the arrows as lifecycle transitions:
start opens a session.
step records completed work and leaves the run open.
suspend ends the current session while the run waits for an event.
resume opens a new session and records the event payload.
complete writes a complete entry, making the run completed.
fail writes an error entry, making the run failed.
timeout is checked only when a later start, resume, or fork initializes the run; if the suspend deadline has expired, kassette writes cancel, making the run cancelled.
Only completed, failed, and cancelled are terminal states.
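The transitions above can be expressed as a fold over the journal. The sketch below is one possible reading, assuming each entry carries a kind named after the entries mentioned in this section (start, step, suspend, resume, complete, error, cancel). The type names and the deriveState helper are hypothetical.

```typescript
// Hypothetical entry kinds, named after the journal entries described above.
type Kind = "start" | "step" | "suspend" | "resume" | "complete" | "error" | "cancel";
type RunState = "new" | "unsettled" | "suspended" | "completed" | "failed" | "cancelled";

// Fold the journal into a run state; no other input is consulted.
function deriveState(kinds: readonly Kind[]): RunState {
  let state: RunState = "new";
  for (const kind of kinds) {
    switch (kind) {
      case "complete":
        return "completed";
      case "error":
        return "failed";
      case "cancel":
        return "cancelled";
      case "suspend":
        state = "suspended";
        break;
      default:
        state = "unsettled"; // start, step, and resume leave the run open
    }
  }
  return state;
}

// A crash appends nothing, so a crashed session simply leaves the run unsettled.
const afterCrash = deriveState(["start", "step", "step", "step"]);
const afterSuspend = deriveState(["start", "step", "suspend"]);
```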
4. Replay
Each new session runs the workflow again from the top, but previously recorded work is not repeated.
When record(name, fn) is reached, kassette looks for the matching journaled step. If it finds one, it returns the recorded result and skips fn. If it does not, fn runs and its result is appended as a new step.
Each waitForEvent(name) call is checked the same way. If the journal already has a matching resume entry, kassette returns the recorded value. Otherwise, it writes a suspend entry and unwinds the workflow so the process can exit.
Replay is correct only if each session reaches the same record() and waitForEvent() calls in the same order. Step IDs are positional (name, name#2, name#3), so removing, reordering, or conditionally skipping a call can attach an old result to the wrong code. Concurrent branches need scope-based namespacing; see Concurrency.
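To make positional matching concrete, here is a minimal in-memory sketch of replay. It is not kassette's implementation: the Session class, the StepEntry shape, and the executed list are invented for illustration, and record() is synchronous here only to keep the example short.

```typescript
// Hypothetical journal entry for a completed step.
type StepEntry = { stepId: string; result: unknown };

class Session {
  private counts = new Map<string, number>();
  private journal: StepEntry[];
  executed: string[] = []; // step IDs that ran live in this session

  constructor(journal: StepEntry[]) {
    this.journal = journal;
  }

  // Positional ID: the first call is "name", then "name#2", "name#3", ...
  private nextStepId(name: string): string {
    const n = (this.counts.get(name) ?? 0) + 1;
    this.counts.set(name, n);
    return n === 1 ? name : `${name}#${n}`;
  }

  record<T>(name: string, fn: () => T): T {
    const stepId = this.nextStepId(name);
    const hit = this.journal.find((e) => e.stepId === stepId);
    if (hit) return hit.result as T; // replay: fn is skipped
    const result = fn(); // live: run the function, then append the result
    this.journal.push({ stepId, result });
    this.executed.push(stepId);
    return result;
  }
}

const journal: StepEntry[] = [];
const workflow = (s: Session): number => {
  const a = s.record("fetch", () => 1);
  const b = s.record("fetch", () => 2);
  return a + b;
};

const first = new Session(journal);
workflow(first); // runs "fetch" and "fetch#2" live
const second = new Session(journal);
const total = workflow(second); // replays both steps; neither function runs again
```

If the second record("fetch", ...) call were removed from the workflow between sessions, any later call named fetch would shift down one position and pick up the wrong journaled result, which is the failure mode described above.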
Any non-deterministic work, such as LLM calls or tool calls, should be wrapped in record() calls.
Modifying the workflow’s code between sessions can also shift the call order and silently break replay. The optional version field on start() is a deployment-level guard against this, but it only signals a version mismatch; it does not identify the changed step or prevent all drift. See Versioning for details.
5. Suspend, resume, and timeouts
When you need a workflow to wait on something, like human approval or a webhook, you can suspend it without keeping the process alive until you’re ready to resume it. This is done through waitForEvent and resume.
To suspend, call waitForEvent. It writes a suspend entry under the current session, then throws an exception to return control to the caller so that the process can exit.
To resume, call resume(runId, name, value) to open a new session and record a resume entry with the event value passed in. On replay, waitForEvent(name) will find that entry and return its value instead of suspending again. (Calling resume more than once for the same event is safe because the first recorded value wins, and later values are ignored.)
A suspend may include a deadline, but note that kassette doesn’t poll or run timers. Deadlines are only checked when a new session is opened by start, resume, or fork. If the deadline has passed and no matching resume exists, kassette writes a cancel entry and the run becomes terminal.
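The suspend, resume, and deadline rules can be sketched in a few lines. Everything below is a simplified model under stated assumptions: the entry shapes, the Suspended exception, the explicit journal parameter, and the checkDeadline helper are hypothetical, and deadlines are plain millisecond timestamps.

```typescript
// Hypothetical entry shapes; field names are assumptions.
type Entry =
  | { kind: "suspend"; event: string; deadline: number | null }
  | { kind: "resume"; event: string; value: unknown }
  | { kind: "cancel"; reason: string };

// Thrown to unwind the workflow so the process can exit.
class Suspended extends Error {
  event: string;
  constructor(event: string) {
    super(`suspended on ${event}`);
    this.event = event;
  }
}

function waitForEvent(journal: Entry[], event: string, deadline: number | null = null): unknown {
  for (const e of journal) {
    // The first recorded value wins; later resume entries are never reached.
    if (e.kind === "resume" && e.event === event) return e.value;
  }
  journal.push({ kind: "suspend", event, deadline });
  throw new Suspended(event);
}

// Runs only when a new session opens; there is no timer and no polling.
function checkDeadline(journal: Entry[], now: number): boolean {
  for (const e of journal) {
    if (e.kind !== "suspend") continue;
    const { event, deadline } = e;
    if (deadline === null) continue;
    const resumed = journal.some((r) => r.kind === "resume" && r.event === event);
    if (!resumed && now > deadline) {
      journal.push({ kind: "cancel", reason: `timeout waiting for ${event}` });
      return true; // the run is now terminal
    }
  }
  return false;
}

const journal: Entry[] = [];
let suspended = false;
try {
  waitForEvent(journal, "approval", 1000);
} catch {
  suspended = true; // the first session ends here
}
journal.push({ kind: "resume", event: "approval", value: "yes" }); // what resume() would record
const value = waitForEvent(journal, "approval", 1000); // the replay returns the recorded value
```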
6. Properties of the journal
Append-only. Entries are never modified or deleted. Once work is recorded, it stays settled.
Ordered. Entries are read in append order. Replay depends on this because recorded results are matched to record() calls by position.
Atomic. Each entry is written completely or not at all.
Fenced. Only the current session can append. Writes from a superseded session are rejected.
JSON. The journal is newline-delimited JSON with one entry per line and one file per run. Inspection requires no library and no kassette dependency.
Self-contained. Everything needed for replay is in the journal. As long as you can access the journal, the run can continue.
Per-run isolated. Each runId has its own journal, so unrelated runs cannot interfere with each other.
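Because the journal is newline-delimited JSON, it can be inspected with nothing more than a JSON parser. The sample lines below are invented for illustration; the real entry fields may differ.

```typescript
// Sample journal text; the entry fields shown are hypothetical.
const text =
  [
    '{"kind":"start","session":1}',
    '{"kind":"step","session":1,"stepId":"fetch","result":42}',
    '{"kind":"complete","session":1}',
  ].join("\n") + "\n";

// One JSON document per line; blank lines (such as the trailing one) are skipped.
function parseJournal(ndjson: string): Array<Record<string, unknown>> {
  return ndjson
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as Record<string, unknown>);
}

const entries = parseJournal(text);
const kinds = entries.map((e) => String(e["kind"])); // entry kinds in append order
```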
7. The single-writer invariant
Only the newest session for a run may append to its journal.
Each new session gets a higher session number than any earlier session, and every entry appended during that session includes that number. Before appending, the storage backend checks whether the journal already contains a start entry with a higher session number. If it does, the append is rejected with a FencedError, so zombie sessions cannot write.
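The fencing check itself is small. The sketch below models it against an in-memory array; the Entry shape and the append helper are assumptions, and the FencedError class defined here is only a stand-in for the error named above.

```typescript
// Stand-in for kassette's FencedError; defined locally so the sketch runs on its own.
class FencedError extends Error {}

// Hypothetical entry shape.
type Entry = { kind: string; session: number };

function append(journal: Entry[], entry: Entry): void {
  // Reject the write if a newer session has already started.
  const superseded = journal.some((e) => e.kind === "start" && e.session > entry.session);
  if (superseded) {
    throw new FencedError(`session ${entry.session} has been superseded`);
  }
  journal.push(entry);
}

const journal: Entry[] = [];
append(journal, { kind: "start", session: 1 });
append(journal, { kind: "start", session: 2 }); // a newer session takes over

let fenced = false;
try {
  append(journal, { kind: "step", session: 1 }); // zombie write from session 1
} catch {
  fenced = true;
}
append(journal, { kind: "step", session: 2 }); // the current session can still write
```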
The local backend enforces this with a per-run lock file ({runId}.lock) so only one local process can write at a time. If the lock owner has died, the next process can reclaim the lock, but only after checking the journal for a newer session. This check prevents an old session from writing again after being superseded.
The remote object storage backend uses compare-and-swap (CAS) instead of a lock. Each run is one object. To append, kassette reads the object and its ETag, then writes back the full journal with If-Match: <etag>. If another writer got there first, kassette retries; if the retry sees a higher session number, the writer is stale and exits with FencedError. For the full CAS and session-number fencing design, see Object storage design.
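The read, check, and conditional-write loop can be sketched against an in-memory stand-in for an object store. FakeStore, putIfMatch, casAppend, and the retry limit are all hypothetical names chosen for this example, and the sketch checks for a newer session on every read rather than only on retry. The authoritative design is in Object storage design.

```typescript
// Hypothetical entry shape.
type Entry = { kind: string; session: number };

// In-memory stand-in for an object store that supports conditional writes.
class FakeStore {
  private body = "";
  private version = 0;

  get(): { body: string; etag: string } {
    return { body: this.body, etag: String(this.version) };
  }

  // Mimics a PUT with If-Match: succeeds only if the ETag is unchanged.
  putIfMatch(body: string, etag: string): boolean {
    if (etag !== String(this.version)) return false;
    this.body = body;
    this.version += 1;
    return true;
  }
}

// Stand-in for kassette's FencedError.
class FencedError extends Error {}

function casAppend(store: FakeStore, entry: Entry, maxAttempts = 5): void {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { body, etag } = store.get();
    const entries = body
      .split("\n")
      .filter((line) => line !== "")
      .map((line) => JSON.parse(line) as Entry);
    if (entries.some((e) => e.kind === "start" && e.session > entry.session)) {
      throw new FencedError("a newer session owns this run");
    }
    // Write back the full journal plus the new line, conditional on the ETag.
    if (store.putIfMatch(body + JSON.stringify(entry) + "\n", etag)) return;
    // Another writer got there first: loop to re-read and re-check.
  }
  throw new Error("too many CAS conflicts");
}

const store = new FakeStore();
casAppend(store, { kind: "start", session: 1 });
casAppend(store, { kind: "step", session: 1 });
casAppend(store, { kind: "start", session: 2 });

let stale = false;
try {
  casAppend(store, { kind: "step", session: 1 }); // session 1 is now stale
} catch {
  stale = true;
}
```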
8. Correctness boundaries
kassette guarantees at-most-once journaling, not at-most-once execution.
If a step performs an external action and the process crashes before the result is written to the journal, kassette has no record of that action. On the next session, replay will run the step again.
With remote storage, an old session may also finish work it already started before it learns that a newer session has taken over. Its next journal write will be rejected, but the external action may already have happened.
For irreversible external actions, use idempotency keys or another deduplication method. kassette makes retries safe only after the result is journaled.
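One way to apply this is to derive an idempotency key from identifiers that stay the same across sessions, so a re-run of the step sends the same key. The sketch below is illustrative only: PaymentService is a made-up stand-in for an external API that deduplicates on a key, and the runId plus step ID key format is an assumption.

```typescript
// Stand-in for an external service that deduplicates on an idempotency key.
class PaymentService {
  private seen = new Map<string, string>();
  charges = 0;

  charge(idempotencyKey: string, cents: number): string {
    const prior = this.seen.get(idempotencyKey);
    if (prior !== undefined) return prior; // duplicate request: no second charge
    this.charges += 1;
    const receipt = `receipt-${this.charges}-${cents}`;
    this.seen.set(idempotencyKey, receipt);
    return receipt;
  }
}

// Build the key from stable identifiers so every attempt at the step reuses it.
const idempotencyKey = (runId: string, stepId: string): string => `${runId}:${stepId}`;

const service = new PaymentService();
const key = idempotencyKey("task-abc", "charge-card");
const firstAttempt = service.charge(key, 500); // session crashes before the result is journaled
const secondAttempt = service.charge(key, 500); // the next session runs the step again
```

The key must not include anything that changes between sessions, such as the session number or a timestamp, or the deduplication is lost.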