Storage backends

kassette saves each run in a journal. It can write that journal either to a filesystem, using LocalStorage, or to an object store, using RemoteStorage. If neither fits, you can implement a custom backend against the Storage interface.

Picking a backend

Choose a storage backend based on where the run needs to resume, not on performance. In kassette’s agentic workload (10-100 sequential turns dominated by LLMs, tools, humans, webhooks, or timeouts, with small storage needs), neither storage throughput nor latency will be your bottleneck. For more on what kassette was designed for, see Why kassette.

| Scenario | | Backend |
| --- | --- | --- |
| Same machine recovers the run that started it? | Yes | LocalStorage |
| Different machine, region, or runtime needs to resume? | Yes | RemoteStorage (S3, R2, GCS) |
| Running on serverless (Lambda, Workers, Cloud Run jobs)? | Yes | RemoteStorage |
| Running on persistent infra (long-lived boxes, CI runners with a persistent volume)? | Either | LocalStorage preferred |

For large entries, hundreds of steps, or journals that grow into multiple gigabytes, prefer LocalStorage when possible.
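The decision table above can be written down as a small function. This is only a sketch: `pickBackend` and the `Deployment` shape are hypothetical names for illustration and are not part of kassette.

```typescript
// Hypothetical helper, not part of kassette: encodes the decision table above.
type BackendChoice = "LocalStorage" | "RemoteStorage";

interface Deployment {
  // True if the process that resumes a run can read the directory the run was written to.
  resumesOnSameFilesystem: boolean;
  // True on serverless platforms (Lambda, Workers, Cloud Run jobs).
  serverless: boolean;
}

function pickBackend(d: Deployment): BackendChoice {
  if (d.serverless) return "RemoteStorage";
  if (!d.resumesOnSameFilesystem) return "RemoteStorage";
  // Persistent infra where the same filesystem is visible on resume.
  return "LocalStorage";
}
```

For example, a CI runner with a persistent volume would pass `{ resumesOnSameFilesystem: true, serverless: false }` and get `"LocalStorage"`.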

LocalStorage

new LocalStorage(dir: string)

Writes a JSONL file per run to a directory on disk: {dir}/{runId}.jsonl.

Atomic appends. Each entry is serialized into one buffer, written with a single write(), and then flushed with fsync(), so an entry is never torn across writes.
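A minimal sketch of that append pattern using Node's `fs` module. It is not kassette's actual implementation: `appendEntry` and the entry shapes are made up for illustration.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Append one journal entry as a single JSONL line.
function appendEntry(file: string, entry: unknown): void {
  // Serialize the whole line into one buffer so it goes out in one write call.
  const line = Buffer.from(JSON.stringify(entry) + "\n", "utf8");
  const fd = fs.openSync(file, "a"); // "a" opens in append mode, creating the file if needed
  try {
    const written = fs.writeSync(fd, line);
    if (written !== line.length) {
      throw new Error(`short write: ${written} of ${line.length} bytes`);
    }
    fs.fsyncSync(fd); // flush to disk before reporting success
  } finally {
    fs.closeSync(fd);
  }
}

// Usage with made-up entry shapes.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "journal-"));
const file = path.join(dir, "run-1.jsonl");
appendEntry(file, { type: "start", session: 1 });
appendEntry(file, { type: "step", name: "fetch" });
```

Because only the new line is written, the cost of each call does not depend on how long the file already is.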

Single-writer fencing. One lockfile per runId ({dir}/{runId}.lock), created atomically with link(2), spans the whole session. PID-based stale detection lets the next process reclaim a lock from a dead writer. The start entry itself is fenced before the lock span begins, so a process reclaiming a stale lock cannot open a session at a number that has already been superseded. The lock is held from start through the terminal entry so that no other process can write to the journal during a session.
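The lockfile idea can be sketched with Node's `fs.linkSync` and a signal-0 liveness probe. This is an illustration under simplifying assumptions, not kassette's code: the function names are hypothetical, the lockfile is assumed to contain only the holder's PID, and the race between two processes reclaiming the same stale lock is not handled here.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Returns true if a process with this PID exists on the local machine.
function isAlive(pid: number): boolean {
  try {
    process.kill(pid, 0); // signal 0: existence check only, nothing is delivered
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// link(2) fails with EEXIST if the lock path already exists,
// so only one contender can create it.
function tryAcquire(lockPath: string): boolean {
  const tmp = `${lockPath}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, String(process.pid));
  try {
    fs.linkSync(tmp, lockPath);
    return true;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code !== "EEXIST") throw err;
    return false;
  } finally {
    fs.unlinkSync(tmp); // the lock path keeps the content via its own hard link
  }
}

function acquire(lockPath: string): boolean {
  if (tryAcquire(lockPath)) return true;
  const holder = Number(fs.readFileSync(lockPath, "utf8"));
  if (isAlive(holder)) return false; // a live writer holds the lock
  fs.unlinkSync(lockPath); // stale lock from a dead writer: reclaim, retry once
  return tryAcquire(lockPath);
}

// Usage in a scratch directory.
const lockDir = fs.mkdtempSync(path.join(os.tmpdir(), "locks-"));
const lockPath = path.join(lockDir, "run-1.lock");
const first = acquire(lockPath); // no lock existed yet
const second = acquire(lockPath); // the holder (this process) is still alive
```

The PID probe only says something about processes on the machine that runs it, which is why this scheme does not extend to writers on other hosts.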

Scope. Use LocalStorage when the next process can read the same directory and share the same lock semantics. The important bit is that the lock uses local PID checks, which only make sense on the machine that holds the lock.

Performance. Each append is O(1): it writes only the new entry and fsyncs it. Append cost is set by your disk and filesystem and is largely independent of journal size, so LocalStorage can handle very large journals. Use this backend whenever the retrying process can read the same persistent filesystem.

RemoteStorage

new RemoteStorage(client: ObjectStoreClient, options?: { prefix?: string })

Writes each run as a single object in S3-compatible storage: {prefix}/{runId}.jsonl. Use with @usekassette/s3 for AWS S3, Cloudflare R2, and any other service that is S3-compatible.

Atomic appends. Each write uploads the full journal as one object, and PutObject is atomic by contract, so a reader sees either the previous journal or the new one, never a partial write.

Single-writer fencing. Conditional PutObject with If-Match: <etag> on the previous object’s etag. A writer that loses the CAS race will re-read the journal and retry. If the re-read shows a higher session number, the writer throws FencedError. Two sessions can briefly coexist, but the older is fenced on its next append. At most one orphaned step may execute before the fence. For the full CAS and fencing design, see Object storage design.
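The compare-and-swap loop can be sketched against an in-memory stand-in for an object store. Everything here is an assumption made for illustration: `FakeObjectStore`, `appendWithCas`, and the `SessionEntry` shape are hypothetical, and the real design is described in Object storage design. The sketch is single-threaded, so the retry branch is shown but a genuine race is not exercised.

```typescript
// In-memory stand-in for an object store with conditional writes.
// Every name here is hypothetical; this is not the kassette or S3 client API.
interface StoredObject { body: string; etag: string }

class FakeObjectStore {
  private objects = new Map<string, StoredObject>();
  private version = 0;

  get(key: string): StoredObject | undefined {
    return this.objects.get(key);
  }

  // Conditional put. Pass the etag you read (If-Match), or undefined to
  // create the object only if it does not exist yet (If-None-Match: *).
  // Returns the new etag, or null if the precondition failed.
  put(key: string, body: string, expectedEtag: string | undefined): string | null {
    const current = this.objects.get(key);
    if (current?.etag !== expectedEtag) return null;
    const etag = `v${++this.version}`;
    this.objects.set(key, { body, etag });
    return etag;
  }
}

class FencedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "FencedError";
  }
}

interface SessionEntry { session: number; data: string }

function appendWithCas(store: FakeObjectStore, key: string, entry: SessionEntry): void {
  for (;;) {
    const current = store.get(key);
    const lines = current ? current.body.split("\n").filter(Boolean) : [];
    const newest = lines.reduce(
      (max, line) => Math.max(max, (JSON.parse(line) as SessionEntry).session), 0);
    if (newest > entry.session) {
      throw new FencedError(`session ${entry.session} superseded by ${newest}`);
    }
    // Upload the whole journal plus the new line, conditional on the etag we read.
    const body = lines.concat(JSON.stringify(entry)).join("\n") + "\n";
    if (store.put(key, body, current?.etag) !== null) return;
    // Lost the CAS race: another writer committed first. Re-read and retry.
  }
}

const store = new FakeObjectStore();
appendWithCas(store, "runs/run-1.jsonl", { session: 1, data: "start" });
appendWithCas(store, "runs/run-1.jsonl", { session: 2, data: "start" }); // newer session takes over
```

After the second call, any further append from session 1 re-reads the journal, sees session 2, and throws `FencedError`.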

Required backend semantics. The store must provide read-after-write consistency and conditional writes (If-Match and If-None-Match). S3, R2, and GCS qualify. Eventually-consistent or CDN-fronted stores break fencing: two writers can pass the same conditional check against a stale etag and both commit, defeating the single-writer invariant.

Performance. Each append reads the current object and uploads the whole journal plus the new entry line. Recovery is cheap, since readAll is a single GetObject request, but write cost grows with the journal: for a run of N entries, total uploaded bytes are O(N²). At the 10-100 turns typical of agentic workflows, this write amplification is generally not a problem.
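To put rough numbers on the quadratic growth, the helper below sums the upload sizes under the simplifying assumption that every entry has the same size. `totalUploadedBytes` is a hypothetical name, and the 2 KiB entry size used afterwards is only an example figure.

```typescript
// Total bytes uploaded when a journal of `entries` equally sized entries is
// re-uploaded in full on every append:
//   entryBytes * (1 + 2 + ... + N) = entryBytes * N * (N + 1) / 2
function totalUploadedBytes(entries: number, entryBytes: number): number {
  return (entryBytes * entries * (entries + 1)) / 2;
}
```

With 2 KiB entries, a 100-entry run uploads 10,342,400 bytes in total (about 10 MB), while a 10,000-entry run would upload about 102 GB. That gap is why very long or very large journals are better served by LocalStorage.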

Custom backends

interface Storage {
  append(runId: string, entry: Entry): Promise<number>;
  readAll(runId: string): Promise<JournalEntry[]>;
  list(): Promise<string[]>;
}

The contract is very small if you need to support a remote backend that is not S3-compatible. For S3-compatible stores beyond what @usekassette/s3 ships, implement ObjectStoreClient and pass it to RemoteStorage instead. The interface signatures and full semantics live in Reference.

A correct implementation of append needs to be atomic, assign a monotonically increasing offset, and reject writes from superseded sessions by throwing a FencedError. readAll should return every entry in append order. list returns every runId present in storage.
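As an illustration of that contract, here is a minimal in-memory backend. The `Entry`, `JournalEntry`, and `FencedError` definitions are assumed shapes invented for this sketch (in particular the `session` and `offset` fields); the real types are documented in Reference.

```typescript
// Assumed shapes for illustration; the real Entry, JournalEntry, and FencedError
// are defined by kassette and documented in Reference.
interface Entry { session: number; payload: unknown }
interface JournalEntry extends Entry { offset: number }

class FencedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "FencedError";
  }
}

interface Storage {
  append(runId: string, entry: Entry): Promise<number>;
  readAll(runId: string): Promise<JournalEntry[]>;
  list(): Promise<string[]>;
}

// `export` makes this file a module, so the local Storage interface does not
// merge with the DOM's global Storage type.
export class MemoryStorage implements Storage {
  private runs = new Map<string, JournalEntry[]>();

  async append(runId: string, entry: Entry): Promise<number> {
    // No await between the read and the write, so the append is atomic
    // within a single JavaScript process.
    const journal = this.runs.get(runId) ?? [];
    const newest = journal.reduce((max, e) => Math.max(max, e.session), 0);
    if (entry.session < newest) {
      throw new FencedError(`session ${entry.session} superseded by ${newest}`);
    }
    const offset = journal.length; // increases by one per append
    journal.push({ ...entry, offset });
    this.runs.set(runId, journal);
    return offset;
  }

  async readAll(runId: string): Promise<JournalEntry[]> {
    return (this.runs.get(runId) ?? []).slice(); // copy, in append order
  }

  async list(): Promise<string[]> {
    return Array.from(this.runs.keys());
  }
}
```

A real backend has to provide the same three guarantees across processes, which is where locks or conditional writes come in.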

Retention

Neither storage backend does any cleanup. Settled work stays settled, so storage always grows across runs. See Operations for cleanup mechanics, e.g., S3 lifecycle policies for RemoteStorage or a cron job that sweeps old journals for LocalStorage.