Embeddable durability primitives

Retry agent workflows without repeating work.

kassette is a tiny, zero-dependency TypeScript library that makes agentic workflows durable. It journals completed steps, then replays them on retry after a crash, timeout, or redeploy.

What it does

It makes retries safe.

  • Replay finished steps

    On retry, kassette replays the journal and returns recorded results instead of calling the model or tool again.

  • Wait without running

    Suspend for a human, webhook, or CI job. The process exits, then replay continues when the event arrives.

  • Use your existing retry system

    kassette is a library, not a runtime. Your queue, job runner, or webhook handler re-invokes the workflow with the same runId, and kassette continues from the journal.

  • Keep state in a plain journal

    Each run is a readable JSONL journal on a filesystem or object store. It is the state, audit trail, and resume point.
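
The replay and suspend behavior described above can be illustrated with a toy model. This is a minimal sketch of the general journal-replay technique, not kassette's implementation: `ToyRun`, `ToyEntry`, and the in-memory array standing in for the journal are all hypothetical, and the steps are synchronous for brevity.

```typescript
// Toy model only: names and structure are assumptions, not kassette's API.
type ToyEntry =
  | { type: 'step'; stepId: string; result: unknown }
  | { type: 'resume'; eventName: string; value: unknown };

class ToyRun {
  private journal: ToyEntry[];

  constructor(journal: ToyEntry[]) {
    this.journal = journal;
  }

  // Return the recorded result if the step already finished;
  // otherwise run it once and append the result to the journal.
  step<T>(stepId: string, fn: () => T): T {
    for (const e of this.journal) {
      if (e.type === 'step' && e.stepId === stepId) return e.result as T;
    }
    const result = fn();
    this.journal.push({ type: 'step', stepId, result });
    return result;
  }

  // Return the event value if it has arrived; otherwise stop this attempt.
  suspend<T>(eventName: string): T {
    for (const e of this.journal) {
      if (e.type === 'resume' && e.eventName === eventName) return e.value as T;
    }
    throw new Error('suspended: ' + eventName);
  }
}

let modelCalls = 0; // counts how often the expensive step really runs

function workflow(run: ToyRun): { outcome: string } {
  const analysis = run.step('analyze', () => {
    modelCalls += 1;
    return { destructive: true };
  });
  if (analysis.destructive) {
    const approval = run.suspend<{ approved: boolean }>('human-approval');
    if (!approval.approved) return { outcome: 'skipped' };
  }
  run.step('apply-fix', () => ({ ok: true }));
  return { outcome: 'resolved' };
}

// Attempt 1: runs 'analyze', then stops at the suspend point.
const journal: ToyEntry[] = [];
let suspended = false;
try {
  workflow(new ToyRun(journal));
} catch (err) {
  suspended = err instanceof Error && err.message.indexOf('suspended:') === 0;
}

// The approval event arrives and is appended to the journal.
journal.push({ type: 'resume', eventName: 'human-approval', value: { approved: true } });

// Attempt 2: 'analyze' is replayed from the journal, not executed again.
const retried = workflow(new ToyRun(journal));
```

In this toy run the second attempt reuses the recorded `analyze` result, so the simulated model call happens only once across both attempts.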

Example

Normal async code, durable steps.

import { kassette, LocalStorage } from '@usekassette/kassette';

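// `llm`, `executeTool`, and `ticket` are placeholders for your own model client, tool runner, and input.
const agent = kassette(async (ctx, ticket) => {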
  const analysis = await ctx.step('analyze', () =>
    llm.chat('Diagnose this issue and recommend a fix', { ticket })
  );

  if (analysis.destructive) {
    // process can exit here; resume from anywhere via agent.resume()
    const approval = await ctx.suspend('human-approval');
    if (!approval.approved) return { outcome: 'skipped', reason: approval.notes };
  }

  const result = await ctx.step('apply-fix', () =>
    executeTool(analysis.suggestedAction)
  );
  return { outcome: 'resolved', result };
}, { storage: new LocalStorage('.kassette') });

await agent.start(ticket);

Journal file: .kassette/INC-4821.jsonl
{"type":"start","session":1,"offset":0,"timestamp":"2026-05-08T14:21:03Z","metadata":{"ticket":{"id":"INC-4821","title":"Pod crashing on startup"}}}
{"type":"step","session":1,"offset":1,"timestamp":"2026-05-08T14:21:08Z","stepId":"analyze","name":"analyze","result":{"destructive":true,"suggestedAction":"restart-pod-7f3c","rationale":"OOM during init; restart releases stuck handle"}}
{"type":"suspend","session":1,"offset":2,"timestamp":"2026-05-08T14:21:08Z","reason":"Waiting for event: human-approval","waitingFor":"human-approval"}
{"type":"resume","session":2,"offset":3,"timestamp":"2026-05-08T14:47:12Z","eventName":"human-approval","value":{"approved":true,"notes":""}}
{"type":"step","session":2,"offset":4,"timestamp":"2026-05-08T14:47:14Z","stepId":"apply-fix","name":"apply-fix","result":{"ok":true,"podId":"7f3c"}}
{"type":"complete","session":2,"offset":5,"timestamp":"2026-05-08T14:47:14Z"}
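
Because the journal is plain JSONL, it can be inspected with a few lines of ordinary code. The sketch below is one hedged way to do that: the field names (`type`, `session`, `offset`, `stepId`, `result`, `waitingFor`) are taken from the sample journal above, while the `summarize` helper, the `JournalEntry` type, and the abbreviated `sample` input are assumptions made for illustration.

```typescript
// Hypothetical helper: field names follow the sample journal, the rest is assumed.
type JournalEntry = {
  type: string;
  session: number;
  offset: number;
  timestamp: string;
  stepId?: string;
  result?: unknown;
  waitingFor?: string;
};

function summarize(jsonl: string) {
  // One JSON object per non-empty line.
  const entries: JournalEntry[] = jsonl
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line) as JournalEntry);

  // Collect the recorded result of every completed step, keyed by stepId.
  const steps: Record<string, unknown> = {};
  for (const e of entries) {
    if (e.type === 'step' && e.stepId !== undefined) steps[e.stepId] = e.result;
  }

  // The last entry shows where the run currently stands.
  const last = entries.length > 0 ? entries[entries.length - 1] : undefined;
  return {
    steps,
    lastType: last ? last.type : 'empty',
    waitingFor: last ? last.waitingFor : undefined,
  };
}

// Abbreviated from the first three entries of the journal shown above.
const sample = [
  '{"type":"start","session":1,"offset":0,"timestamp":"2026-05-08T14:21:03Z"}',
  '{"type":"step","session":1,"offset":1,"timestamp":"2026-05-08T14:21:08Z","stepId":"analyze","result":{"destructive":true}}',
  '{"type":"suspend","session":1,"offset":2,"timestamp":"2026-05-08T14:21:08Z","waitingFor":"human-approval"}',
].join('\n');

const summary = summarize(sample);
console.log(summary.lastType, summary.waitingFor, Object.keys(summary.steps));
```

For a real run you would read the file contents (for example with Node's `fs.readFileSync`) and pass the text to the same function.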

Use it when

Skip the work you've already done.

Reach for kassette when the problem is not 'how do I run this again?' but 'how do I avoid doing the same work twice?' Your existing stack makes retries easy but not safe.