Why does your workflow call the same agent twice with the same inputs? Think about it. If a step takes the same ticket ID, the same description, and the same model version — and the model is deterministic enough for your purposes — the second call is pure waste. You already have the answer. You paid for it once. Caching in Smithers lets you say: “I’ve seen this before. Use the old result.” But here’s the catch. Caching is only safe when you know exactly what “this” means. That’s why Smithers caching is:
  • per-step — not a global switch on the whole workflow
  • explicit — you declare what matters in the cache key
  • derived from declared inputs and dependencies — no hidden state
  • validated against the current output model before reuse — stale shapes are rejected
Cache behavior should never depend on hidden renderer state or ad hoc prompt hashing.

Step-Level Caching

Caching belongs on the step that produces reusable work. Why per-step and not per-workflow? Because different steps have different purity profiles. A summarization step might be safely cacheable. A deployment step never is. Putting the cache declaration on the step forces you to make that judgment where it matters. For example, declared on an analyze step:
cache: {
  by: ({ input }) => ({
    ticketId: input.ticketId,
    description: input.description,
  }),
  version: "analysis-v1",
}
This says:
  • cache the analyze step's output
  • key it from the declared by function
  • bump version to invalidate old entries when the algorithm or provider changes
That is a better fit than a global workflow-wide cache switch.

Default Mental Model

Here is the one sentence you should internalize:
A step is cacheable when it behaves like a pure function of its declared inputs.
If a step depends on:
  • input
  • needs
  • stable service behavior
then Smithers can safely reuse a previous successful output for the same key. If a step has hidden side effects or reads mutable external state, it should not be cached by default. “But wait,” you might think, “LLM calls aren’t truly pure functions.” Right. They’re not. But for many use cases — summarization, classification, structured extraction — the outputs are stable enough that re-running them is waste, not value. You’re caching the work, not asserting mathematical purity.
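To make "behaves like a pure function" concrete, here is a hypothetical step body (the `classifyTicket` name and its logic are illustrative, not Smithers API): same declared inputs in, same output out, so replaying a cached result is indistinguishable from re-running it.

```typescript
// Hypothetical classifier step body. It depends only on its declared
// inputs, so for a given (ticketId, description) pair the output never
// changes — exactly the profile that makes a step safe to cache.
function classifyTicket(input: { ticketId: string; description: string }) {
  const severity = /crash|data loss/i.test(input.description)
    ? "high"
    : "normal";
  return { ticketId: input.ticketId, severity };
}
```

An LLM-backed step won't be this deterministic, but if its outputs are stable enough for your purposes, the same reasoning applies.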

What Goes Into the Cache Key

The default cache key should be derived from durable workflow structure plus the explicit cache.by payload. Good key components:
  • workflow name
  • step id
  • output model identity
  • cache.version
  • serialized value returned by cache.by
For example:
cache: {
  by: ({ input, analysis }) => ({
    repo: input.repo,
    summary: analysis.summary,
  }),
  version: "report-v2",
}
Notice what’s not in there: service instances, runtime objects, opaque graphs. Those aren’t serializable, aren’t stable, and aren’t yours to hash. You pick the data that determines the output. Nothing more. This is better than attempting to hash arbitrary service graphs or opaque runtime objects.
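One way to picture how those components combine is a sketch like the following. This is not Smithers' real key derivation — the helper names and hashing choice are assumptions — but it shows the essential properties: every durable component participates, and serialization is order-independent so the same cache.by payload always yields the same key.

```typescript
// Illustrative sketch of deterministic cache-key derivation.
// All names here are hypothetical, not the framework's real internals.
import { createHash } from "node:crypto";

interface KeyComponents {
  workflowName: string;
  stepId: string;
  outputModelId: string; // identity of the current output model
  cacheVersion: string;  // the user-declared cache.version
  byPayload: unknown;    // the value returned by cache.by
}

// JSON.stringify with sorted object keys, so the same payload always
// serializes to the same bytes regardless of property order.
function stableStringify(value: unknown): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`;
  const obj = value as Record<string, unknown>;
  const entries = Object.keys(obj)
    .sort()
    .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
  return `{${entries.join(",")}}`;
}

function cacheKey(c: KeyComponents): string {
  const material = [
    c.workflowName,
    c.stepId,
    c.outputModelId,
    c.cacheVersion,
    stableStringify(c.byPayload),
  ].join("\u0000"); // separator that cannot appear in the components
  return createHash("sha256").update(material).digest("hex");
}
```

Note what this sketch refuses to do: it never reaches for a service instance or a runtime object, because only the listed components go into the hash.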

Why Explicit Keys Matter

Imagine debugging a production workflow where a step returned a stale result. With magic cache keys, you’d be spelunking through framework internals trying to figure out what the cache thought was “the same.” With explicit keys, you open the step definition and read it. Explicit keys make cache behavior reviewable. You can answer:
  • what exact data invalidates this step?
  • did the provider/model change?
  • does this step depend on hidden filesystem or network state?
  • should this cache survive a workflow refactor?
Magic cache keys tend to break trust because users cannot predict when a step will reuse old work.

Cache Validation

Here’s a subtle problem. You cached a step’s output last week. Since then, you changed the output schema — added a field, tightened a type. The cached bytes still exist, keyed to the same inputs. Should Smithers blindly hand you the old shape? No. When Smithers finds a cache hit, it should still validate the cached payload against the current output model before reusing it. That protects against:
  • model shape changes
  • decoding changes
  • old invalid cache entries
If validation fails, Smithers should treat the entry as a miss and compute a fresh value. This is the safety net that makes caching practical in a system where output schemas evolve.
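A minimal sketch of that lookup, assuming a `decode` function that stands in for whatever decoder the current output model provides (names here are illustrative, not Smithers API):

```typescript
// Cache read that re-validates the stored payload against the *current*
// output model before reuse. A decoder throws on invalid input.
type Decoder<A> = (u: unknown) => A;

function readCache<A>(
  store: Map<string, unknown>,
  key: string,
  decode: Decoder<A>,
): { hit: true; value: A } | { hit: false } {
  if (!store.has(key)) return { hit: false };
  try {
    return { hit: true, value: decode(store.get(key)) };
  } catch {
    // A stale shape (e.g. a field added since the entry was written)
    // fails decoding and is treated as a miss, not an error.
    return { hit: false };
  }
}
```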

Example

<Task
  id="summarize"
  output={outputs.summary}
  agent={summarizer}
  deps={{ analysis: outputs.analysis }}
  cache={{
    by: ({ analysis }) => ({
      summary: analysis.summary,
      severity: analysis.severity,
    }),
    version: "summary-v1",
  }}
>
  {({ analysis }) => `Write a current-status summary from this analysis:\n\n${analysis.summary}`}
</Task>
If the same analysis appears again with the same cache version, Smithers can reuse the persisted summary row instead of calling the agent again. Read the by function. It tells you everything: this step’s output depends on the analysis summary and the severity. Change either one, and you get a fresh call. Change neither, and you get the cached result. No guessing.

What Should Not Be Cached

Ask yourself: “If I replayed the cached output instead of running this step, would anything be wrong?” For a summarization step, no — you’d get the same summary. For a deployment step, absolutely yes — you’d skip the actual deploy. Caching is a bad fit for steps that are primarily about side effects. Examples:
  • deploy to production
  • send email
  • mutate a Git branch
  • call an external system whose current state matters
  • open an approval request
Those should either disable caching entirely or use explicit idempotency semantics separate from normal output caching.

Caching and Effect Services

Effect services are not part of the cache key automatically. That is intentional. Service instances are often not serializable or stable. If service behavior affects the output, encode that in the cache version:
cache: {
  by: ({ input }) => ({ prompt: input.prompt }),
  version: "anthropic-sonnet-4-2025-02",
}
When you switch from Sonnet to Opus, bump the version string. The old cache entries become misses. This keeps invalidation under user control. You might wish Smithers could detect the model change automatically. But “which service details matter” is a judgment call only you can make. A logging service change doesn’t invalidate outputs. A model provider change does. Smithers can’t know that — so it asks you to say it.

Runtime Behavior

With caching enabled on a step, Smithers should:
  1. compute the cache key before executing the step
  2. look for a previously successful cached output
  3. validate the cached payload against the output model
  4. if valid, mark the step as completed from cache
  5. otherwise run the step normally and persist the fresh output
From the rest of the workflow’s perspective, a cache hit behaves the same as a completed upstream step. That last point matters. Downstream steps don’t know whether their dependency was computed fresh or pulled from cache. They see a completed step with a valid output. The abstraction is clean.
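The five steps above can be sketched as one function. Everything here is a stand-in for Smithers internals (the helper names are assumptions), and it is written synchronously for clarity:

```typescript
// Sketch of the cached-step lifecycle. The CacheOps helpers are
// hypothetical stand-ins for the framework's key derivation, metadata
// lookup, output-model validation, and persistence.
interface CacheOps<A> {
  computeKey: () => string;
  lookup: (key: string) => unknown | undefined;
  validate: (payload: unknown) => A | undefined; // undefined = invalid
  persist: (key: string, value: A) => void;
}

function runStep<A>(
  execute: () => A,
  cache: CacheOps<A>,
): { value: A; fromCache: boolean } {
  const key = cache.computeKey();         // 1. key computed before execution
  const cached = cache.lookup(key);       // 2. previously successful output?
  if (cached !== undefined) {
    const valid = cache.validate(cached); // 3. re-validate against the model
    if (valid !== undefined) {
      return { value: valid, fromCache: true }; // 4. completed from cache
    }
  }
  const fresh = execute();                // 5. run normally, persist fresh output
  cache.persist(key, fresh);
  return { value: fresh, fromCache: false };
}
```

Note the return shape: a hit and a miss produce the same `value`, which is why downstream steps cannot tell the difference.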

Resume vs Cache

Resuming and caching are related but distinct.

Resume

Resume reuses outputs from the same execution id. Your workflow crashed at step 4. You restart it. Steps 1 through 3 already succeeded in this run, so Smithers skips them. That’s resume — replaying within a single execution.

Cache

Cache reuses outputs across different executions when the declared cache key matches. You run a new workflow with the same ticket. The analysis step sees a cache hit from last Tuesday’s run. That’s caching — reuse across executions. Resume is about durability. Cache is about avoiding recomputation.

Storage

Cached outputs should live in Smithers-managed metadata, keyed by:
  • workflow id
  • step id
  • cache key
  • cache version
This metadata belongs to the framework, not to user domain models.
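As a sketch, a cache row might look like this — the field names are illustrative, not the framework's real schema — with all four identifying fields combining into the lookup key:

```typescript
// Hypothetical shape of a Smithers-managed cache row.
interface CacheRow {
  workflowId: string;
  stepId: string;
  cacheKey: string;     // derived from cache.by and durable structure
  cacheVersion: string; // the user-declared cache.version
  payload: unknown;     // persisted output, re-validated on read
}

// Composite lookup key: all four identifying fields must match,
// so a version bump alone is enough to miss old rows.
function rowLookupKey(r: Omit<CacheRow, "payload">): string {
  return [r.workflowId, r.stepId, r.cacheKey, r.cacheVersion].join("\u0000");
}
```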

Suggested Rule

Only cache a step if you would be comfortable describing it as:
“Given these explicit inputs, I want the same persisted output back.”
If that sentence feels false, the step probably should not use caching.

Next Steps

  • Planner Internals — See how the workflow graph is planned and scheduled internally.
  • Execution Model — See where cache lookup happens in the durable step lifecycle.
  • Runtime Events — The new design will need cache-hit and cache-miss events; this page should document them.