Why does your workflow call the same agent twice with the same inputs? Think about it. If a step takes the same ticket ID, the same description, and the same model version — and the model is deterministic enough for your purposes — the second call is pure waste. You already have the answer. You paid for it once. Caching in Smithers lets you say: “I’ve seen this before. Use the old result.” But here’s the catch. Caching is only safe when you know exactly what “this” means. That’s why Smithers caching is:
  • per-step — not a global switch on the whole workflow
  • explicit — you declare what matters in the cache key
  • derived from declared inputs and dependencies — no hidden state
  • validated against the current output model before reuse — stale shapes are rejected
Cache behavior should never depend on hidden renderer state or ad hoc prompt hashing.

Step-Level Caching

Caching belongs on the step that produces reusable work. Why per-step and not per-workflow? Because different steps have different purity profiles. A summarization step might be safely cacheable. A deployment step never is. Putting the cache declaration on the step forces you to make that judgment where it matters. For example, declared on an analyze step:
cache: {
  by: ({ input }) => ({
    ticketId: input.ticketId,
    description: input.description,
  }),
  version: "analysis-v1",
}
This says:
  • cache the analyze step's output
  • key it from the declared by function
  • bump version to invalidate old entries when the algorithm or provider changes
That is a better fit than a global workflow-wide cache switch.

Default Mental Model

Here is the one sentence you should internalize:
A step is cacheable when it behaves like a pure function of its declared inputs.
If a step depends on:
  • input
  • needs
  • stable service behavior
then Smithers can safely reuse a previous successful output for the same key. If a step has hidden side effects or reads mutable external state, it should not be cached by default. “But wait,” you might think, “LLM calls aren’t truly pure functions.” Right. They’re not. But for many use cases — summarization, classification, structured extraction — the outputs are stable enough that re-running them is waste, not value. You’re caching the work, not asserting mathematical purity.
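To make "behaves like a pure function" concrete, here is a hypothetical step body (the `classifyTicket` name and its logic are illustrative, not Smithers API): same declared inputs in, same output out, so replaying a cached result is indistinguishable from re-running it.

```typescript
// Hypothetical classifier step body. It depends only on its declared
// inputs, so for a given (ticketId, description) pair the output never
// changes — exactly the profile that makes a step safe to cache.
function classifyTicket(input: { ticketId: string; description: string }) {
  const severity = /crash|data loss/i.test(input.description)
    ? "high"
    : "normal";
  return { ticketId: input.ticketId, severity };
}
```

An LLM-backed step won't be this deterministic, but if its outputs are stable enough for your purposes, the same reasoning applies.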

What Goes Into the Cache Key

The default cache key should be derived from durable workflow structure plus the explicit cache.by payload. Good key components:
  • workflow name
  • step id
  • output model identity
  • cache.version
  • serialized value returned by cache.by
For example:
cache: {
  by: ({ input, analysis }) => ({
    repo: input.repo,
    summary: analysis.summary,
  }),
  version: "report-v2",
}
Notice what’s not in there: service instances, runtime objects, opaque graphs. Those aren’t serializable, aren’t stable, and aren’t yours to hash. You pick the data that determines the output. Nothing more. This is better than attempting to hash arbitrary service graphs or opaque runtime objects.
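One way to picture how those components combine is a sketch like the following. This is not Smithers' real key derivation — the helper names and hashing choice are assumptions — but it shows the essential properties: every durable component participates, and serialization is order-independent so the same cache.by payload always yields the same key.

```typescript
// Illustrative sketch of deterministic cache-key derivation.
// All names here are hypothetical, not the framework's real internals.
import { createHash } from "node:crypto";

interface KeyComponents {
  workflowName: string;
  stepId: string;
  outputModelId: string; // identity of the current output model
  cacheVersion: string;  // the user-declared cache.version
  byPayload: unknown;    // the value returned by cache.by
}

// JSON.stringify with sorted object keys, so the same payload always
// serializes to the same bytes regardless of property order.
function stableStringify(value: unknown): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`;
  const obj = value as Record<string, unknown>;
  const entries = Object.keys(obj)
    .sort()
    .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
  return `{${entries.join(",")}}`;
}

function cacheKey(c: KeyComponents): string {
  const material = [
    c.workflowName,
    c.stepId,
    c.outputModelId,
    c.cacheVersion,
    stableStringify(c.byPayload),
  ].join("\u0000"); // separator that cannot appear in the components
  return createHash("sha256").update(material).digest("hex");
}
```

Note what this sketch refuses to do: it never reaches for a service instance or a runtime object, because only the listed components go into the hash.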

Why Explicit Keys Matter

Imagine debugging a production workflow where a step returned a stale result. With magic cache keys, you’d be spelunking through framework internals trying to figure out what the cache thought was “the same.” With explicit keys, you open the step definition and read it. Explicit keys make cache behavior reviewable. You can answer:
  • what exact data invalidates this step?
  • did the provider/model change?
  • does this step depend on hidden filesystem or network state?
  • should this cache survive a workflow refactor?
Magic cache keys tend to break trust because users cannot predict when a step will reuse old work.

Cache Validation

Here’s a subtle problem. You cached a step’s output last week. Since then, you changed the output schema — added a field, tightened a type. The cached bytes still exist, keyed to the same inputs. Should Smithers blindly hand you the old shape? No. When Smithers finds a cache hit, it should still validate the cached payload against the current output model before reusing it. That protects against:
  • model shape changes
  • decoding changes
  • old invalid cache entries
If validation fails, Smithers should treat the entry as a miss and compute a fresh value. This is the safety net that makes caching practical in a system where output schemas evolve.
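A minimal sketch of that lookup, assuming a `decode` function that stands in for whatever decoder the current output model provides (names here are illustrative, not Smithers API):

```typescript
// Cache read that re-validates the stored payload against the *current*
// output model before reuse. A decoder throws on invalid input.
type Decoder<A> = (u: unknown) => A;

function readCache<A>(
  store: Map<string, unknown>,
  key: string,
  decode: Decoder<A>,
): { hit: true; value: A } | { hit: false } {
  if (!store.has(key)) return { hit: false };
  try {
    return { hit: true, value: decode(store.get(key)) };
  } catch {
    // A stale shape (e.g. a field added since the entry was written)
    // fails decoding and is treated as a miss, not an error.
    return { hit: false };
  }
}
```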

Example

<Task
  id="summarize"
  output={outputs.summary}
  agent={summarizer}
  deps={{ analysis: outputs.analysis }}
  cache={{
    by: ({ analysis }) => ({
      summary: analysis.summary,
      severity: analysis.severity,
    }),
    version: "summary-v1",
  }}
>
  {({ analysis }) => `Write a current-status summary from this analysis:\n\n${analysis.summary}`}
</Task>
If the same analysis appears again with the same cache version, Smithers can reuse the persisted summary row instead of calling the agent again. Read the by function. It tells you everything: this step’s output depends on the analysis summary and the severity. Change either one, and you get a fresh call. Change neither, and you get the cached result. No guessing.

What Should Not Be Cached

Ask yourself: “If I replayed the cached output instead of running this step, would anything be wrong?” For a summarization step, no — you’d get the same summary. For a deployment step, absolutely yes — you’d skip the actual deploy. Caching is a bad fit for steps that are primarily about side effects. Examples:
  • deploy to production
  • send email
  • mutate a Git branch
  • call an external system whose current state matters
  • open an approval request
Those should either disable caching entirely or use explicit idempotency semantics separate from normal output caching.

Caching and Effect Services

Effect services are not part of the cache key automatically. That is intentional. Service instances are often not serializable or stable. If service behavior affects the output, encode that in the cache version:
cache: {
  by: ({ input }) => ({ prompt: input.prompt }),
  version: "anthropic-sonnet-4-2025-02",
}
When you switch from Sonnet to Opus, bump the version string. The old cache entries become misses. This keeps invalidation under user control. You might wish Smithers could detect the model change automatically. But “which service details matter” is a judgment call only you can make. A logging service change doesn’t invalidate outputs. A model provider change does. Smithers can’t know that — so it asks you to say it.

Runtime Behavior

With caching enabled on a step, Smithers should:
  1. compute the cache key before executing the step
  2. look for a previously successful cached output
  3. validate the cached payload against the output model
  4. if valid, mark the step as completed from cache
  5. otherwise run the step normally and persist the fresh output
From the rest of the workflow’s perspective, a cache hit behaves the same as a completed upstream step. That last point matters. Downstream steps don’t know whether their dependency was computed fresh or pulled from cache. They see a completed step with a valid output. The abstraction is clean.
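The five steps above can be sketched as one function. Everything here is a stand-in for Smithers internals (the helper names are assumptions), and it is written synchronously for clarity:

```typescript
// Sketch of the cached-step lifecycle. The CacheOps helpers are
// hypothetical stand-ins for the framework's key derivation, metadata
// lookup, output-model validation, and persistence.
interface CacheOps<A> {
  computeKey: () => string;
  lookup: (key: string) => unknown | undefined;
  validate: (payload: unknown) => A | undefined; // undefined = invalid
  persist: (key: string, value: A) => void;
}

function runStep<A>(
  execute: () => A,
  cache: CacheOps<A>,
): { value: A; fromCache: boolean } {
  const key = cache.computeKey();         // 1. key computed before execution
  const cached = cache.lookup(key);       // 2. previously successful output?
  if (cached !== undefined) {
    const valid = cache.validate(cached); // 3. re-validate against the model
    if (valid !== undefined) {
      return { value: valid, fromCache: true }; // 4. completed from cache
    }
  }
  const fresh = execute();                // 5. run normally, persist fresh output
  cache.persist(key, fresh);
  return { value: fresh, fromCache: false };
}
```

Note the return shape: a hit and a miss produce the same `value`, which is why downstream steps cannot tell the difference.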

Resume vs Cache

Resuming and caching are related but distinct.

Resume

Resume reuses outputs from the same execution id. Your workflow crashed at step 4. You restart it. Steps 1 through 3 already succeeded in this run, so Smithers skips them. That’s resume — replaying within a single execution.

Cache

Cache reuses outputs across different executions when the declared cache key matches. You run a new workflow with the same ticket. The analysis step sees a cache hit from last Tuesday’s run. That’s caching — reuse across executions. Resume is about durability. Cache is about avoiding recomputation.

Storage

Cached outputs should live in Smithers-managed metadata, keyed by:
  • workflow id
  • step id
  • cache key
  • cache version
This metadata belongs to the framework, not to user domain models.
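As a sketch, a cache row might look like this — the field names are illustrative, not the framework's real schema — with all four identifying fields combining into the lookup key:

```typescript
// Hypothetical shape of a Smithers-managed cache row.
interface CacheRow {
  workflowId: string;
  stepId: string;
  cacheKey: string;     // derived from cache.by and durable structure
  cacheVersion: string; // the user-declared cache.version
  payload: unknown;     // persisted output, re-validated on read
}

// Composite lookup key: all four identifying fields must match,
// so a version bump alone is enough to miss old rows.
function rowLookupKey(r: Omit<CacheRow, "payload">): string {
  return [r.workflowId, r.stepId, r.cacheKey, r.cacheVersion].join("\u0000");
}
```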

Suggested Rule

Only cache a step if you would be comfortable describing it as:
“Given these explicit inputs, I want the same persisted output back.”
If that sentence feels false, the step probably should not use caching.

Next Steps

  • Planner Internals — See how the workflow graph is planned and scheduled internally.
  • Execution Model — See where cache lookup happens in the durable step lifecycle.
  • Runtime Events — The new design will need cache-hit and cache-miss events; this page should document them.