← All features
AI Controls & Guardrails

Prompt caching

Live

Stable prompt prefixes are cached. Faster responses, lower bill, passed to you.

What you get
  • Stable-prefix caching against Anthropic's prompt cache
  • Cached vs uncached breakdown in usage log
  • Pass-through pricing — you see the saved cost
  • 5-minute cache TTL with automatic refresh
Overview

What it is.

We cache the stable prefix of our prompts (system prompts, Indian-law context, firm style) against Anthropic's prompt cache. Cache hits cost roughly 10% of a fresh call and respond in roughly half the time.

We pass the savings to you transparently — your usage logs show cached vs uncached tokens and the resulting cost.

How it works

Three steps.
End to end.

01
1. First call

Cache miss — full cost, full latency.

02
2. Subsequent calls

Cache hit on the stable prefix — ~10% cost, ~half latency.

03
3. See it in your usage log

Cached vs uncached tokens broken out per request, so you can verify the savings.

Capabilities

What you get.

  • Stable-prefix caching against Anthropic's prompt cache
  • Cached vs uncached breakdown in usage log
  • Pass-through pricing — you see the saved cost
  • 5-minute cache TTL with automatic refresh
FAQ

Quick answers.

Does my document content get cached?

No — only the stable prefix (system prompts, model instructions, firm style). Your document content is never cached.

Related

More in AI Controls & Guardrails.

No model training on your data
Live

We do not train models on customer content without explicit opt-in. Default is off.

Prompt injection defence
Live

User-pasted text is sanitised before reaching the LLM.

Per-org cost guardrails
Live

Daily and monthly cost ceilings per organisation. Soft warnings, hard cutoffs.

Token cap per request
Live

200K input / 8K output limits with graceful truncation and a clear notice.

Want to try Prompt caching?
Get started in 60 seconds.

Sign up →All features