Mini tool · cost control

LLM Cost Calculator

Enter a model, traffic and cache hit rate; get cost per request, per month and per year. See what prompt caching saves and whether a smaller model pays off.

Inputs

Model

Preset prices last checked June 2026 · standard API rates / 1M tokens

Input tokens / req

Output tokens / req

Requests / day

Requests / user / day

Input price / 1M

Output price / 1M

Cache hit rate (cached input ≈ 10% of input price)

Monthly budget

production · cost projection

Cost per request: -

Daily (≈ today): -

Monthly (× 30 days): -

Annual (× 365 days): -

Per active user / month: -

Per 1,000 users / month: -

Saved by caching (per month): -

Compare with a smaller model

Pick a model to compare savings.

all models · monthly at these settings

Model	Vendor	In / Out · /1M	Monthly

Preset prices are standard API rates per 1M tokens (input / output), checked in early June 2026 against provider pricing pages. They change often, so verify before you rely on a number. Caching assumes cached input reads cost ~10% of the input price (a 90% caching discount, typical for Anthropic; OpenAI cached reads are usually ~25%). Monthly figures use 30 days. Edit any price field to model batch rates, regional premiums or your own negotiated pricing. - LLMOps.si
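The arithmetic described in the note above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the names `Scenario`, `cost_per_request` and `project` are hypothetical, the prices in the example are made-up placeholders rather than any provider's rates, and the cache hit rate is assumed to be the fraction of input tokens read from cache.

```python
from dataclasses import dataclass, replace


@dataclass
class Scenario:
    """Calculator inputs (hypothetical field names)."""
    input_tokens: float          # input tokens per request
    output_tokens: float         # output tokens per request
    requests_per_day: float
    input_price: float           # currency units per 1M input tokens
    output_price: float          # currency units per 1M output tokens
    cache_hit_rate: float = 0.0  # assumed: fraction of input tokens read from cache
    cached_price_ratio: float = 0.10  # cached reads cost ~10% of the input price


def cost_per_request(s: Scenario) -> float:
    """Cost of one request, with cached input tokens billed at a reduced rate."""
    cached = s.input_tokens * s.cache_hit_rate
    fresh = s.input_tokens - cached
    billable_input = fresh + cached * s.cached_price_ratio
    input_cost = billable_input * s.input_price / 1_000_000
    output_cost = s.output_tokens * s.output_price / 1_000_000
    return input_cost + output_cost


def project(s: Scenario, requests_per_user_per_day: float) -> dict:
    """Daily, monthly (x 30) and annual (x 365) projections plus caching savings."""
    per_request = cost_per_request(s)
    daily = per_request * s.requests_per_day
    # Same traffic with caching switched off, to measure what caching saves.
    uncached = cost_per_request(replace(s, cache_hit_rate=0.0))
    per_user_month = per_request * requests_per_user_per_day * 30
    return {
        "per_request": per_request,
        "daily": daily,
        "monthly": daily * 30,
        "annual": daily * 365,
        "per_user_month": per_user_month,
        "per_1000_users_month": per_user_month * 1000,
        "saved_by_caching_month": (uncached - per_request) * s.requests_per_day * 30,
    }


# Placeholder numbers for illustration only, not real provider prices.
example = Scenario(
    input_tokens=2000,
    output_tokens=500,
    requests_per_day=10_000,
    input_price=3.00,
    output_price=15.00,
    cache_hit_rate=0.5,
)
result = project(example, requests_per_user_per_day=20)
print(f"per request: {result['per_request']:.4f}")          # 0.0108
print(f"monthly:     {result['monthly']:.2f}")              # 3240.00
print(f"saved/month: {result['saved_by_caching_month']:.2f}")  # 810.00
```

To model a provider with a different caching discount, change `cached_price_ratio` (for example 0.25 for cached reads billed at a quarter of the input price). Batch or negotiated rates only require different `input_price` and `output_price` values.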