Issue 01 · Jun 2026
Live · OTLP native

See every AI dollar your engineers spend.

The AI cost dashboard for engineering teams — real-time spend, ROI, and guardrails across Claude Code, Codex CLI, and Gemini CLI, so your CFO stops asking "what did the AI bill buy us?"

Start free · See Exhibit A
Reads telemetry from
Claude Code
Codex CLI
Gemini CLI
+ any OTLP-compatible tool
§ 01 · The Problem

Your AI coding bill is growing ten times faster than your visibility into it.

Finance pays invoices from three providers and sees a rounded number. Engineering sees a prompt window. Nobody sees the link between the two, so the bill grows and nobody can say whether it's worth it.

In most teams we audit, 20–40% of AI spend is avoidable — devs on Opus when Sonnet performs identically, prompts re-run five times instead of cached, idle subscriptions silently billing. Infercast surfaces every dollar of it.

"We didn't catch the leak until the invoice landed. Infercast would have caught it on day two."
— Staff eng, seed-stage SaaS
Fig. 1 · Same team, two months
Before
$12,400/mo

Aggregate bill from three provider dashboards. No per-developer or per-repo breakdown. No cost-per-PR signal.

  • × "Who is spending what?" — unknown
  • × Models over-selected (Opus when Sonnet suffices)
  • × Budget overruns caught after the invoice lands
  • × CFO asks "what did we get?" → silence
After 30 days
$8,930/mo

Same team, same output — 28% lower after Infercast flagged over-spec'd models, idle seats, and cache misses.

  • Spend per developer, per PR, per model — live
  • Routing suggestions with projected savings
  • Budget thresholds & anomaly alerts before the overrun
  • Cost-per-PR: a number the CFO actually understands
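
The 28% figure in Fig. 1 follows directly from the two monthly totals. A minimal sketch of that arithmetic, using only the numbers shown above (the function name is illustrative and not part of Infercast):

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# Monthly totals from Fig. 1: $12,400 before, $8,930 after 30 days.
reduction = percent_reduction(12_400, 8_930)
print(f"{reduction:.0f}% lower")  # prints "28% lower"
```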
§ 02 · The Index

Time-to-value, measured honestly.

First insight
5 min

One curl command configures Claude Code, Codex, and Gemini. First telemetry visible before your coffee cools.

Typical savings
≈ 28%

After 30 days, most teams cut spend by routing over-spec'd model calls and catching idle seats. No throttles, same output.

SDK changes
zero

No agents, no sidecars, no wrappers. We speak OTLP — the protocol your AI tools already emit natively.

§ 03 · Exhibit A

Your AI spend, one screen, always live.

Burn rate, cost-per-PR, top spenders, anomaly alerts, and an AI briefing that tells you what changed — all updated in real time.

acme.infercast.io / command-center · live ingest
Today's Burn · Hot pace
$847.32 projected $1,203 · 7d avg $891

Pace +15% vs 7-day avg — see engineering tab for the spike.

Devs 34/42 · Sessions 187 · Tokens 2.4M
Brief

Spend per PR fell to $4.82 (-11% WoW). Three devs on Opus get the same acceptance rate as Sonnet users; routing them to Sonnet would save ≈$142/mo. One idle seat on Enterprise tier.

Cost / PR: $4.82 (↓ 11% WoW) · 42 PRs merged (30d)
Cost / Commit: $1.23 · 148 commits this week
Active Devs: 34/42 (↑ 4.8% WoW)
Tokens Today: 2.4M · 340K in · 2.1M out
Cost by Tool · 7d (chart: Claude, Codex, Gemini · y-axis $200 to $600 · x-axis Mon to Sun)
Spending Leaderboard · 30d
  1. Sarah Chen $142.30
  2. Marcus Rivera $118.45
  3. Jamie Okonkwo $97.18
  4. Aisha Patel $84.60
  5. Diego Morales $71.22
Fig. 2 · Command Center, production view · 34 devs active
§ 04 · Product Tour

Five surfaces. One source of truth.

Every number, list, and chart updates in real time as your team ships.

Fig. 03 Command Center

Burn rate, pace, and what changed — in a glance.

Hero number is today's burn, color-coded Hot / On-pace / Calm. An AI briefing explains the delta in plain English. Spending leaderboard ranks who's consuming what.

  • Gradient burn hero with pace rail
  • AI-written briefing — why did spend shift?
  • Cost-per-PR, per-commit, per-session
Today's Burn · Hot · acme.infercast.io
$847 vs $730 pace
Devs 34 · Sessions 187 · Tokens 2.4M
Live Ops · 18 cities · 23:14:08 UTC
Active 34 · Tokens/s 4.2K · Req/min 128 · Cities 18
Top hotspots: San Francisco 92 · New York 74 · London 61 · Tokyo 48
Tool mix (chart)
Fig. 04 Live Map

Where the work actually happens.

A rotating globe with dev activity arcs, tool-color pulses, and hotspot bars. Hover a city to see active devs and spend — great for founders showing investors global reach.

  • Arc density scales with activity
  • Live prompt stream on the side
  • Drop on a TV for a team pulse display
Fig. 05 Cost Intelligence

Period-over-period spend, without a spreadsheet.

Compare week-to-week, month-to-month, or quarter-to-quarter. Unit economics on the top row — cost per PR, per commit, per 1K tokens. Anomaly days highlighted automatically.

  • Weekly / monthly / quarterly comparison
  • Dev efficiency scatter matrix
  • CFO-ready CSV export (sanitized)
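
The unit economics on the top row are simple ratios of period spend to period output. A minimal sketch, assuming a hypothetical `PeriodTotals` record (the field names are illustrative, not Infercast's schema):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeriodTotals:
    """Hypothetical aggregate for one comparison period."""
    spend_usd: float
    merged_prs: int
    commits: int
    tokens: int


def per_unit(total: float, count: float) -> Optional[float]:
    """Divide spend by a count, returning None when there is nothing to divide by."""
    return total / count if count else None


def unit_economics(t: PeriodTotals) -> dict:
    """Compute cost per PR, per commit, and per 1K tokens for a period."""
    return {
        "cost_per_pr": per_unit(t.spend_usd, t.merged_prs),
        "cost_per_commit": per_unit(t.spend_usd, t.commits),
        "cost_per_1k_tokens": per_unit(t.spend_usd, t.tokens / 1000),
    }


example = PeriodTotals(spend_usd=100.0, merged_prs=20, commits=80, tokens=500_000)
print(unit_economics(example))
```

Returning `None` for an empty period avoids a division-by-zero error in a week with no merged PRs.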
Financial briefing · Week / Month / Quarter
$29,650 · -28% vs prev month
$/PR $4.82 · $/Commit $1.23 · $/Session $3.40 · $/1Kt $0.21
In flight · this week · 42 merged
Draft 7 · Review 11 (↳ 2 stuck) · Approved 4 · Merged 42
Reviewer latency (hrs)
  • sarah.c: P50 2.1h · P90 6.5h
  • marcus.r: P50 5.8h · P90 14.2h
  • jamie.o: P50 1.5h · P90 3.8h
Fig. 06 Engineering Intelligence

DORA and AI ROI, together.

Stuck PRs surface in amber. Reviewer latency P50/P90 per person. Cycle-time breakdowns, throughput-vs-cost dual axis. The Monday briefing every eng manager wishes they had.

  • In-flight stage board — stuck PRs highlight
  • Per-developer ROI + quiet-dev detection
  • GitHub PAT integration, multi-repo, scheduled sync
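
Reviewer latency P50/P90 is a percentile summary of how long each review took. The sketch below shows one common way to compute it with Python's standard library. The function name and the choice of the "inclusive" interpolation method are assumptions, and Infercast may compute percentiles differently.

```python
from statistics import median, quantiles


def latency_summary(hours: list) -> dict:
    """Return P50 and P90 of review latencies given in hours."""
    if len(hours) < 2:
        raise ValueError("need at least two samples")
    # quantiles(..., n=10) returns the nine decile cut points; index 8 is P90.
    deciles = quantiles(hours, n=10, method="inclusive")
    return {"p50": median(hours), "p90": deciles[8]}


# Illustrative sample: eleven reviews taking 1 to 11 hours.
print(latency_summary(list(range(1, 12))))
```

P90 exposes the slow tail that a median hides, which is why the two are usually shown together.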
Fig. 07 Signal Tower

Guardrails that fire before the overrun.

Budget thresholds, spend-spike projection, z-score anomaly detection, inactive-dev nudges, off-hours flags. Snoozable. Email + webhook. A noise leaderboard tells you which rules are actually useful.

  • Five built-in rule types, custom webhooks
  • Coverage-gap detector nudges the rules you're missing
  • Ack latency P50/P90 so rules don't rot
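
Two of the rule types above, budget thresholds and z-score anomaly detection, reduce to short formulas. This is a sketch under stated assumptions rather than Infercast's implementation: the function names and the threshold of 2.0 are hypothetical, and the sample standard deviation is one of several reasonable choices.

```python
from statistics import mean, stdev


def z_score(today: float, history: list) -> float:
    """How many sample standard deviations `today` sits from the historical mean."""
    if len(history) < 2:
        raise ValueError("need at least two historical values")
    sigma = stdev(history)
    if sigma == 0:
        return 0.0  # flat history: treat as no deviation rather than divide by zero
    return (today - mean(history)) / sigma


def is_anomaly(today: float, history: list, threshold: float = 2.0) -> bool:
    """Flag a day whose absolute z-score meets a (hypothetical) threshold."""
    return abs(z_score(today, history)) >= threshold


def over_budget(spend: float, limit: float) -> bool:
    """Budget rule: fire once spend exceeds the configured limit."""
    return spend > limit


history = [90, 100, 110]          # illustrative daily spend, mean 100, stdev 10
print(z_score(123, history))      # 2.3 standard deviations above the mean
print(over_budget(512, 500))      # the "$512 / $500" alert in the mock-up
```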
Firing now · Hot · ack P50 8.4m
3 unacknowledged · last 24h
  • $ Budget · CRIT · 14m · Daily spend $512 / $500
  • Spike · WARN · 48m · Projected +62% vs 7d avg
  • σ Anomaly · WARN · 2h · Z-score 2.3 vs 30d mean
§ 05 · Install

Three steps. Five minutes. Zero SDK changes.

Your AI tools already emit OpenTelemetry. Point them at our collector and you're live.

01 Provision

Mint an API key.

Sign up, name your org, generate a key from the dashboard. Free tier: 5 seats, 30-day retention.

# your new key
ic_live_a8f3e2b1…
02 Connect

Run one curl.

One command configures Claude Code, Codex CLI, and Gemini CLI. Idempotent, no reboot, reversible.

$ curl -fsSL -X POST \
  -d "key=…" \
  https://infercast.io/setup/install | bash
03 Observe

Ship code, see data.

Telemetry flows automatically. The dashboard lights up within seconds of your first AI session.

3 devs active · $23.40 today
§ 06 · Standards

OpenTelemetry native. No vendor lock-in.

Infercast speaks OTLP, not a proprietary agent. Your collector config never mentions our name — swap us out any time.

# One command. All three CLIs configured.
$ curl -fsSL -X POST -d "key=…" https://infercast.io/setup/install | bash

[done] ~/.claude/settings.json
[done] ~/.codex/config.toml
[done] ~/.gemini/settings.json
[done] Connection verified (HTTP 200)

→ Dashboard: https://acme.infercast.io
§ Standard

OTLP wire

OpenTelemetry Protocol over HTTPS. Same format every collector understands.
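
For tools that honor the standard OpenTelemetry exporter environment variables, pointing an OTLP exporter at a collector over HTTPS comes down to three settings. The sketch below only builds that environment. It is not what the Infercast setup script does (the script edits each CLI's config file), and the endpoint URL and the `x-api-key` header name are hypothetical placeholders.

```python
def otlp_env(endpoint: str, api_key: str) -> dict:
    """Build standard OTLP exporter environment variables for an HTTPS collector."""
    if not endpoint.startswith("https://"):
        raise ValueError("endpoint must use https://")
    return {
        # Variable names below are defined by the OpenTelemetry specification.
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
        # Header name is an assumption; use whatever the collector expects.
        "OTEL_EXPORTER_OTLP_HEADERS": f"x-api-key={api_key}",
    }


# Hypothetical collector URL and placeholder key, for illustration only.
env = otlp_env("https://collector.example.com", "YOUR_KEY")
for name, value in env.items():
    print(f"{name}={value}")
```

Because these are vendor-neutral variables, switching collectors means changing the endpoint and header values and nothing else.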

§ Auth

RBAC + SSO

Five roles (owner → viewer), Google OAuth sign-in; SSO/SCIM on Enterprise.

§ Storage

Encrypted secrets

Active Record encryption for integration tokens and webhook secrets.

§ Safety

SSRF + CSV hardening

User-supplied URLs validated with DNS-rebinding defense; CSV output sanitized.
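
Both defenses can be illustrated in a few lines. This is a generic sketch of the techniques, not Infercast's code: the function names are hypothetical, the CSV rule follows the widely used advice of prefixing formula-like cells with a single quote, and the address check covers only the validation half of a DNS-rebinding defense.

```python
import ipaddress

# Leading characters that spreadsheet applications may interpret as a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def sanitize_csv_cell(value: str) -> str:
    """Neutralize formula injection by prefixing risky cells with a single quote."""
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


def is_public_address(ip: str) -> bool:
    """Accept only globally routable addresses (rejects loopback, private, link-local)."""
    return ipaddress.ip_address(ip).is_global


print(sanitize_csv_cell("=HYPERLINK(...)"))      # gains a leading quote
print(is_public_address("169.254.169.254"))      # False: link-local metadata address
print(is_public_address("8.8.8.8"))              # True
```

To resist DNS rebinding, a caller would resolve the hostname once, run the resolved address through a check like `is_public_address`, and then connect to that same address rather than resolving the name a second time. One trade-off of the CSV rule is that legitimate values starting with `-`, such as negative numbers, are also quoted.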

§ 07 · Pricing

Start free. Scale when it pays for itself.

No credit card required. Cancel any time — your data exports as CSV.

Free

For small teams getting started.

$0
  • 5 developer seats
  • Command Center + Live Map
  • 30-day data retention
  • Email support
Start free
Team

$8 per active developer per month
  • Billed only for developers who sent telemetry in the last 30 days
Enterprise

For large organizations.

Custom

Everything in Team, plus:

  • Unlimited seats
  • SSO / SCIM provisioning
  • Custom retention
  • Dedicated support & SLA
  • Audit logs & impersonation trail
  • Policy governance & model allowlists
Contact sales

Most teams recover their Team-plan cost in the first week through routing insights.

§ 08 · Questions

Answers before you ask.

Do you see our prompts or code?
Prompt capture is off by default and fully optional. Without it, we only see metadata: token counts, model, duration, cost, developer identity, and the tool. You can toggle prompt capture per-org, and even then it's stored in your tenant only.
Do I need to install an SDK or change our code?
No. Claude Code, Codex, and Gemini CLI emit OpenTelemetry natively. Our setup script points their OTLP exporter at our endpoint. That's the full install. Remove us by deleting three lines of config.
Which tools are supported?
Claude Code, OpenAI Codex CLI, and Google Gemini CLI are first-class today with model-specific cost calculators. Any OTLP-compatible LLM tool will also ingest — we surface unknown models in a bucket so you can see them before we name them.
Can we self-host or run in VPC?
Enterprise plans include self-hosted and VPC deployments via our Kamal manifest. SaaS is the default path for Free and Team; Enterprise can go either way.
How does pricing work exactly?
Free is forever-free up to 5 seats with 30-day retention. Team is $8 per active developer per month — we count developers who sent telemetry in the last 30 days, so idle accounts don't bill. Enterprise is custom.
What happens if we cancel?
One-click CSV export of every table (developers, telemetry summaries, costs, events). We delete all tenant data within 30 days of cancellation. DPA available on request.
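
The Team billing rule from the pricing answer above (pay only for developers who sent telemetry in the last 30 days) can be sketched as follows. The data shape and function names are assumptions for illustration. Only the $8 price and the 30-day window come from this page.

```python
from datetime import datetime, timedelta, timezone

PRICE_PER_ACTIVE_DEV_USD = 8  # Team plan price stated in the FAQ
ACTIVE_WINDOW_DAYS = 30       # activity window stated in the FAQ


def active_developers(last_seen: dict, now: datetime) -> set:
    """Developers whose most recent telemetry falls inside the activity window."""
    cutoff = now - timedelta(days=ACTIVE_WINDOW_DAYS)
    return {dev for dev, seen in last_seen.items() if seen >= cutoff}


def monthly_bill(last_seen: dict, now: datetime) -> int:
    """Bill in USD: price times the number of active developers."""
    return PRICE_PER_ACTIVE_DEV_USD * len(active_developers(last_seen, now))


now = datetime(2026, 6, 30, tzinfo=timezone.utc)
last_seen = {
    "dev-a": now - timedelta(days=1),
    "dev-b": now - timedelta(days=45),  # idle, so not billed
    "dev-c": now - timedelta(days=29),
}
print(monthly_bill(last_seen, now))  # 16: two active developers at $8 each
```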
§ 09 · Closing

Stop guessing at your AI bill.

Free tier, no credit card, first insight in five minutes. If you don't see value in week one, we failed — not you.

— The team at Infercast

OTLP native · SSO on Enterprise · CSV export anytime