Issue 01 · Jul 2026 The AI cost dashboard for engineering teams

Live · OTLP native

See every AI dollar your engineers spend.

The AI cost dashboard for engineering teams — real-time spend, ROI, and guardrails across Claude Code, Codex CLI, and Gemini CLI, so your CFO stops asking "what did the AI bill buy us?"

Start free · See Exhibit A · No credit card · 5-minute install

Reads telemetry from

Claude Code

Codex CLI

Gemini CLI

+ any OTLP-compatible tool

§ 01 · The Problem

Your AI coding bill is growing ten times faster than your visibility into it.

Finance pays three provider invoices and sees one rounded number. Engineering sees a prompt window. Nobody sees the link between the two, so the bill grows and nobody can say whether it's worth it.

In most teams we audit, 20–40% of AI spend is avoidable — devs on Opus when Sonnet performs identically, prompts re-run five times instead of cached, idle subscriptions silently billing. Infercast surfaces every dollar of it.

We didn't catch the leak until the invoice landed. Infercast would have caught it on day two.

— Staff eng, seed-stage SaaS

Fig. 1 · Same team, two months Tearsheet

Before

$12,400/mo

Aggregate bill from three provider dashboards. No per-developer or per-repo breakdown. No cost-per-PR signal.

× "Who is spending what?" — unknown
× Models over-selected (Opus when Sonnet suffices)
× Budget overruns caught after the invoice lands
× CFO asks "what did we get?" → silence

After 30 days

$8,930/mo

Same team, same output — 28% lower after Infercast flagged over-spec'd models, idle seats, and cache misses.

✓ Spend per developer, per PR, per model — live
✓ Routing suggestions with projected savings
✓ Budget thresholds & anomaly alerts before the overrun
✓ Cost-per-PR: a number the CFO actually understands
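The 28% figure follows from the two monthly totals in Fig. 1. A quick check, using only the numbers shown above (the function name is illustrative):

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# Monthly totals from Fig. 1.
saving = percent_reduction(12_400, 8_930)
print(f"{saving:.1f}% lower")  # 28.0% lower
```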

§ 02 · The Index

Time-to-value, measured honestly.

3 figures

First insight

5 min

One curl command configures Claude Code, Codex, and Gemini. First telemetry visible before your coffee cools.

Typical savings

≈ 28%

After 30 days, most teams cut spend by routing over-spec'd model calls and catching idle seats. No throttles, same output.

SDK changes

zero

No agents, no sidecars, no wrappers. We speak OTLP — the protocol your AI tools already emit natively.

§ 03 · Exhibit A

Your AI spend, one screen, always live.

Burn rate, cost-per-PR, top spenders, anomaly alerts, and an AI briefing that tells you what changed — all updated in real time.

acme.infercast.io · live

acme.infercast.io / command-center

Live ingest · 00:00

Today's Burn · Hot pace

$847.32 projected $1,203 · 7d avg $891

Pace +15% vs 7-day avg — see engineering tab for the spike.

Devs

34/42

Sessions

187

Tokens

2.4M

Brief

Spend per PR fell to $4.82 (-11% WoW). Three devs on Opus match Sonnet's acceptance rate — route to save ≈$142/mo. One idle seat on Enterprise tier.

Cost / PR

$4.82

↓ 11% WoW

42 PRs merged (30d)

Cost / Commit

$1.23

148 commits this week

Active Devs

34/42

↑ 4.8% WoW

Tokens Today

2.4M

340K in · 2.1M out

Cost by Tool · 7d

Claude · Codex · Gemini

Spending Leaderboard 30d

1 · Sarah Chen · $142.30
2 · Marcus Rivera · $118.45
3 · Jamie Okonkwo · $97.18
4 · Aisha Patel · $84.60
5 · Diego Morales · $71.22

Fig. 2 · Command Center, production view · updated 00:00 · 34 devs active

§ 04 · Product Tour

Five surfaces. One source of truth.

Every number, list, and chart updates in real time as your team ships.

figs 3–7

Fig. 03 Command Center

Burn rate, pace, and what changed — in a glance.

Hero number is today's burn, color-coded Hot / On-pace / Calm. An AI briefing explains the delta in plain English. Spending leaderboard ranks who's consuming what.

✦ Gradient burn hero with pace rail
✦ AI-written briefing: why did spend shift?
✦ Cost-per-PR, per-commit, per-session
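How the Hot / On-pace / Calm label is derived is not spelled out here. One plausible sketch, assuming the label compares spend so far with an expected pace figure, and that a ±10% band separates the three states (the band and the function name are hypothetical):

```python
def pace_label(spend_so_far: float, expected_so_far: float, band: float = 0.10) -> str:
    """Classify today's burn relative to the expected pace."""
    if expected_so_far <= 0:
        raise ValueError("expected_so_far must be positive")
    delta = spend_so_far / expected_so_far - 1  # +0.16 means 16% ahead of pace
    if delta > band:
        return "Hot"
    if delta < -band:
        return "Calm"
    return "On-pace"


# Figures from the Command Center panel: $847 spent vs $730 pace.
print(pace_label(847, 730))  # Hot
```

A fixed band is the simplest choice; a real rule might widen it early in the day, when little spend has accumulated.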

Today's Burn · Hot acme.infercast.io

$847 vs $730 pace

Devs

34

Sessions

187

Tokens

2.4M

Live Ops · 18 cities 23:14:08 UTC

Active

34

Tokens/s

4.2K

Req/min

128

Cities

18

Top hotspots

San Francisco 92

New York 74

London 61

Tokyo 48

Tool mix

Fig. 04 Live Map

Where the work actually happens.

A rotating globe with dev activity arcs, tool-color pulses, and hotspot bars. Hover a city to see active devs and spend — great for founders showing investors global reach.

✦ Arc density scales with activity
✦ Live prompt stream on the side
✦ Drop on a TV for a team pulse display

Fig. 05 Cost Intelligence

Period-over-period spend, without a spreadsheet.

Compare week-to-week, month-to-month, or quarter-to-quarter. Unit economics on the top row — cost per PR, per commit, per 1K tokens. Anomaly days highlighted automatically.

✦ Weekly / monthly / quarterly comparison
✦ Dev efficiency scatter matrix
✦ CFO-ready CSV export (sanitized)
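The unit-economics row is, at its simplest, period spend divided by a count. A minimal sketch with made-up totals (the function name, field names, and numbers are illustrative, not Infercast's schema):

```python
from typing import Dict, Optional


def unit_costs(spend_usd: float, merged_prs: int, commits: int,
               tokens: int) -> Dict[str, Optional[float]]:
    """Divide period spend by each activity count; None when a count is zero."""
    def per(count: float) -> Optional[float]:
        return round(spend_usd / count, 2) if count > 0 else None

    return {
        "per_pr": per(merged_prs),
        "per_commit": per(commits),
        "per_1k_tokens": per(tokens / 1000),
    }


# Hypothetical period totals.
costs = unit_costs(spend_usd=500.0, merged_prs=100, commits=400, tokens=2_500_000)
print(costs)  # {'per_pr': 5.0, 'per_commit': 1.25, 'per_1k_tokens': 0.2}
```

Returning None for a zero count keeps a quiet week from showing up as a division error or a misleading $0.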

Financial briefing

Week Month Quarter

$29,650 -28% vs prev month

$/PR

$4.82

$/Commit

$1.23

$/Session

$3.40

$/1Kt

$0.21

In flight · this week 42 merged

Draft

7

Review

11

↳ 2 stuck

Approved

4

Merged

42

Reviewer latency (hrs)

sarah.c · P50 2.1h · P90 6.5h
marcus.r · P50 5.8h · P90 14.2h
jamie.o · P50 1.5h · P90 3.8h

Fig. 06 Engineering Intelligence

DORA and AI ROI, together.

Stuck PRs surface in amber. Reviewer latency P50/P90 per person. Cycle-time breakdowns, throughput-vs-cost dual axis. The Monday briefing every eng manager wishes they had.

✦ In-flight stage board with stuck PRs highlighted
✦ Per-developer ROI + quiet-dev detection
✦ GitHub PAT integration, multi-repo, scheduled sync
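P50 and P90 here are percentiles of review turnaround. A small sketch using the nearest-rank method; the method choice and the sample latencies are assumptions, since the page does not say how Infercast computes them:

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: smallest value with at least p% of data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical review latencies for one reviewer, in hours.
latencies = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 6.0, 12.0]
print(percentile(latencies, 50), percentile(latencies, 90))  # 2.5 6.0
```

P90 is the more useful signal for stuck reviews: one 12-hour outlier barely moves the median but shows up in the tail.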

Fig. 07 Signal Tower

Guardrails that fire before the overrun.

Budget thresholds, spend-spike projection, z-score anomaly detection, inactive-dev nudges, off-hours flags. Snoozable. Email + webhook. A noise leaderboard tells you which rules are actually useful.

✦ Five built-in rule types, custom webhooks
✦ Coverage-gap detector nudges the rules you're missing
✦ Ack latency P50/P90 so rules don't rot
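Two of the rule types above, budget thresholds and z-score anomaly detection, can be sketched in a few lines. This is an illustration under stated assumptions, not Infercast's implementation: the 2.0 threshold, the use of the sample standard deviation, and all names and figures are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence


def z_score(today: float, history: Sequence[float]) -> float:
    """How many standard deviations today's spend sits from the historical mean."""
    if len(history) < 2:
        raise ValueError("need at least two historical days")
    sigma = stdev(history)  # sample standard deviation
    if sigma == 0:
        return 0.0  # flat history: no meaningful deviation
    return (today - mean(history)) / sigma


def firing_rules(today: float, history: Sequence[float],
                 daily_budget: float, z_threshold: float = 2.0) -> List[str]:
    """Return the names of the rules that fire for today's spend."""
    alerts = []
    if today > daily_budget:
        alerts.append("budget")
    if z_score(today, history) > z_threshold:
        alerts.append("anomaly")
    return alerts


# Hypothetical daily spend history in dollars.
history = [400, 420, 380, 410, 390]
print(firing_rules(512, history, daily_budget=500))  # ['budget', 'anomaly']
print(firing_rules(405, history, daily_budget=500))  # []
```

The two rules complement each other: a budget threshold catches absolute overruns, while the z-score catches a day that is unusual for this team even when it is still under budget.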

Firing now · Hot

ack P50 8.4m

3 unacknowledged · last 24h

$ Budget · CRIT · 14m · Daily spend $512 / $500

↑ Spike · WARN · 48m · Projected +62% vs 7d avg

σ Anomaly · WARN · 2h · Z-score 2.3 vs 30d mean

§ 05 · Install

Three steps. Five minutes. Zero SDK changes.

Your AI tools already emit OpenTelemetry. Point them at our collector and you're live.

I · II · III

01 Provision

Mint an API key.

Sign up, name your org, generate a key from the dashboard. Free tier: 5 seats, 30-day retention.

# your new key
ic_live_a8f3e2b1…

02 Connect

Run one curl.

One command configures Claude Code, Codex CLI, and Gemini CLI. Idempotent, no reboot, reversible.

$ curl -fsSL -X POST \
    -d "key=…" \
    https://infercast.io/setup/install | bash

03 Observe

Ship code, see data.

Telemetry flows automatically. The dashboard lights up within seconds of your first AI session.

3 devs active · $23.40 today

§ 06 · Standards

OpenTelemetry native. No vendor lock-in.

Infercast speaks OTLP rather than a proprietary agent protocol. Setup only points each tool's standard OTLP exporter at our endpoint, so you can swap us out any time.

# One command. All three CLIs configured.
$ curl -fsSL -X POST -d "key=…" https://infercast.io/setup/install | bash

[done] ~/.claude/settings.json
[done] ~/.codex/config.toml
[done] ~/.gemini/settings.json
[done] Connection verified (HTTP 200)

→ Dashboard: https://acme.infercast.io
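What the setup script writes into each file is not shown here. As a rough illustration of an idempotent config change, the sketch below merges the standard OpenTelemetry exporter variables into a JSON settings file. The `env` block layout, the endpoint, and the header value are assumptions for the example, not Infercast's actual output.

```python
import json
from pathlib import Path

# Standard OpenTelemetry exporter variables; endpoint and key are placeholders.
OTLP_ENV = {
    "OTEL_EXPORTER_OTLP_ENDPOINT": "https://otlp.example.invalid",
    "OTEL_EXPORTER_OTLP_HEADERS": "Authorization=Bearer <your-key>",
}


def merge_env(settings: dict, env: dict) -> dict:
    """Return a copy of `settings` with `env` merged into its "env" block."""
    merged = dict(settings)
    merged["env"] = {**settings.get("env", {}), **env}
    return merged


def apply_to_file(path: Path, env: dict) -> bool:
    """Merge `env` into the JSON file at `path`; return True if the file changed."""
    current = json.loads(path.read_text()) if path.exists() else {}
    updated = merge_env(current, env)
    if updated == current:
        return False  # already configured, nothing to write
    path.write_text(json.dumps(updated, indent=2) + "\n")
    return True


# In-memory demonstration: applying the merge twice gives the same result.
existing = {"theme": "dark", "env": {"EDITOR": "vim"}}
once = merge_env(existing, OTLP_ENV)
twice = merge_env(once, OTLP_ENV)
print(once == twice)  # True
```

Because the merge only adds or overwrites its own keys, unrelated settings survive, and removing those keys undoes the change.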

§ Standard

OTLP wire

OpenTelemetry Protocol over HTTPS. Same format every collector understands.

§ Auth

RBAC + SSO

Five roles (owner → viewer), Google OAuth sign-in; SSO/SCIM on Enterprise.

§ Storage

Encrypted secrets

Active Record encryption for integration tokens and webhook secrets.

§ Safety

SSRF + CSV hardening

User-supplied URLs validated with DNS-rebinding defense; CSV output sanitized.
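Both defenses follow well-known patterns. The sketch below is a generic illustration, not Infercast's code: a webhook host is resolved once, every resolved address must be publicly routable, and the caller is expected to connect to one of the returned IPs rather than resolve the name again (which is what blunts DNS rebinding). CSV cells that a spreadsheet could read as a formula get a leading apostrophe. All function names are hypothetical.

```python
import ipaddress
import socket
from typing import List

# Leading characters that spreadsheet apps may interpret as a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def sanitize_csv_cell(value: str) -> str:
    """Neutralize formula-like cells by prefixing a single quote."""
    # Trade-off: negative numbers such as "-5" are also quoted.
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


def is_public_address(addr: str) -> bool:
    """True only for globally routable IPs (rejects loopback, private, link-local)."""
    return ipaddress.ip_address(addr).is_global


def resolve_public_ips(host: str, port: int = 443) -> List[str]:
    """Resolve `host` once and require that every address is public."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    addrs = sorted({info[4][0] for info in infos})
    if not addrs or not all(is_public_address(a) for a in addrs):
        raise ValueError(f"{host} resolves to a non-public address")
    return addrs  # connect to one of these, not to the hostname


print(sanitize_csv_cell("=HYPERLINK(...)"))   # '=HYPERLINK(...)
print(is_public_address("169.254.169.254"))   # False
print(is_public_address("8.8.8.8"))           # True
```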

§ 07 · Pricing

Start free. Scale when it pays for itself.

No credit card required. Cancel any time — your data exports as CSV.

3 plans

Free

For small teams getting started.

$0

✓ 5 developer seats
✓ Command Center + Live Map
✓ 30-day data retention
✓ Email support

Start free

Most teams start here

Team

For engineering teams.

$8/seat/mo

✓ 50 developer seats
✓ Cost & Engineering Intelligence
✓ Signal Tower alerts & policies
✓ GitHub multi-repo integration
✓ CSV exports
✓ Support team (email + tickets)
✦ AI briefings & routing insights

Start free

Enterprise

For large organizations.

Custom

Everything in Team, plus:

✓ Unlimited seats
✓ SSO / SCIM provisioning
✓ Custom retention
✓ Dedicated support & SLA
✓ Audit logs & impersonation trail
✦ Policy governance & model allowlists

Contact sales

Most teams recover their Team-plan cost in the first week through routing insights.

§ 08 · Questions

Answers before you ask.

6 Q&A

Do you see our prompts or code?

Prompt capture is off by default and fully optional. Without it, we only see metadata: token counts, model, duration, cost, developer identity, and the tool. You can toggle prompt capture per-org, and even then it's stored in your tenant only.

Do I need to install an SDK or change our code?

No. Claude Code, Codex, and Gemini CLI emit OpenTelemetry natively. Our setup script points their OTLP exporter at our endpoint. That's the full install. Remove us by deleting three lines of config.

Which tools are supported?

Claude Code, OpenAI Codex CLI, and Google Gemini CLI are first-class today with model-specific cost calculators. Any OTLP-compatible LLM tool will also ingest — we surface unknown models in a bucket so you can see them before we name them.

Can we self-host or run in VPC?

Enterprise plans include self-hosted and VPC deployments via our Kamal manifest. SaaS is the default path for Free and Team; Enterprise can go either way.

How does pricing work exactly?

Free is forever-free up to 5 seats with 30-day retention. Team is $8 per active developer per month — we count developers who sent telemetry in the last 30 days, so idle accounts don't bill. Enterprise is custom.
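The active-seat rule is easy to reason about with a small example. The sketch below assumes an inclusive 30-day window and uses invented developer names and dates; only the $8 price and the 30-day window come from the answer above.

```python
from datetime import date, timedelta
from typing import Mapping

TEAM_PRICE_USD = 8   # per active developer per month
ACTIVE_WINDOW_DAYS = 30


def billable_seats(last_telemetry: Mapping[str, date], today: date) -> int:
    """Count developers who sent telemetry within the active window."""
    cutoff = today - timedelta(days=ACTIVE_WINDOW_DAYS)
    return sum(1 for seen in last_telemetry.values() if seen >= cutoff)


def monthly_bill(last_telemetry: Mapping[str, date], today: date) -> int:
    """Team-plan bill in dollars: active seats times the per-seat price."""
    return billable_seats(last_telemetry, today) * TEAM_PRICE_USD


# Hypothetical team: two recently active developers and one idle account.
team = {
    "dev-a": date(2026, 7, 30),
    "dev-b": date(2026, 7, 5),
    "dev-c": date(2026, 5, 1),  # idle for months, not billed
}
print(monthly_bill(team, today=date(2026, 7, 31)))  # 16
```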

What happens if we cancel?

One-click CSV export of every table (developers, telemetry summaries, costs, events). We delete all tenant data within 30 days of cancellation. DPA available on request.

§ 09 · Closing

Stop guessing at your AI bill.

Free tier, no credit card, first insight in five minutes. If you don't see value in week one, we failed — not you.

Start free · Talk to a human

— The team at Infercast

OTLP native · SSO on Enterprise · CSV export anytime