API ANALYTICS

Token Counting

Input tokens, output tokens, per call. The numbers your provider charges you for but never shows you.

Questions this answers

  • How many tokens does my AI coding agent use per request?
  • Track LLM API token usage per session
  • Provider-specific token counting for AI agents
  • How to measure input vs output tokens for Claude Code?

How it works

Token Counting extracts token usage data from every intercepted API call. For providers that include token counts in their response metadata (like OpenAI and Anthropic), Chau7 reads those values directly. For providers that do not report counts, Chau7 applies provider-matched tokenization to estimate usage accurately.
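
As a sketch, reading reported counts from response metadata might look like the following. The `usage` field names are the real OpenAI (`prompt_tokens`/`completion_tokens`) and Anthropic (`input_tokens`/`output_tokens`) response shapes; the function itself is illustrative, not Chau7's actual code:

```python
from typing import Optional, Tuple

def extract_token_usage(provider: str, response: dict) -> Optional[Tuple[int, int]]:
    """Return (input_tokens, output_tokens) from a provider response,
    or None if the provider did not report usage (estimate instead)."""
    usage = response.get("usage")
    if not usage:
        return None
    if provider == "openai":
        # OpenAI chat completions report prompt_tokens / completion_tokens
        return usage["prompt_tokens"], usage["completion_tokens"]
    if provider == "anthropic":
        # Anthropic messages report input_tokens / output_tokens
        return usage["input_tokens"], usage["output_tokens"]
    return None
```

Normalizing both providers to a single (input, output) pair keeps downstream aggregation provider-agnostic.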

Counts are broken down into input tokens (the prompt and context sent to the model) and output tokens (the completion returned). This distinction is critical because most providers price input and output tokens differently, often by a factor of three to five.
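
To see why the input/output split matters, here is a minimal cost calculation using hypothetical per-million-token prices (real prices vary by provider and model; the 5x output premium below is illustrative of the typical asymmetry):

```python
# Hypothetical prices per million tokens -- substitute your model's real rates.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00  # output priced at 5x the input rate

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one API call given its token counts."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A large prompt with a short completion: 10,000 in, 500 out.
# Input contributes $0.03; output, though 20x smaller, still adds $0.0075.
```

A lump-sum token total would hide this: 500 output tokens cost as much as 2,500 input tokens at these rates.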

All token data is associated with the originating tab, session, and run, so you can aggregate at any level: per call, per run, per session, or globally across all tabs.
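
Because every count carries its tab, session, and run, aggregation reduces to grouping by whichever attribute you care about. A sketch, with `TokenRecord` as a hypothetical record shape (not Chau7's actual schema):

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class TokenRecord:
    # Hypothetical shape: one record per intercepted API call.
    tab: str
    session: str
    run: str
    input_tokens: int
    output_tokens: int

def totals_by(records: list, key: str) -> dict:
    """Sum (input, output) token totals grouped by 'tab', 'session', or 'run'."""
    totals = defaultdict(lambda: [0, 0])
    for r in records:
        bucket = totals[getattr(r, key)]
        bucket[0] += r.input_tokens
        bucket[1] += r.output_tokens
    return {k: tuple(v) for k, v in totals.items()}
```

The same records roll up to any level: `totals_by(records, "run")` for per-run breakdowns, `totals_by(records, "tab")` for a global per-tab view.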

Why it matters

Token counts are the fundamental unit of LLM costs, and without per-call tracking you are guessing. Was that session expensive because of one massive prompt or fifty small ones? Chau7 counts input and output tokens for every API call using provider-specific tokenization, so you can pinpoint exactly where the tokens go.

Frequently asked questions

How accurate are the token counts?

When the LLM provider includes token counts in the API response (OpenAI and Anthropic both do), the numbers are exact. For providers that do not report counts, Chau7 uses provider-matched tokenizers that are accurate to within a few percent.
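
A minimal sketch of tokenizer-based estimation, using the real `tiktoken` library when installed and falling back to a rough chars-per-token heuristic otherwise (the fallback is far less accurate than a matched tokenizer and is shown only to keep the example self-contained):

```python
def count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    """Estimate the token count of text with a matched tokenizer if available."""
    try:
        import tiktoken  # OpenAI's BPE tokenizer library
        return len(tiktoken.get_encoding(encoding_name).encode(text))
    except ImportError:
        # Rough fallback: ~4 characters per token for English prose.
        return max(1, len(text) // 4)
```

A tokenizer matched to the model's actual encoding is what keeps estimates within a few percent; the heuristic path can be off by much more on code or non-English text.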

Can I see token counts for individual tool calls within a run?

Yes. Each API call is tracked individually with its own input and output token counts. You can drill into any run to see the per-call breakdown through the MCP server.