AI Cost Glossary

Everything you need to know about AI agent costs, token pricing, and cost optimization — explained clearly.

AI Agent Costs

The total expenses incurred when running autonomous AI agents, including token consumption, API calls, compute resources, and orchestration overhead.

Token Pricing

The pricing model used by LLM providers where costs are calculated based on the number of tokens (text fragments) processed in API requests and responses.

LLM API Costs

The expenses incurred when using large language model APIs, including token charges, rate limit considerations, and infrastructure costs for integrating LLM capabilities.

Cost Per Token

The unit price charged by LLM providers for each token processed, typically measured in dollars per million tokens for both input and output.

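Since prices are quoted per million tokens, a single request's cost is simple arithmetic over input and output counts. A minimal sketch (the $3.00 and $15.00 rates below are purely illustrative, not any provider's actual pricing):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Dollar cost of one API request, given per-million-token rates."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# 2,000 input tokens and 500 output tokens at $3.00 / 1M in, $15.00 / 1M out:
print(request_cost(2_000, 500, 3.00, 15.00))  # 0.0135
```

Note that output tokens typically cost several times more than input tokens, so verbose completions dominate the bill.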
OpenAI Pricing

The pricing structure for OpenAI's API products including GPT-4o, GPT-4o-mini, o1, DALL-E, and other models, based on per-token and per-request charges.

Anthropic Pricing

The pricing structure for Anthropic's Claude API models including Claude 3.5 Sonnet, Claude 3.5 Haiku, and Claude 3 Opus, based on per-token charges.

AI Observability

The practice of monitoring, tracing, and understanding the behavior, performance, and costs of AI systems in production through logs, metrics, and traces.

AI Cost Optimization

Strategies and techniques for reducing AI infrastructure and API spending while maintaining or improving output quality, including model routing, caching, and prompt engineering.

Prompt Caching

A technique where LLM providers store and reuse processed prompt prefixes to reduce both latency and costs for repeated or similar API requests.

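The savings come from billing cached prefix tokens at a reduced rate. A rough sketch of the cost math, assuming a hypothetical 90% discount on cache reads (actual multipliers and cache mechanics vary by provider and model):

```python
def cached_request_cost(cached_prefix_tokens: int, fresh_input_tokens: int,
                        output_tokens: int, input_price_per_m: float,
                        output_price_per_m: float,
                        cache_read_discount: float = 0.1) -> float:
    """Request cost when a prompt prefix is served from the provider's cache.

    cache_read_discount is the fraction of the normal input rate charged
    for cached tokens (0.1 here is an illustrative assumption).
    """
    cached = (cached_prefix_tokens / 1_000_000) * input_price_per_m * cache_read_discount
    fresh = (fresh_input_tokens / 1_000_000) * input_price_per_m
    out = (output_tokens / 1_000_000) * output_price_per_m
    return cached + fresh + out

# A 10,000-token system prompt served from cache, plus 500 fresh input
# tokens and 300 output tokens, at $3 / 1M input and $15 / 1M output:
print(cached_request_cost(10_000, 500, 300, 3.00, 15.00))  # 0.009
```

With a large shared system prompt, most of the input bill becomes the discounted cached portion, which is why caching pays off for repeated requests.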
Model Routing

The practice of dynamically directing AI requests to different language models based on task complexity, cost requirements, and quality needs to optimize spending.

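In its simplest form, a router inspects the request and picks a model tier. The sketch below uses a crude word-count heuristic and placeholder model names; production routers typically use trained classifiers, task type, or historical quality data instead:

```python
def route_model(prompt: str) -> str:
    """Pick a model tier from a rough complexity heuristic.

    Thresholds and model names are illustrative placeholders.
    """
    approx_tokens = len(prompt.split())  # very rough token estimate
    if approx_tokens < 50:
        return "small-fast-model"    # cheap tier for short, simple requests
    elif approx_tokens < 500:
        return "mid-tier-model"
    return "frontier-model"          # expensive tier for long, complex work

print(route_model("Summarize this sentence."))  # small-fast-model
```

Even a heuristic this crude can cut spend meaningfully if most traffic is simple enough for the cheap tier.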
Token Budgets

Predetermined limits on the number of tokens an AI agent, user, or feature can consume within a given time period, used to prevent cost overruns.

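Enforcement usually means tracking consumption against a cap and rejecting (or queuing) requests that would exceed it. A minimal in-memory sketch; a real system would persist counters and reset them per billing period:

```python
class TokenBudget:
    """Per-period token cap: reject requests once the budget is exhausted."""

    def __init__(self, limit: int):
        self.limit = limit
        self.used = 0

    def try_consume(self, tokens: int) -> bool:
        """Reserve tokens if the budget allows; return False if over budget."""
        if self.used + tokens > self.limit:
            return False  # caller should block, queue, or downgrade the request
        self.used += tokens
        return True

budget = TokenBudget(limit=100_000)
print(budget.try_consume(60_000))  # True
print(budget.try_consume(50_000))  # False - would exceed the 100k cap
```

Budgets can be scoped per agent, per user, or per feature, matching the attribution boundaries described below.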
AI Agent Fleet

A collection of multiple AI agents working together or independently within an organization, requiring coordinated management, monitoring, and cost tracking.

Cost Attribution

The practice of assigning AI infrastructure costs to specific teams, features, customers, or business units to understand unit economics and drive accountability.

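Mechanically, attribution means tagging every request with metadata (team, feature, customer) and aggregating costs by those tags. A minimal sketch over hypothetical request records:

```python
from collections import defaultdict

def attribute_costs(records: list[dict]) -> dict:
    """Sum request costs by (team, feature) tag.

    Each record is assumed to carry 'team', 'feature', and 'cost_usd' keys;
    the field names are illustrative, not a fixed schema.
    """
    totals: dict = defaultdict(float)
    for r in records:
        totals[(r["team"], r["feature"])] += r["cost_usd"]
    return dict(totals)

records = [
    {"team": "search", "feature": "rerank", "cost_usd": 1.20},
    {"team": "search", "feature": "rerank", "cost_usd": 0.80},
    {"team": "support", "feature": "chatbot", "cost_usd": 0.50},
]
print(attribute_costs(records))
# {('search', 'rerank'): 2.0, ('support', 'chatbot'): 0.5}
```

Dividing these totals by usage (requests served, customers active) yields the per-unit economics that make accountability possible.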
AI Spend Management

The comprehensive discipline of planning, tracking, optimizing, and governing AI-related expenditures across an organization.

Inference Costs

The computational expenses incurred when running trained AI models to generate predictions or outputs, including GPU compute, token processing, and API charges.

Ready to Optimize Your AI Costs?

ClawHQ gives you complete visibility into every term on this page — token costs, model routing, agent budgets, and more.

Get Started Free →