AI Cost Glossary

Everything you need to know about AI agent costs, token pricing, and cost optimization — explained clearly.

AI Agent Costs

The total expenses incurred when running autonomous AI agents, including token consumption, API calls, compute resources, and orchestration overhead.

Token Pricing

The pricing model used by LLM providers where costs are calculated based on the number of tokens (text fragments) processed in API requests and responses.

LLM API Costs

The expenses incurred when using large language model APIs, including token charges, rate limit considerations, and infrastructure costs for integrating LLM capabilities.

Cost Per Token

The unit price charged by LLM providers for each token processed, typically measured in dollars per million tokens for both input and output.

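Since prices are quoted per million tokens, a single request's cost is simple arithmetic over input and output counts. A minimal sketch (the $3.00 and $15.00 rates below are purely illustrative, not any provider's actual pricing):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Dollar cost of one API request, given per-million-token rates."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# 2,000 input tokens and 500 output tokens at $3.00 / 1M in, $15.00 / 1M out:
print(request_cost(2_000, 500, 3.00, 15.00))  # 0.0135
```

Note that output tokens typically cost several times more than input tokens, so verbose completions dominate the bill.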
OpenAI Pricing

The pricing structure for OpenAI's API products including GPT-4o, GPT-4o-mini, o1, DALL-E, and other models, based on per-token and per-request charges.

Anthropic Pricing

The pricing structure for Anthropic's Claude API models including Claude 3.5 Sonnet, Claude 3.5 Haiku, and Claude 3 Opus, based on per-token charges.

AI Observability

The practice of monitoring, tracing, and understanding the behavior, performance, and costs of AI systems in production through logs, metrics, and traces.

AI Cost Optimization

Strategies and techniques for reducing AI infrastructure and API spending while maintaining or improving output quality, including model routing, caching, and prompt engineering.

Prompt Caching

A technique where LLM providers store and reuse processed prompt prefixes to reduce both latency and costs for repeated or similar API requests.

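The savings come from billing cached prefix tokens at a reduced rate. A rough sketch of the cost math, assuming a hypothetical 90% discount on cache reads (actual multipliers and cache mechanics vary by provider and model):

```python
def cached_request_cost(cached_prefix_tokens: int, fresh_input_tokens: int,
                        output_tokens: int, input_price_per_m: float,
                        output_price_per_m: float,
                        cache_read_discount: float = 0.1) -> float:
    """Request cost when a prompt prefix is served from the provider's cache.

    cache_read_discount is the fraction of the normal input rate charged
    for cached tokens (0.1 here is an illustrative assumption).
    """
    cached = (cached_prefix_tokens / 1_000_000) * input_price_per_m * cache_read_discount
    fresh = (fresh_input_tokens / 1_000_000) * input_price_per_m
    out = (output_tokens / 1_000_000) * output_price_per_m
    return cached + fresh + out

# A 10,000-token system prompt served from cache, plus 500 fresh input
# tokens and 300 output tokens, at $3 / 1M input and $15 / 1M output:
print(cached_request_cost(10_000, 500, 300, 3.00, 15.00))  # 0.009
```

With a large shared system prompt, most of the input bill becomes the discounted cached portion, which is why caching pays off for repeated requests.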
Model Routing

The practice of dynamically directing AI requests to different language models based on task complexity, cost requirements, and quality needs to optimize spending.

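In its simplest form, a router inspects the request and picks a model tier. The sketch below uses a crude word-count heuristic and placeholder model names; production routers typically use trained classifiers, task type, or historical quality data instead:

```python
def route_model(prompt: str) -> str:
    """Pick a model tier from a rough complexity heuristic.

    Thresholds and model names are illustrative placeholders.
    """
    approx_tokens = len(prompt.split())  # very rough token estimate
    if approx_tokens < 50:
        return "small-fast-model"    # cheap tier for short, simple requests
    elif approx_tokens < 500:
        return "mid-tier-model"
    return "frontier-model"          # expensive tier for long, complex work

print(route_model("Summarize this sentence."))  # small-fast-model
```

Even a heuristic this crude can cut spend meaningfully if most traffic is simple enough for the cheap tier.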
Token Budgets

Predetermined limits on the number of tokens an AI agent, user, or feature can consume within a given time period, used to prevent cost overruns.

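Enforcement usually means tracking consumption against a cap and rejecting (or queuing) requests that would exceed it. A minimal in-memory sketch; a real system would persist counters and reset them per billing period:

```python
class TokenBudget:
    """Per-period token cap: reject requests once the budget is exhausted."""

    def __init__(self, limit: int):
        self.limit = limit
        self.used = 0

    def try_consume(self, tokens: int) -> bool:
        """Reserve tokens if the budget allows; return False if over budget."""
        if self.used + tokens > self.limit:
            return False  # caller should block, queue, or downgrade the request
        self.used += tokens
        return True

budget = TokenBudget(limit=100_000)
print(budget.try_consume(60_000))  # True
print(budget.try_consume(50_000))  # False - would exceed the 100k cap
```

Budgets can be scoped per agent, per user, or per feature, matching the attribution boundaries described below.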
AI Agent Fleet

A collection of multiple AI agents working together or independently within an organization, requiring coordinated management, monitoring, and cost tracking.

Cost Attribution

The practice of assigning AI infrastructure costs to specific teams, features, customers, or business units to understand unit economics and drive accountability.

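Mechanically, attribution means tagging every request with metadata (team, feature, customer) and aggregating costs by those tags. A minimal sketch over hypothetical request records:

```python
from collections import defaultdict

def attribute_costs(records: list[dict]) -> dict:
    """Sum request costs by (team, feature) tag.

    Each record is assumed to carry 'team', 'feature', and 'cost_usd' keys;
    the field names are illustrative, not a fixed schema.
    """
    totals: dict = defaultdict(float)
    for r in records:
        totals[(r["team"], r["feature"])] += r["cost_usd"]
    return dict(totals)

records = [
    {"team": "search", "feature": "rerank", "cost_usd": 1.20},
    {"team": "search", "feature": "rerank", "cost_usd": 0.80},
    {"team": "support", "feature": "chatbot", "cost_usd": 0.50},
]
print(attribute_costs(records))
# {('search', 'rerank'): 2.0, ('support', 'chatbot'): 0.5}
```

Dividing these totals by usage (requests served, customers active) yields the per-unit economics that make accountability possible.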
AI Spend Management

The comprehensive discipline of planning, tracking, optimizing, and governing AI-related expenditures across an organization.

Inference Costs

The computational expenses incurred when running trained AI models to generate predictions or outputs, including GPU compute, token processing, and API charges.

Ready to Optimize Your AI Costs?

ClawHQ gives you complete visibility into every term on this page — token costs, model routing, agent budgets, and more.

Get Started Free →