GUIDES · 18 MIN READ

Grok API pricing in 2026: models, token math, and how to cut your bill

Live xAI list prices for Grok 4.3 and Grok Build 0.1, how the 2x output multiplier changes real workloads, cached-input savings, and a practical budget template for production apps.

Ravi Kumar

Co-founder, Grokified

Last updated: June 24, 2026. Original publish: June 24, 2026. We review xAI list prices and model slugs quarterly and after major API announcements.

If you are budgeting a production app on the Grok API in 2026, the headline numbers look almost boring: $1.25 per million input tokens and $2.50 per million output tokens on the current Grok 4.3 family (xAI Docs, 2026). That holds until you run the math on a real workload with long system prompts, tool schemas, streaming replies, and the occasional 500k-token context dump. Then the bill stops looking flat.

Developers migrating from OpenAI, Anthropic, or older Grok slugs are hitting the same three surprises: retired model names that still appear in old tutorials, an output multiplier that is much lower than what most frontier APIs charge, and a cached-input tier that can change unit economics overnight if your architecture supports it.

In this guide you will discover:

  • The live Grok API price table for June 2026 (text, voice, and Imagine are separate products)
  • How to estimate monthly spend from tokens, not vibes
  • Where Grok is cheaper than GPT-class models for output-heavy workloads
  • A 48-hour plan to point your existing OpenAI SDK at Grok (or a discounted proxy)
  • Fifteen FAQ answers you can paste into internal docs

This article is based on public xAI documentation, independent pricing trackers, and our experience metering Grok traffic through Grokified for prepaid developer accounts. Where we show bill examples labeled illustrative, they use published list rates unless noted otherwise.

Grok API pricing concepts diagram

What does the Grok API cost in 2026?

Quick answer: As of June 2026, xAI charges $1.25/M input tokens and $2.50/M output tokens on Grok 4.3 and the Grok 4.20 family, with $0.20/M cached input when prompts repeat. Grok Build 0.1 for coding workloads is $1.00/M in and $2.00/M out.

Context: why pricing changed in mid-2026

xAI retired several legacy text slugs on May 15, 2026, including older Grok 3 and Grok 4 identifiers that many blog posts still reference (AI Cost Check, 2026). Requests to retired slugs redirect to current models, but your cost model should not assume historical $3/$15 price points from 2025-era guides.

The current lineup centers on Grok 4.3 as the default general model and Grok Build 0.1 as the coding-focused variant (DevTk.AI, 2026). Voice (Realtime, TTS, STT) and Grok Imagine (image/video generation) are priced separately on the xAI docs pricing page.

Evidence: published token rates (June 2026)

| Model | Input / 1M | Cached input / 1M | Output / 1M | Context | Best fit |
| --- | --- | --- | --- | --- | --- |
| Grok 4.3 | $1.25 | $0.20 | $2.50 | 1M | Chat, agents, multimodal text |
| Grok 4.20 (reasoning / non-reasoning) | $1.25 | $0.20 | $2.50 | 1M | Pinned snapshots, reproducibility |
| Grok 4.20 Multi-Agent | $1.25 | $0.20 | $2.50 | 2M | Long-horizon agent loops |
| Grok Build 0.1 | $1.00 | $0.20 | $2.00 | 256K | Codegen, repo tools, SWE agents |

Sources: xAI Models & Pricing, TokenRate Grok pricing guide, DevTk.AI pricing breakdown.

The standout structural detail is the 2x output multiplier (TokenRate, 2026). Many providers charge 4x to 6x more for output than input. Grok charges 2x. For workloads where the model writes far more than the user sends (report generation, code scaffolding, JSON-heavy tool replies), effective cost per "useful unit of work" can land below nominally similar rivals.

Application: a back-of-napkin monthly estimate

Use this formula:

monthly_cost =
  (input_tokens / 1_000_000) * input_rate
+ (cached_input_tokens / 1_000_000) * 0.20
+ (output_tokens / 1_000_000) * output_rate

Illustrative example (Grok 4.3, no cache): 10M input + 5M output per month → (10 × $1.25) + (5 × $2.50) = $25.00/mo at list (DevTk.AI monthly table, 2026).

Illustrative example (heavy output): 2M input + 8M output → (2 × $1.25) + (8 × $2.50) = $22.50/mo. At the same $1.25 input rate, a 5x output multiplier would bill that traffic at (2 × $1.25) + (8 × $6.25) = $52.50/mo, roughly 2.3 times as much.
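The formula and both examples above can be checked with a few lines of Python. This is a minimal sketch, not an official calculator: the function name `monthly_cost` and the `RATES` dictionary are our own, the rates are copied from the June 2026 table in this guide, and `input_tokens` is assumed to count only fresh (uncached) input.

```python
# USD per 1M tokens, copied from the June 2026 table in this guide.
RATES = {
    "grok-4.3": {"input": 1.25, "cached_input": 0.20, "output": 2.50},
    "grok-build-0.1": {"input": 1.00, "cached_input": 0.20, "output": 2.00},
}


def monthly_cost(model, input_tokens, output_tokens, cached_input_tokens=0):
    """Return list-price cost in USD for one month of traffic.

    Assumption: input_tokens excludes cached tokens, which are
    passed separately and billed at the cached rate.
    """
    rates = RATES[model]
    per_million = 1_000_000
    return (
        input_tokens / per_million * rates["input"]
        + cached_input_tokens / per_million * rates["cached_input"]
        + output_tokens / per_million * rates["output"]
    )


# The two illustrative workloads from the text.
print(monthly_cost("grok-4.3", 10_000_000, 5_000_000))
print(monthly_cost("grok-4.3", 2_000_000, 8_000_000))
```

Swap in your own token counts from usage logs before trusting the result; the shape of your traffic matters more than the rates.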

Common pitfalls

  • Budgeting on consumer SuperGrok plans ($30/mo) instead of API token meters. Consumer subscriptions do not map 1:1 to production API spend.
  • Ignoring cached input when your system prompt, tool definitions, or RAG context repeat every request.
  • Using retired slugs in config and assuming old benchmark posts reflect your invoice.

Success metrics

  • Cost per successful task (not per request)
  • Cache hit rate on input tokens
  • Output tokens per resolved user goal
  • p95 latency at your chosen model tier

Grok 4.3 vs Grok Build 0.1: which model should you pay for?

Quick answer: Choose Grok 4.3 for general agents, long context, and multimodal text. Choose Grok Build 0.1 when the workload is mostly code generation, diffs, and repository-aware tooling inside a 256K window.

Context

Model selection is a pricing decision. Grok Build 0.1 is 20% cheaper on both input and output versus Grok 4.3, but the context window is smaller and the product positioning is coding-first (xAI Docs, 2026).

Evidence

Independent trackers place Grok Build 0.1 at $1/$2 per million tokens versus $1.25/$2.50 for Grok 4.3 (AI//COST, 2026). For SWE-bench class tasks, Grok 4.3 still trails top coding specialists on raw benchmark scores (AI//COST, 2026), but Build 0.1 is priced for volume codegen rather than frontier debugging.

Pick the model first, then optimize tokens. A cheaper model with sloppy prompts can cost more than a flagship model with tight context.

Application

| Workload | Suggested model | Why |
| --- | --- | --- |
| Customer support bot with 200-line system prompt | Grok 4.3 + cache | Stable prefix, moderate output |
| Batch summarization of PDFs | Grok 4.20 Multi-Agent | 2M context when truly needed |
| IDE autocomplete / patch generation | Grok Build 0.1 | Lower rates, coding-tuned |
| Voice agent | Grok Voice API (separate meter) | Not billed like text tokens |

Common pitfalls

  • Running repo-wide context through Build 0.1 when files exceed 256K tokens of effective context.
  • Using Grok 4.3 for tiny classification calls where a smaller model (if xAI exposes one for your use case) would suffice.
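One way to avoid the context-window pitfall is a small routing guard that checks the estimated prompt size before picking a model. The sketch below is illustrative only: `suggest_model` is a hypothetical helper, the labels are the display names from the table above rather than API slugs, and we assume "256K", "1M", and "2M" mean 256,000, 1,000,000, and 2,000,000 tokens.

```python
# Context limits from this guide's tables; exact token counts are an assumption.
CONTEXT_LIMITS = {
    "Grok Build 0.1": 256_000,
    "Grok 4.3": 1_000_000,
    "Grok 4.20 Multi-Agent": 2_000_000,
}


def suggest_model(estimated_tokens, is_code_task):
    """Pick the cheapest listed model whose context window fits the prompt."""
    if is_code_task and estimated_tokens <= CONTEXT_LIMITS["Grok Build 0.1"]:
        return "Grok Build 0.1"
    if estimated_tokens <= CONTEXT_LIMITS["Grok 4.3"]:
        return "Grok 4.3"
    if estimated_tokens <= CONTEXT_LIMITS["Grok 4.20 Multi-Agent"]:
        return "Grok 4.20 Multi-Agent"
    raise ValueError("Prompt exceeds every context window listed; split the input.")


print(suggest_model(180_000, is_code_task=True))
print(suggest_model(300_000, is_code_task=True))
```

Leave yourself margin below each limit in practice, since the completion also has to fit in the window.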

Success metrics

  • Pass rate on golden eval set per dollar
  • Tokens per merged PR (for codegen)
  • Escalation rate to human review

How cached input pricing works on Grok

Quick answer: Repeated input prefixes can bill at $0.20 per million tokens instead of $1.25, an 84% reduction on the cached portion (AI//COST, 2026).

Context

Prompt caching is not a Grokified-specific feature. xAI publishes cached input rates on the same models page as standard input (xAI Docs, 2026). Caching helps when many requests share an identical leading prefix: system instructions, tool JSON schemas, RAG chunks that do not change between turns, or a static compliance block.

Evidence

According to AI//COST (2026), cached input at $0.20/M versus $1.25/M base is the largest realistic discount xAI advertises for text models. Batch and flex tiers common on other providers are not published the same way for Grok as of June 2026.

Illustrative scenario: Suppose 70% of your input tokens are cache-eligible on Grok 4.3. Effective blended input rate ≈ (0.30 × $1.25) + (0.70 × $0.20) = $0.515/M before output charges.
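The blended rate is a weighted average, so it is easy to test other cache fractions. A minimal sketch, assuming the Grok 4.3 rates from this guide as defaults; `blended_input_rate` is our own helper name.

```python
def blended_input_rate(cache_fraction, fresh_rate=1.25, cached_rate=0.20):
    """Return the effective input rate in USD per 1M tokens.

    cache_fraction is the share of input tokens billed at the cached rate.
    """
    if not 0.0 <= cache_fraction <= 1.0:
        raise ValueError("cache_fraction must be between 0 and 1")
    return (1 - cache_fraction) * fresh_rate + cache_fraction * cached_rate


for fraction in (0.0, 0.5, 0.7, 0.9):
    print(f"{fraction:.0%} cached -> ${blended_input_rate(fraction):.3f}/M")
```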

Application

  1. Stabilize system prompts (version them, avoid timestamps in prefix).
  2. Keep tool definitions static within a deployment.
  3. Measure cache-eligible tokens in logs (xAI exposes usage fields on completions; mirror them in your proxy metrics).
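Step 3 can start as a simple aggregation over logged usage records. The sketch below assumes each record is a dict with a `prompt_tokens` count that includes cached tokens and an optional `cached_tokens` count; those key names are hypothetical, so map them to whatever usage fields your xAI responses or proxy metrics actually expose.

```python
def cache_hit_rate(usage_records):
    """Return the share of input tokens that were billed as cached.

    Assumed record shape (hypothetical keys):
      {"prompt_tokens": int, "cached_tokens": int}
    where prompt_tokens includes the cached portion.
    """
    total = sum(r["prompt_tokens"] for r in usage_records)
    cached = sum(r.get("cached_tokens", 0) for r in usage_records)
    return cached / total if total else 0.0


records = [
    {"prompt_tokens": 1000, "cached_tokens": 700},
    {"prompt_tokens": 1000},  # no cache hit recorded
]
print(f"{cache_hit_rate(records):.0%} of input tokens hit the cache")
```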

Common pitfalls

  • Mutating the first token of a prefix on every request (dynamic "today is …" in the system block kills cache hits).
  • Assuming cache hits without telemetry.

Success metrics

  • % input tokens billed at cached rate
  • Cost per conversation after caching changes ship

Grok API vs OpenAI and Claude: a fair cost comparison

Quick answer: Grok 4.3 sits in the mid-tier on input price but can win on output-heavy jobs because output is only 2x input, not 5x to 6x (TokenRate, 2026).

Context

Headline token prices mislead buyers. Compare the shape of your traffic: support bots skew output-heavy; embedding-heavy pipelines skew input-heavy; agent loops add tool JSON to input on every step.

Evidence

TokenRate (2026) notes Grok 4.3 at $1.25/$2.50 sits between faster/cheaper flash models and frontier GPT-class pricing on input, while the low output multiplier changes total cost when completions are long. AI//COST (2026) positions Grok as cheaper per-token than several frontier Anthropic and OpenAI SKUs at list, with tradeoffs in ecosystem surface area (fewer cloud marketplaces resell Grok today).

| Provider (illustrative frontier tier, June 2026) | Input / 1M | Output / 1M | Output multiplier |
| --- | --- | --- | --- |
| Grok 4.3 (xAI) | $1.25 | $2.50 | 2x |
| GPT-5-class (public trackers) | Higher | Much higher | ~5x typical |
| Claude Opus-class (public trackers) | Higher | Much higher | ~5x typical |

Use this table directionally and re-check list prices before signing a finance approval. Third-party trackers aggregate public docs; your enterprise agreement may differ.

Application

Build a traffic-shaped spreadsheet:

  1. Export 7 days of prompt/completion tokens from staging.
  2. Split input into cache-eligible vs fresh.
  3. Model three vendors with the same token counts.
  4. Add integration cost (SDK swap is cheap; compliance review may not be).
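Steps 2 and 3 of that spreadsheet can be prototyped in code before anyone opens a sheet. In the sketch below, the Grok 4.3 row uses the list rates from this guide, while the `vendor-b` rates are placeholders we made up to show the mechanics; replace them with the current list or contract rates you are actually comparing.

```python
# USD per 1M tokens: (fresh input, cached input, output).
VENDOR_RATES = {
    "grok-4.3 (list)": (1.25, 0.20, 2.50),
    # Placeholder numbers, NOT real prices. Substitute your own.
    "vendor-b (placeholder)": (3.00, 0.30, 15.00),
}


def compare_vendors(fresh_input, cached_input, output, vendor_rates=VENDOR_RATES):
    """Price the same token counts under each vendor's rates."""
    costs = {}
    for vendor, (in_rate, cached_rate, out_rate) in vendor_rates.items():
        cost = (
            fresh_input / 1e6 * in_rate
            + cached_input / 1e6 * cached_rate
            + output / 1e6 * out_rate
        )
        costs[vendor] = round(cost, 2)
    return costs


# Example week: 3M fresh input, 7M cache-eligible input, 5M output.
for vendor, cost in compare_vendors(3_000_000, 7_000_000, 5_000_000).items():
    print(f"{vendor}: ${cost:.2f}")
```

Keeping the token counts fixed across vendors is the point: it isolates the rate card from differences in traffic.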

Common pitfalls

  • Comparing Grok to a discounted OpenAI enterprise deal using list prices only.
  • Ignoring reliability and rate-limit behavior (price per successful request matters).

Success metrics

  • Total cost at fixed quality bar on eval set
  • Cost per 1,000 successful API calls

Voice, Imagine, and other Grok APIs (separate meters)

Quick answer: Text token prices in this guide do not apply to Grok Voice or Grok Imagine. Voice Realtime is $0.05/minute; Imagine images start around $0.02–$0.05 per image depending on tier (xAI Docs, 2026).

Context

xAI splits modalities. A team building a voice support bot must budget Realtime minutes, TTS characters, and STT hours separately (AI Cost Check, 2026).

Evidence

Published Voice API rates (xAI Docs, 2026):

| Product | Price |
| --- | --- |
| Realtime voice | $0.05/min ($3.00/hr) |
| Text to speech | $15.00 / 1M characters |
| Speech to text (REST) | $0.10/hr |
| Speech to text (streaming) | $0.20/hr |

Imagine pricing (xAI Docs, 2026) includes per-image and per-second video generation tiers.

Application

Maintain separate cost centers in FinOps dashboards for text vs voice vs media. Do not allocate voice minutes into LLM token budgets.
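A separate voice line item can be estimated from the published rates above. This is a rough sketch with a helper name of our own (`voice_cost`); it covers only the four Voice products in the table and leaves Imagine out because this guide does not list exact per-tier rates.

```python
# Voice API rates from the table above (USD).
VOICE_RATES = {
    "realtime_per_min": 0.05,
    "tts_per_million_chars": 15.00,
    "stt_rest_per_hour": 0.10,
    "stt_streaming_per_hour": 0.20,
}


def voice_cost(realtime_minutes=0, tts_chars=0, stt_rest_hours=0, stt_streaming_hours=0):
    """Return a per-product breakdown plus a total, all in USD."""
    items = {
        "realtime": realtime_minutes * VOICE_RATES["realtime_per_min"],
        "tts": tts_chars / 1_000_000 * VOICE_RATES["tts_per_million_chars"],
        "stt_rest": stt_rest_hours * VOICE_RATES["stt_rest_per_hour"],
        "stt_streaming": stt_streaming_hours * VOICE_RATES["stt_streaming_per_hour"],
    }
    items["total"] = round(sum(items.values()), 2)
    return items


print(voice_cost(realtime_minutes=1000, tts_chars=2_000_000, stt_rest_hours=10))
```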

Common pitfalls

  • Prototyping with Realtime voice, then sending finance an estimate that covers only text tokens.

Success metrics

  • Cost per minute of successful voice session
  • Asset generation cost per user storyboard

Paying less than xAI list for the same Grok models

Quick answer: Negotiate committed use with xAI if you are at enterprise scale, maximize cached input, right-size models, and compare OpenAI-compatible proxies that pass through Grok with volume economics (like prepaid credit with introductory discounts).

Context

Most startups will not get a private rate card on day one. Practical savings come from architecture and routing.

Evidence

Grokified publishes a 50% discount off xAI list on early prepaid usage up to $5,000 saved, then 3% off for life (Grokified pricing page, 2026). We wrote separately about why that subsidy is bounded so it stays sustainable.

Illustrative case (labeled): A team spending $800/mo at xAI list on Grok 4.3 text might pay $400/mo through an introductory 50% proxy rate until the savings cap, then ~$776/mo at 3% off list afterward, excluding cache optimizations.
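To see when the introductory rate runs out for a given spend level, you can simulate the bills month by month. The sketch below rests on an assumption about the mechanics that the pricing page may define differently: we treat the 50% discount as applying to list spend until cumulative savings reach $5,000, with any remaining spend in that month billed at 3% off. `discounted_bills` is a hypothetical helper, not Grokified's billing logic.

```python
def discounted_bills(monthly_list_spend, months, intro_discount=0.50,
                     savings_cap=5000.0, lifetime_discount=0.03):
    """Return the bill for each month under a capped intro discount.

    Assumption: within the month the cap is reached, spend beyond the
    cap falls through to the lifetime discount.
    """
    if not 0 < intro_discount <= 1:
        raise ValueError("intro_discount must be in (0, 1]")
    bills = []
    cap_left = savings_cap
    for _ in range(months):
        # List spend that can still receive the intro discount.
        intro_spend = min(monthly_list_spend, cap_left / intro_discount)
        rest_spend = monthly_list_spend - intro_spend
        cap_left -= intro_spend * intro_discount
        bill = intro_spend * (1 - intro_discount) + rest_spend * (1 - lifetime_discount)
        bills.append(round(bill, 2))
    return bills


bills = discounted_bills(800.0, months=14)
for month, bill in enumerate(bills, start=1):
    print(f"month {month}: ${bill:.2f}")
```

Under these assumptions, an $800/mo team would stay at the introductory rate for about a year before settling at the 3% rate.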

Application

  1. Measure list-equivalent spend for 14 days.
  2. Test cache-friendly prompt layout.
  3. Run parallel traffic on Grok Build 0.1 for codegen paths.
  4. If using a proxy, confirm model slug parity, streaming, and tool-call compatibility (not all "OpenAI-compatible" gateways are equal).

Common pitfalls

  • Chasing discount percentages while ignoring latency or error-rate regressions.
  • Prepaying more credit than your 90-day forecast supports.

Success metrics

  • Effective $/M tokens after discounts and cache
  • Error rate vs direct xAI endpoint

Grok API pricing statistics visualization

How to point your OpenAI SDK at Grok in 48 hours

Follow these steps to migrate a standard OpenAI SDK integration to Grok without rewriting your application layer.

  1. Create an xAI or proxy API key. Sign up at the xAI console or a compatible proxy such as Grokified. Store the key in your secrets manager, not in source control.

  2. Change base URL and model slug. Point the OpenAI client at https://api.x.ai/v1 for direct xAI access, or your proxy base URL (Grokified uses https://api.grokified.com/v1). Set the model to grok-4.3 or grok-build-0.1.

  3. Run a golden-request diff test. Replay 50 production prompts in staging. Compare latency, tool-call JSON, and token usage fields. Retired slugs should be updated in config.

  4. Ship a canary route. Send 5% of traffic to Grok for 24 hours. Watch cost per success, not just token totals.

  5. Enable observability. Log prompt_tokens, completion_tokens, and cached input if exposed. Build a weekly cost email for finance.

Example (Python, direct xAI):

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],
    base_url="https://api.x.ai/v1",
)

completion = client.chat.completions.create(
    model="grok-4.3",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize Grok API pricing in 3 bullets."},
    ],
)
print(completion.usage)

Example (Grokified proxy, same SDK):

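from openai import OpenAI
import os

# Same SDK as the direct example; only the key and base URL change.
client = OpenAI(
    api_key=os.environ["GROKIFIED_API_KEY"],
    base_url="https://api.grokified.com/v1",
)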
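For the canary in step 4, one common approach is to hash a stable request key so the same user always lands on the same side of the split. The sketch below is one possible implementation, not something xAI or Grokified prescribes; `route_to_canary` and the choice of a user ID as the key are our own assumptions.

```python
import hashlib


def route_to_canary(request_key: str, canary_percent: int = 5) -> bool:
    """Return True if this key falls in the canary bucket.

    Hashing keeps routing deterministic: a given key always gets the
    same answer, so one user does not bounce between providers.
    """
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # 0..99
    return bucket < canary_percent


sample = [f"user-{i}" for i in range(10_000)]
share = sum(route_to_canary(key) for key in sample) / len(sample)
print(f"canary share over sample: {share:.1%}")
```

Log which side each request took alongside its token usage, so cost per success can be compared between the two routes.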

Monthly Grok API budget template (copy for finance)

| Daily tokens (input / output) | Grok 4.3 @ list | Grok Build 0.1 @ list |
| --- | --- | --- |
| 100K / 50K | ~$7.50/mo | ~$6.00/mo |
| 1M / 500K | ~$75/mo | ~$60/mo |
| 10M / 5M | ~$750/mo | ~$600/mo |

Source pattern: DevTk.AI (2026) monthly examples, rounded. Add 20–40% headroom for spikes during launches.
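The template rows are consistent with a 30-day month at list rates, so they can be regenerated for any daily volume with headroom applied. A small sketch under that assumption; `monthly_budget` is our own helper name.

```python
def monthly_budget(daily_input, daily_output, input_rate, output_rate,
                   days=30, headroom=0.0):
    """Return a monthly budget in USD from daily token counts.

    Rates are USD per 1M tokens; headroom=0.3 adds 30% for spikes.
    """
    base = days * (daily_input / 1e6 * input_rate + daily_output / 1e6 * output_rate)
    return round(base * (1 + headroom), 2)


# First table row, Grok 4.3 at list, with and without 30% headroom.
print(monthly_budget(100_000, 50_000, 1.25, 2.50))
print(monthly_budget(100_000, 50_000, 1.25, 2.50, headroom=0.3))
```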

ROI timeline (illustrative)

| Phase | Week | Activity | Expected outcome |
| --- | --- | --- | --- |
| Discovery | 1 | Token audit | Baseline $/task |
| Migration | 2 | SDK swap + canary | <1% error delta |
| Optimization | 3–4 | Cache + model tiering | 15–40% input savings |
| Steady state | 5+ | Weekly review | Predictable invoice |

Case studies (illustrative scenarios)

Scenario A: B2B support bot

Problem: 40 agents, average 3,200 input tokens (large FAQ prefix) and 450 output tokens per ticket.
Change: Moved static FAQ to cache-eligible system prefix; switched from a 5x output multiplier model to Grok 4.3.
Outcome (illustrative): 28% lower weekly inference spend with comparable CSAT on a 500-ticket eval set.
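Lesson: When replies stay short, input tokens dominate the bill, so caching the static prefix does most of the work and the lower output multiplier is a secondary gain.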

Scenario B: Code review bot

Problem: Single-repo reviews sending 180K tokens of context per run on a general model.
Change: Grok Build 0.1 for diff summaries; Grok 4.3 only for security-sensitive files.
Outcome (illustrative): 35% token cost reduction; median latency improved due to smaller default completions.
Lesson: Match model tier to task risk, not one model for everything.

Editorial methodology and disclosures

We verify list prices against xAI documentation and at least two independent trackers (TokenRate, DevTk.AI, AI Cost Check) before each update. Illustrative scenarios use published rates and stated assumptions; they are not guarantees of your results.

Conflict disclosure: Grokified resells Grok API capacity with introductory discounts. We still show xAI list math so you can compare fairly. DEEPORAX AI LTD operates Grokified.

Review cadence: We review this guide quarterly and revisit it after xAI model launches, slug retirements, or material price changes.

Enterprise procurement checklist

Before you sign a vendor packet or file a PO, confirm these items with finance and security. They rarely change the per-token math, but they prevent surprise spend.

Commercial

  • Confirm billing currency (USD) and whether tax is charged on prepaid credit vs postpaid invoices
  • Ask if committed spend tiers exist for your volume (xAI enterprise sales may differ from public list)
  • Define who owns API key rotation and offboarding

Technical

  • Pin production model slugs and document the May 2026 retirement policy for legacy names
  • Require staging load tests that include streaming and tool calls, not just single-shot chat
  • Log token usage fields on every completion for showback to product teams

Risk

  • Document data handling for prompts sent to xAI or through a proxy
  • Set per-key spend alerts and hard caps where your provider supports them
  • Keep a rollback path to your previous model for 72 hours after cutover
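Where your provider does not offer per-key caps, a thin guard in your own gateway can approximate them. The class below is a minimal in-memory sketch with names of our own choosing (`SpendGuard`, `record`, `allow`); a production version would need persistent storage and a reset at each billing period.

```python
class SpendGuard:
    """Track spend per API key and flag alert or hard-cap breaches."""

    def __init__(self, alert_usd: float, hard_cap_usd: float):
        if alert_usd > hard_cap_usd:
            raise ValueError("alert threshold must not exceed the hard cap")
        self.alert_usd = alert_usd
        self.hard_cap_usd = hard_cap_usd
        self.spent = {}  # key id -> cumulative USD this billing period

    def allow(self, key_id: str) -> bool:
        """Return False once a key has reached its hard cap."""
        return self.spent.get(key_id, 0.0) < self.hard_cap_usd

    def record(self, key_id: str, cost_usd: float) -> str:
        """Add a request's cost and return 'ok', 'alert', or 'capped'."""
        total = self.spent.get(key_id, 0.0) + cost_usd
        self.spent[key_id] = total
        if total >= self.hard_cap_usd:
            return "capped"
        if total >= self.alert_usd:
            return "alert"
        return "ok"


guard = SpendGuard(alert_usd=80.0, hard_cap_usd=100.0)
for cost in (50.0, 35.0, 20.0):
    print(guard.record("team-a", cost))
print(guard.allow("team-a"))
```

Call `allow` before forwarding a request and `record` after the response, using the cost computed from the logged usage fields.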

According to DevTk.AI (2026), teams that skip token logging during migration routinely mis-forecast month-two spend because launch traffic differs from the audit week. Treat the first month as calibration, not the budget baseline.

Self-assessment: where are you on the Grok cost curve?

Answer honestly:

  1. Explorer: Prototyping in the xAI console, no token logs yet → Action: export usage, build the spreadsheet from this guide.
  2. Migrator: SDK wired, pre-production traffic → Action: run the 48-hour plan and golden diff tests.
  3. Optimizer: Production traffic, stable quality → Action: maximize cache hits and right-size Build 0.1 vs 4.3.
  4. FinOps-aware: Weekly cost reviews, alerts configured → Action: compare list vs proxy effective rates quarterly.

Most teams stall between Migrator and Optimizer because caching and model tiering are cross-team work. Schedule a single "prefix audit" sprint before asking for more budget.

References

Official

Industry trackers & analysis

Grokified

FAQ

Q: How much does the Grok API cost per token in 2026?

Quick answer: Grok 4.3 is $1.25 per million input tokens and $2.50 per million output tokens as of June 2026.

Detailed explanation: Cached input on eligible prefixes bills at $0.20/M. Grok Build 0.1 is $1.00/M in and $2.00/M out. Voice and Imagine use separate meters. See the tables in the "What does the Grok API cost" section and xAI Docs for authoritative numbers.

Q: Is Grok API cheaper than OpenAI?

Quick answer: It depends on your traffic shape; Grok is often competitive on output-heavy workloads because output is only 2x input.

Detailed explanation: Compare your measured input/output ratio and cache eligibility against your current vendor's list or contract rates. TokenRate (2026) explains the 2x multiplier advantage. Enterprise discounts may change the winner.

Q: What happened to Grok 3 and Grok 4 Fast pricing?

Quick answer: xAI retired several legacy slugs on May 15, 2026; they redirect to current models priced like Grok 4.3.

Detailed explanation: AI Cost Check (2026) documents the retirement. Update configs and cost models that still reference old identifiers.

Q: What is the cheapest Grok model for coding?

Quick answer: Grok Build 0.1 at $1/$2 per million tokens with a 256K context window.

Detailed explanation: Use Grok 4.3 when you need 1M context or general reasoning on non-code tasks. See the model selection table above.

Q: Does Grok charge for prompt caching?

Quick answer: Cached input is billed at $0.20/M, not the full input rate, when prefixes qualify.

Detailed explanation: Cache savings depend on how you structure requests, not on a billing option you pick at checkout. Stable system prompts and tool schemas improve hit rate. Measure cached tokens in usage logs.

Q: How do I estimate my monthly Grok API bill?

Quick answer: Sum input, cached input, and output using the formula in this guide.

Detailed explanation: Export a week of staging traffic, apply rates from the June 2026 table, multiply the weekly total by about 4.3 (the average number of weeks in a month, not the model version) for a monthly forecast, then add headroom.

Q: Can I use the OpenAI Python SDK with Grok?

Quick answer: Yes. xAI exposes an OpenAI-compatible HTTPS API at https://api.x.ai/v1.

Detailed explanation: Change base_url and model. Proxies like Grokified use the same pattern with a different base URL and API key.

Q: What is Grok 4.20 Multi-Agent pricing?

Quick answer: Same $1.25/$2.50 text rates with a 2M context window on the Multi-Agent variant listed in xAI Docs (2026).

Detailed explanation: Choose it when agent orchestration genuinely needs longer context, not by default.

Q: Are Grok consumer plans the same as API pricing?

Quick answer: No. SuperGrok subscriptions are consumer products with different limits than token-metered API usage.

Detailed explanation: Budget production separately from personal chat subscriptions.

Q: How fast do Grok API prices change?

Quick answer: Major changes correlate with model launches or slug retirements; xAI announced May 2026 retirements explicitly.

Detailed explanation: Pin model versions for reproducibility and subscribe to xAI developer announcements.

Q: What rate limits apply to Grok API?

Quick answer: xAI publishes tier documentation separately from the models pricing page; limits vary by account tier.

Detailed explanation: Test canary traffic before full cutover. Rate-limit errors affect cost per successful request.

Q: How does Grokified pricing relate to xAI list?

Quick answer: Grokified bills prepaid credit against Grok usage with published introductory discounts off list.

Detailed explanation: See Grokified pricing and the economics post for cap and lifetime rate details. You still consume Grok models; billing flows through Grokified.

Q: Should I use direct xAI or a proxy?

Quick answer: Direct xAI is simplest for tests with minimal compliance requirements; proxies can add volume discounts, unified billing, or extra routing.

Detailed explanation: Validate compatibility on streaming, tools, and error shapes before migrating production.

Q: How do I debug a sudden Grok bill spike?

Quick answer: Sort requests by output tokens and check for runaway agent loops or missing max_tokens caps.

Detailed explanation: Compare p95 completion length week-over-week. Spikes often trace to prompt changes, not rate changes.
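A first pass at that triage can run over whatever request logs you already keep. The sketch below assumes each log entry is a dict with `completion_tokens` and the `max_tokens` value that was sent (or `None` if no cap was set); the field names and the helpers `top_output_requests` and `uncapped_requests` are hypothetical.

```python
def top_output_requests(request_logs, n=5):
    """Return the n requests with the largest completion token counts."""
    return sorted(request_logs, key=lambda r: r["completion_tokens"], reverse=True)[:n]


def uncapped_requests(request_logs):
    """Return requests that were sent without a max_tokens limit."""
    return [r for r in request_logs if r.get("max_tokens") is None]


logs = [
    {"id": "a", "completion_tokens": 420, "max_tokens": 1024},
    {"id": "b", "completion_tokens": 38_000, "max_tokens": None},
    {"id": "c", "completion_tokens": 610, "max_tokens": 1024},
]
for entry in top_output_requests(logs, n=2):
    print(entry["id"], entry["completion_tokens"])
print([entry["id"] for entry in uncapped_requests(logs)])
```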

Q: Where can I get a Grok API key?

Quick answer: Create one in the xAI console or sign up for a Grokified API key if you want prepaid credit with introductory pricing.

Detailed explanation: Store keys in a secrets manager and rotate on engineer offboarding.


Ravi Kumar

Co-founder, Grokified

Previously built billing infrastructure for two developer platforms. Writes about the unglamorous parts of running an API business.
