What Running AI Agents Actually Costs in 2026
Anthropic charges $0.08 per session-hour. OpenAI gives the runtime away. Compare the four AI agent pricing models and find the one that fits your scale.
Anthropic shipped Managed Agents on April 8 with a pricing twist no AI buyer had budgeted for. $0.08 per session-hour, on top of standard token costs. A brand-new line item on the invoice. The meter runs to the millisecond while the agent is actively working and pauses for free while it waits, but it’s still a runtime column most procurement teams haven’t budgeted for.
OpenAI’s Agents SDK charges nothing for the runtime. Zero platform fee. You pay tokens. You pay your own sandbox compute on whichever provider you pick. That’s it.
DeepSeek V4-Pro outputs at $3.48 per million tokens versus GPT-5.5 at $30. Same workflow, 8.6x cheaper on the model layer alone. Most teams are picking platforms on benchmark scores. The runtime economics are doing the deciding.
If you run AI budget at your company, this is the year platform selection becomes a finance decision more than a technical one. The four major vendors picked four different monetization models in the first half of 2026, and the spread between the most expensive and cheapest path for the same agent workflow now sits around 20x.
Quick Verdict
| Platform | Runtime Fee | Token Cost | The Hidden Number |
|---|---|---|---|
| Anthropic Managed Agents | $0.08 / session-hour (idle is free, billed to ms) | Claude list rates | New runtime line item — typically 10–15% of session invoice |
| OpenAI Agents SDK | $0 (open-source) | $5 in / $30 out per M (GPT-5.5) | Your sandbox compute, your bill |
| OpenAI Agents SDK + Batch | $0 (open-source) | ~50% of list (GPT-5.4 batch) | 24-hour delivery window |
| Google Gemini on Vertex AI | Bundled in GCP commit | Gemini Flash/Pro rates | Routed against committed-use discount |
| Microsoft Copilot Studio | Per-message licensing (Copilot Credits) | Bundled in Copilot | Locked to Azure, locked to GPT |
| DeepSeek V4-Pro + open runtime | $0 | $0.40 in / $3.48 out per M | You build the governance rails |
Same agent workflow. Five-tool chain. Roughly an hour of continuous run. Cheapest path lands around $0.05 per agent-hour. Most expensive lands closer to $1. That gap is where the ROI conversation actually happens.
Anthropic Just Put a Meter on the Runtime
The session-hour billing is the part most people skipped past. Standard token pricing is what enterprise procurement teams already know how to budget. A new per-hour meter on the runtime itself is something else.
A “session” in Anthropic’s pricing model is a continuous agent run between context resets. Open the session, and the $0.08/hour clock starts. Reset the context (because the agent hit token limits, or a tool returned a payload too large to keep in working memory, or the workflow needed to start a sub-task with a clean state), and a new session opens. New meter.
Here’s the operational nuance worth pricing in correctly. Anthropic bills runtime to the millisecond, and idle time is free. Time spent waiting for user input, queued waiting on a tool confirmation, or terminated does not accrue. Only sessions in “running” status count. So the meter is fairer than a flat hourly clock.
That makes the headline number small per workflow. In Anthropic’s own worked example, a one-hour Claude Opus session burning 50K input and 15K output tokens costs about $0.71 total, with the $0.08 runtime fee landing around 11% of the bill. Tokens dominate. The structural change isn’t the per-hour price. It’s that procurement teams now have a runtime column to forecast separately, and runtime cost scales with concurrent sessions in a way token cost doesn’t.
I’m not arguing the session-hour model is wrong. The runtime work Anthropic is doing (state management, tool sandboxing, identity propagation, retry logic) is real infrastructure. Someone has to pay for it. The argument I am making: most procurement teams budgeted Claude as a token cost. The session-hour line is a category they don’t have a column for yet.
The OpenAI Counter-Move That Flew Under the Radar
OpenAI shipped the Agents SDK as open-source. No platform fee. You install it, you wire it to your model provider of choice, you provide your own sandbox compute. The runtime layer is free.
That sounds like a flat win for buyers. It isn’t, quite. Here’s the trade.
OpenAI just doubled list pricing on GPT-5.5 to $5 in / $30 out per million tokens. I broke down why that move worked in the OpenAI doubles the price piece. The short version: OpenAI is capturing margin on the model layer because their hardware costs collapsed and the price didn’t move down with them. The free runtime makes more sense once you read it as part of that strategy. They don’t need to meter the runtime. They’re already capturing the value on the tokens.
So the OpenAI shape is: free runtime, expensive tokens, and you bring your own sandbox compute (which on AWS or GCP runs $0.05-0.20/hour for a small agent VM, depending on what tools the agent calls). For a five-tool workflow on GPT-5.5, you’re looking at $0.45-0.90 per agent-hour, mostly tokens.
If you swap to GPT-5.4 through OpenAI’s Batch API for latency-tolerant work, the same hour drops to $0.10-0.18. Same OpenAI runtime. Same tools. Half-price tokens. The platform layer didn’t change. Only the model tier and the latency profile changed.
That flexibility is what you’re actually paying for when you pick OpenAI’s stack right now: the ability to swap tiers inside the same SDK without rebuilding the agent.
Why DeepSeek Changes the Math Anyway
DeepSeek’s V4-Pro release moved the floor on token economics for any workload that doesn’t strictly require frontier-class capability. Output tokens at $3.48 per million versus GPT-5.5 at $30. That’s an 8.6x gap on the model layer alone. Input tokens widen the gap further, depending on context size.
Pair DeepSeek with an open agent runtime (LangGraph, AutoGen, or the open-source pieces of OpenAI’s own SDK pointed at a non-OpenAI provider) and you have a stack with zero runtime fees and the cheapest tokens in production. The total cost for the same five-tool agent-hour lands around $0.04-0.09. Roughly 1/5th the OpenAI Agents SDK + GPT-5.5 path. Cheaper still than the Anthropic Managed Agents path once Claude token rates are in the mix.
The honest part: DeepSeek isn’t the right answer for every workflow. Capability gaps still exist for advanced reasoning chains, long-context coding tasks, and certain agentic patterns where tool-calling reliability under load matters. I covered the trade-offs in the DeepSeek V4 piece. For high-volume, low-stakes work (document classification, data enrichment, first-pass summarization, internal search), it wins on cost without losing on outcomes that matter.
How Much Does It Actually Cost to Run an AI Agent for One Hour in 2026?
A typical five-tool agent workflow running for one continuous hour costs:
- DeepSeek V4-Pro + open runtime: $0.04-0.09/hour (cheapest tokens, zero platform fee)
- OpenAI Agents SDK with GPT-5.4 via Batch: $0.10-0.18/hour (half-price tokens, no runtime fee, 24-hour delivery)
- Microsoft Copilot Studio: $0.10-0.30/hour (Copilot Credits per response, depends on message volume)
- Anthropic Managed Agents: $0.50-0.90/hour (Claude tokens dominate; $0.08 runtime adds ~10-15%)
- OpenAI Agents SDK with GPT-5.5: $0.45-0.90/hour (tokens only, no runtime fee, premium model rates)
Numbers vary by tool count, context size, and task complexity. The takeaway is the spread, not the precision: 4-20x between cheapest and most expensive for the same job.
The Anthropic Enterprise Unbundle Nobody Read Past the Headline
The session-hour pricing wasn’t the only structural change Anthropic made this quarter. The company also unbundled Claude, Claude Code, and Cowork from flat-fee enterprise pricing for its largest customers. Those three products now bill on per-token usage rather than capped seats.
Read that the way procurement reads it. Flat-fee enterprise contracts are how the heaviest customers get the best per-unit economics. Anthropic just took that lever away from its biggest accounts. The largest enterprise users of Claude Code (the ones running it across hundreds of engineers, generating tens of millions of tokens daily) just got moved to a metered model. The customers who were saving the most on flat-fee pricing are the ones whose invoices changed the most.
This is the same pattern as managed agents. Meter the runtime. Meter the tokens. Capture margin from heavy users instead of subsidizing them.
I wrote about the broader vendor incentive shape in the AI stack expiration date piece. The pattern keeps repeating. Every vendor with a public IPO path or a big balance sheet defense to make is moving the same direction: meter more, bundle less, price by usage. If your AI procurement strategy assumed flat-fee predictability, you’ve got six to twelve months to rebuild it.
What Four Monetization Models Tell You
Stack the four major platforms side by side and the strategy each vendor is running becomes obvious.
Anthropic: meter the runtime, meter the tokens. Layered margin capture. The session-hour fee is small per-unit but compounds with reset frequency. The token rates are premium. The unbundle moved the largest accounts onto metered pricing. This is the most aggressive monetization stance of the four, and it’s the model with the strongest enterprise traction, which I covered in the Anthropic out-earns OpenAI piece. The vendor with the best demand can charge for the most line items.
OpenAI: free runtime, premium tokens. All margin capture sits on the model layer. The Agents SDK is the loss leader that makes GPT-5.5 token spend feel like the only line item. It’s a good story for buyers if (and only if) they actually use the SDK’s portability to route work to cheaper tiers when they can.
Google: bundle it into your cloud commit. Gemini agents on Vertex AI fold into GCP committed-use discounts. The pitch is: you’re already paying us for cloud, the agents are part of that envelope. This is the easiest sell for Google-heavy shops and a non-starter for everyone else. Same pattern I covered in the Google $40B Anthropic piece but applied to Google’s first-party agent stack.
Microsoft: per-message licensing, locked to Azure. Copilot Studio agents bill by message volume. The economics work if your usage is predictable and your team is already deep on Microsoft licensing. They get ugly fast if usage scales unpredictably or if you wanted to route any of the work outside Azure.
Four vendors. Four monetization shapes. Same fundamental product (an agent that takes actions on your behalf). The one buyers should care about: the shape determines whether your bill scales linearly with usage or non-linearly. Anthropic’s session-hour layer is the one that scales most non-linearly with workflow complexity. OpenAI’s pure-token model is the most predictable. Microsoft’s is the most opaque.
Your Move This Week
Three concrete actions. All doable by Friday.
- Audit your current agent runtime spending by line item. Pull last 30 days of invoices from every AI vendor. Tag each line as runtime fee, token cost, or compute. If you’re on Anthropic Managed Agents and haven’t separated session-hour from token costs in your reporting, you’re flying blind on which line is growing fastest. The session-hour line is the one to watch month over month.
- Compute cost per successful agent task on your current stack. Take 100 real production runs. Total invoice cost across all line items. Divide by successful runs. This is your real unit economic. Most teams are tracking cost per token, which is the wrong denominator. The right one is cost per successful business outcome. I argued for this framing in the AI ROI measurement piece.
- Run the same workflow on a competing platform for one week. Pick the highest-volume agent workflow you have today. Prototype the same task on a second platform. Don’t migrate. Just measure. If the cost-per-successful-task gap is more than 2x, you have an open conversation with your current vendor’s account manager. If the gap is more than 5x, you have a migration project to justify in next quarter’s budget.
The headlines this month read “Anthropic launches Managed Agents” and “OpenAI Agents SDK goes open-source.” The story underneath is that the four major AI vendors just declared four different views on what an agent runtime should cost, and the customers who treat platform selection as a model-quality decision are the ones who’ll be surprised by their Q3 invoice.
The customers who treat it as a finance decision (with the runtime model, the token model, and the bundle structure all on the same spreadsheet) are the ones running agents at unit economics that make the ROI math work.
Do the audit this week. The pricing models won’t simplify from here. They’ll keep splitting.
Related Reading:
TAGS
Ready to Take Action?
Whether you're building AI skills or deploying AI systems, let's start your transformation today.
Related Articles
Anthropic Goes Public. Lock In Your Contracts Now.
Anthropic filed a confidential S-1 targeting a $965B October IPO. See the pre-IPO contract moves enterprise buyers must make before Wall Street takes over.
Your AI Coding Budget Is About to Break
Microsoft canceled Claude Code. Uber burned its 2026 AI budget in 4 months. Compare the flat-rate vs usage-based math every engineering org now faces.
Your Software Vendors Are Running AI on Your Data
DataGrail's June 1 report: 63.6% of SaaS vendors run AI subprocessors without telling you. See the contract clauses to add before your next renewal.