Why Anthropic’s Agent SDK Is the Real Deal

There’s been a lot of hype and confusion around agents lately, and I was always curious why, since agents have been around for a long time. After digging in, the answer is pretty clear: Anthropic released the Agent SDK, and it’s genuinely different from what came before.

Even though frameworks like Vercel’s AI SDK exist, the Agent SDK stands apart because it gives you the same infrastructure, tooling, and architecture that power Claude Code. You can build a Claude Code-caliber agent for any use case, not just coding. It’s essentially Claude Code with an API, which is game-changing given that Claude Code is one of the best agents on the market.

That means you can now spin up a Claude Code-level agent in under an hour. That same work would previously have taken weeks (if not months) and a full engineering team to get right.

Why It’s Blowing Up Now

The Agent SDK has actually been around for a bit. But adoption really picked up recently, largely because OpenClaw blew up and showed people what’s possible. That created a wave of awareness.

There’s also an unfair advantage: if you’re paying for a Claude Code subscription, you can use your subscription’s usage allowance for Agent SDK development. You don’t have to pay for separate API tokens during testing and development. This is huge because Anthropic is heavily subsidizing those subscriptions right now.

The big caveat is that if you plan to release your agent for other people to use, you must switch to API-based pricing. The subscription loophole is only for your own testing and development.

What We Built & Learned

We built a HIPAA compliance auditor agent using the Agent SDK, and it’s roughly 10x more powerful than what we previously cobbled together running our own agent loop through Vercel’s AI SDK. The difference in capability out of the box is striking.

But it’s expensive. Generating a single compliance report can cost up to $50 in API usage. In context, though, it’s totally worth it. We’re saving thousands of dollars in engineering time that would’ve gone into producing that report manually.
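To make that trade-off concrete, here is a back-of-the-envelope sketch. Only the $50 API figure comes from our experience; the manual hours and hourly rate are hypothetical placeholders, so plug in your own numbers.

```python
# Rough per-report comparison: agent API cost vs. manual engineering time.
# Only API_COST_PER_REPORT comes from the post; the manual figures below
# are hypothetical placeholders, not measurements.

API_COST_PER_REPORT = 50.0   # upper bound we observed, in USD
MANUAL_HOURS = 20.0          # hypothetical: hours to produce a report by hand
HOURLY_RATE = 100.0          # hypothetical: loaded engineering cost, USD/hour


def savings_per_report(api_cost: float, manual_hours: float, hourly_rate: float) -> float:
    """Return the USD saved per report when the agent replaces manual work."""
    manual_cost = manual_hours * hourly_rate
    return manual_cost - api_cost


if __name__ == "__main__":
    saved = savings_per_report(API_COST_PER_REPORT, MANUAL_HOURS, HOURLY_RATE)
    print(f"Estimated savings per report: ${saved:,.2f}")
```

With these placeholder inputs, the manual report costs $2,000 and the agent saves $1,950 per report. The conclusion holds as long as the manual cost stays well above the API bill.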

The Cost Problem & Where This Is Headed

Cost is the biggest downside of the Agent SDK for production use. API pricing makes it nearly impossible to build consumer applications. If you wanted to build a personal agent for consumers, you’d likely need to charge $100–200/month to make it sustainable.

When we built personal agents for ourselves, we were burning through $100–300/month in API usage. Thankfully that was covered by our Anthropic subscriptions, but if we were actually paying, it’d be hard to justify for personal use.
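The pricing pressure is easy to sketch. The snippet below treats our own $100–300/month usage as the per-user API cost, which is an assumption (real consumers may use more or less), and the target margin is a hypothetical input rather than a figure from our books.

```python
# What a consumer agent would need to charge per month, given per-user API cost.
# The usage range comes from our own experience; the margin is hypothetical.


def required_price(monthly_api_cost: float, target_margin: float) -> float:
    """Monthly price at which API cost leaves `target_margin` as gross margin.

    target_margin is a fraction in [0, 1); 0.0 means pure break-even.
    """
    if not 0 <= target_margin < 1:
        raise ValueError("target_margin must be in [0, 1)")
    return monthly_api_cost / (1 - target_margin)


if __name__ == "__main__":
    for usage in (100.0, 300.0):            # low and high end of what we burned
        for margin in (0.0, 0.5):           # break-even, then a hypothetical 50% margin
            price = required_price(usage, margin)
            print(f"usage ${usage:.0f}/mo, margin {margin:.0%}: charge ${price:.2f}/mo")
```

Even at break-even, the price has to match the API bill, and any real margin pushes it higher. That is why consumer pricing gets uncomfortable so quickly.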

Consumer Agents vs. B2B (Vertical Agents)

Here’s where we think this lands:

The sweet spot is B2B applications like our HIPAA compliance auditor, where $50 per report is a no-brainer compared to the alternative.
The future is hyper-specific, custom agents for businesses rather than generic one-size-fits-all agents. Companies will build their own boutique solutions that perform better at their specific tasks.

Model Updates: Codex 5.3 vs. Opus 4.6

We got hands-on with both Codex 5.3 and Opus 4.6 this week. Here’s the consensus from the engineering team:

Codex 5.3 (Extra High) is the clear winner for most tasks. It’s expensive, but it really shines at thinking through edge cases and complex scenarios across different codebases. It outperformed Opus 4.6 by roughly 20% across the various tasks we tested.
Opus 4.6 is still an extremely good model, just not quite at the same level for our use cases.
Opus 4.6 Fast also got a full test run. The extra speed and the 1 million token context window were genuinely helpful, but they come at an extreme cost. Running it through Cursor, we racked up over $100 in API usage after just 2–3 messages. We’re struggling to figure out the right use case at that price point. It’s great that it exists as an option, but it isn’t practical for regular use.

Overall, Codex 5.3 Extra High remains our go-to, especially for complex work with lots of edge cases to reason through.