Claude Code Tool Selection Research

Have you ever wondered: when you ask AI to help write code, how does it choose which tools to use?

You say “add a login feature” — what authentication solution does it pick? You say “deploy to the cloud” — which cloud platform? You say “add state management” — Redux or Zustand?

Honestly, I used to think AI tool selection was basically random guessing, or based on whatever appeared most frequently in its training data. Then I saw this experiment from Amplifying.ai and realized it’s way more interesting than I imagined.

Here’s what they did: they took three Claude models (Sonnet 4.5, Opus 4.5, Opus 4.6), set them loose on four different project repositories, ran each model three times, and collected 2,430 responses in total. The key: they never told the models which tools to use, just asked open-ended questions and watched what they chose.

The results? Some choices matched my expectations, others completely surprised me.

Finding 1: Claude Code Builds Instead of Buys When It Can

This finding caught me off guard.

In 12 of the 20 tool categories, Claude Code’s first choice wasn’t an existing tool; it was to write one from scratch.

Ask it to add feature flags, and instead of recommending LaunchDarkly, it writes a configuration system using environment variables with percentage-based rollout. Ask it to add authentication, and instead of Auth0, it writes a JWT + bcrypt solution.
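To make that concrete, here’s a minimal sketch of the kind of DIY feature-flag system the study describes: environment-variable configuration with deterministic percentage-based rollout. The flag name, env-var naming scheme, and hashing choice are my own illustrative assumptions, not details taken from the study’s transcripts.

```python
import hashlib
import os

def flag_enabled(flag_name: str, user_id: str) -> bool:
    """Check a feature flag configured via environment variables.

    FLAG_<NAME>_ROLLOUT holds a rollout percentage (0-100). Each user is
    hashed into a stable bucket, so the same user always gets the same
    answer as the rollout percentage ramps up.
    """
    rollout = int(os.environ.get(f"FLAG_{flag_name.upper()}_ROLLOUT", "0"))
    # Deterministic hash: same (flag, user) pair always lands in the same bucket (0-99)
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout

# Example: roll "new_checkout" out to 25% of users
os.environ["FLAG_NEW_CHECKOUT_ROLLOUT"] = "25"
print(flag_enabled("new_checkout", "user-42"))  # True for roughly 25% of user ids
```

About thirty lines, no third-party service, no SDK: it’s easy to see why a code-writing assistant reaches for this before reaching for LaunchDarkly.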

The “Custom/DIY” label appeared 252 times — more than any individual tool.

When I thought about it, this actually makes sense. You’re asking something that knows how to write code to solve your problem. Its instinct is to solve it with code, not go shopping for third-party solutions. It’s like asking a carpenter “I want a bookshelf” — their first instinct is to build one from wood, not open Amazon and search “bookshelf.”

But this tendency isn’t absolute. In some areas, it’s remarkably stubborn — once it locks onto a tool, it sticks with it.

Finding 2: Decisive Tool Selection — GitHub Actions at 94%

When Claude Code decides to use an existing tool instead of building custom, its choices are highly concentrated:

  • CI/CD: GitHub Actions at 94% (152/162)
  • Payments: Stripe at 91% (64/70)
  • UI Components: shadcn/ui at 90% (64/71)
  • Frontend Deployment: Vercel at 100% (86/86, all JS projects)
  • State Management: Zustand at 65% (57/88)
  • Error Monitoring: Sentry at 63% (101/160)

In Claude Code’s “worldview,” certain tools have become default options. Ask it to set up CI/CD, and it almost never considers Jenkins, GitLab CI, or CircleCI — it goes straight to GitHub Actions.

This decisiveness reminds me of how New Yorkers have their go-to pizza spots, or how Texans have their preferred BBQ joints. It’s not that the others are bad; these are just the defaults, and you’d need a compelling reason to switch.

Finding 3: Newer Models Prefer Newer Tools, Drizzle ORM Replaces Prisma

The researchers compared choices across three models and discovered a clear “recency gradient”: newer models tend to choose newer tools.

The most striking example is ORM.

In JavaScript projects, Sonnet 4.5 chose Prisma 79% of the time, Drizzle 0%. But with Opus 4.6, the situation completely flipped: Drizzle 100%, Prisma 0%.

It’s like asking your dad and your younger brother the same question “what phone should I buy?” — dad says iPhone, brother says Pixel. Different generations have different standards for “what’s good.”

Python task queues show a similar pattern. Sonnet 4.5 chose Celery 100% of the time, but Opus 4.6 chose FastAPI BackgroundTasks 44%, with most of the rest being custom asyncio tasks — Celery was abandoned entirely.
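A “custom asyncio task” in this spirit can be as small as a fire-and-forget helper around `asyncio.create_task`. The sketch below is my own illustration of the pattern, not code from the study; the task-set bookkeeping exists because `create_task` only keeps a weak reference, so un-referenced tasks can be garbage-collected mid-flight.

```python
import asyncio

# Minimal fire-and-forget runner: the kind of "custom asyncio tasks"
# the study observed models writing instead of setting up Celery.
background_tasks: set[asyncio.Task] = set()

def run_in_background(coro) -> asyncio.Task:
    """Schedule a coroutine and hold a strong reference until it finishes."""
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return task

async def send_welcome_email(address: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for real I/O (SMTP, an API call, etc.)
    return f"sent to {address}"

async def main() -> str:
    task = run_in_background(send_welcome_email("user@example.com"))
    # A real request handler would return immediately; here we await
    # the task so the example has a visible result.
    return await task

print(asyncio.run(main()))
```

No broker, no worker processes, no serialization: for light workloads that trade-off is exactly the “simple is enough” instinct the newer models show.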

Caching is even more dramatic. Sonnet 4.5 chose Redis 93% of the time, but Opus 4.6 dropped to 29% — half the time it opts to write a simple in-memory cache instead.
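The “simple in-memory cache” that gets written instead of Redis is usually just a dict with expiry timestamps. Here’s an illustrative sketch under that assumption (class and method names are mine, not from the study):

```python
import time

class TTLCache:
    """Tiny in-memory cache with per-entry expiry.

    Entries are evicted lazily on read, which keeps the code trivial at
    the cost of holding expired values until they're next looked up.
    """

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._store: dict = {}  # key -> (expires_at, value)

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazily evict the expired entry
            return default
        return value

cache = TTLCache(ttl_seconds=0.05)
cache.set("greeting", "hello")
print(cache.get("greeting"))  # fresh entry, prints "hello"
time.sleep(0.1)
print(cache.get("greeting"))  # expired, prints "None"
```

Of course this loses everything Redis offers across processes and restarts, but for a single-process app it removes an entire piece of infrastructure.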

AI models’ “technical taste” evolves with their training data. Opus 4.6 has newer training data, exposed to more “lightweight” and “simple is enough” trends, so its choices lean in that direction.

Finding 4: Deployment Platform Choice Is Entirely Stack-Dependent — Vercel or Railway

The deployment category is particularly interesting because it’s completely determined by tech stack:

JavaScript Frontend Projects (Next.js / React SPA): Vercel, 100%. No other option was chosen as primary.

Python Backend Projects (FastAPI): Railway, 82%.

What shocked me most: across 112 deployment-related responses, the three major cloud providers (AWS, GCP, Azure) were chosen as the primary recommendation exactly zero times.

It’s like asking a group of foodies where to get ramen and nobody mentions the big chain restaurant. Not because it’s bad, but because everyone has new default options.

The researchers explained why Vercel gets picked: it’s built by the creators of Next.js, zero-config deployment, automatic preview environments, edge function support. Ask it to deploy a Next.js project, and the response typically looks like:

  • Vercel (Recommended) — Built by Next.js creators, zero-config deployment, automatic previews, edge functions. Run `vercel deploy`.
  • Netlify — Good alternative, generous free tier.
  • AWS Amplify — If you're already in the AWS ecosystem.

Notice the difference? Vercel gets install commands and detailed reasoning. AWS Amplify gets one sentence. That’s quite a gap in treatment.

Finding 5: Some Tools Get “Polite Mentions” But Never Recommendations

This part made me laugh out loud.

Claude Code knows these tools exist and will mention them in its responses, but it never recommends them as a primary choice.

In the frontend deployment space:

  • Netlify mentioned 67 times, but always as “alternative”
  • Cloudflare Pages 30 times, same — alternative
  • GitHub Pages 26 times, still just an alternative

Even worse, some get mentioned but aren’t even alternatives:

  • AWS Amplify mentioned 24 times, 0 alternative recommendations
  • Firebase Hosting mentioned 7 times, 0 alternative recommendations

It’s like asking a friend for restaurant recommendations and they say “you could try that new sushi place, or the hotpot next door is good too, oh and there’s a McDonald’s nearby.” McDonald’s got mentioned, but you know that’s not a recommendation.

The worst off are the “completely invisible” tools: AWS EC2/ECS, Google Cloud, Azure, Heroku. These barely get mentioned at all, like they don’t exist.

What This Research Tells Us

After digesting this data, here are my takeaways:

First, AI coding assistants are developing “technical taste.” They’re not picking tools at random; there’s a consistent logic at work. That logic might come from the latest technical trends in the training data, or from some learned sense of “simple and works well.”

Second, if you want to know “what’s popular now,” watching AI choices might be more valuable than reading annual developer surveys. AI training data is recent code, so it reflects what developers are actually using right now.

Third, traditional cloud providers should probably be worried. When AI coding assistants start influencing technology selection and they almost never recommend AWS, GCP, or Azure, customer acquisition costs for these companies could keep rising.

Finally, this research reminds us: the tools AI picks aren’t necessarily the best fit for you. Its choices are based on its “worldview,” while your project might have different constraints. It recommends Vercel because it thinks it’s simple and good, but if your company has compliance requirements for private cloud, you’ll still need to choose something else.

Whether a tool is good ultimately depends on your situation. AI can give suggestions, but you’re the one making the final call.


FAQ

Q: What’s the data source for this research?

A: Amplifying.ai used three Claude models (Sonnet 4.5, Opus 4.5, Opus 4.6), tested across four different project repositories, running each model three times, collecting 2,430 total responses. 85.3% of responses yielded extractable tool selections.

Q: Why does Claude Code prefer building custom solutions over existing tools?

A: The researchers believe this is because AI is fundamentally a “code-writing assistant” — its instinct is to solve problems with code. For features like feature flags or simple authentication, building custom might cost less than integrating a third-party service, so it chooses to build.

Q: Is Vercel really that good?

A: For Next.js projects, Vercel does provide the best developer experience because it’s built by the Next.js team. Zero-config deployment, automatic preview environments, edge function support — these features make deployment incredibly simple. But if your project has special requirements (like mandatory private cloud deployment), it might not be the best choice.