
Tools, not prompts: how I build AI agents that don't fall over

The reliability of an AI agent comes from tool design, not prompt cleverness. Here's the pattern I use with Claude to ship agents into production.

Most “AI agent” demos work beautifully on stage and collapse in production. The reason is almost always the same: the team poured their effort into one enormous prompt and hoped the model would behave. It won’t, at least not consistently and not at scale.

The fix isn’t a better prompt. It’s better tools.

What “tool-based” actually means

When I build an agent on Anthropic’s Claude, every capability is a small, typed tool with its own input schema: create_campaign, fetch_analytics, book_appointment. The model’s job shrinks to choosing which tool to call and with what arguments. Your code does the work, and your validation runs before any side effect happens.
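
To make that concrete, here is a minimal sketch of one typed tool and the dispatcher in front of it. It is an illustration under assumptions, not the code behind any real agent: the Tool class, the validate and dispatch helpers, the argument names customer_id and slot, and the shape of the model's call ({"name": ..., "arguments": {...}}) are all hypothetical. Only the tool name book_appointment comes from the text above.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Tool:
    name: str
    schema: dict[str, type]        # argument name -> required Python type
    handler: Callable[..., Any]    # your code; runs only after validation


def validate(tool: Tool, args: dict[str, Any]) -> None:
    """Reject unknown, missing, or wrongly typed arguments."""
    unknown = set(args) - set(tool.schema)
    missing = set(tool.schema) - set(args)
    if unknown or missing:
        raise ValueError(
            f"{tool.name}: unknown={sorted(unknown)} missing={sorted(missing)}"
        )
    for key, expected in tool.schema.items():
        if not isinstance(args[key], expected):
            raise TypeError(f"{tool.name}.{key}: expected {expected.__name__}")


def book_appointment(customer_id: str, slot: str) -> dict[str, str]:
    # Hypothetical handler; a real one would call your booking system.
    return {"status": "booked", "customer_id": customer_id, "slot": slot}


TOOLS = {
    "book_appointment": Tool(
        "book_appointment", {"customer_id": str, "slot": str}, book_appointment
    ),
}


def dispatch(call: dict[str, Any]) -> Any:
    """Run one model-proposed call of the form {"name": ..., "arguments": {...}}."""
    tool = TOOLS.get(call.get("name"))
    if tool is None:
        raise KeyError(f"unknown tool: {call.get('name')!r}")
    args = call.get("arguments", {})
    validate(tool, args)           # validation runs before the handler does
    return tool.handler(**args)
```

The point of the design is that a malformed call fails in validate with a specific error, which can be handed back to the model for a retry, rather than reaching the booking system at all.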

This sounds like a minor architectural choice. It’s the whole game.

Why it works

Three things fall out of the tool-based design for free:

MCP makes it scale

The Model Context Protocol (MCP) takes this one step further: it standardizes how tools are exposed to a model. Instead of bespoke glue for every integration, you expose your systems through one consistent interface. Adding the tenth tool is as clean as the first, and access stays auditable.
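
As a rough picture of what "one consistent interface" looks like, the sketch below handles the two MCP tool methods, tools/list and tools/call, as JSON-RPC 2.0 requests using only the standard library. It is a simplification and not the official SDK: it skips the initialization handshake, transports, and checking arguments against the input schema. The handle function, the AUDIT_LOG list, the campaign_id argument, and the placeholder data returned by fetch_analytics are assumptions made for illustration.

```python
import json
from typing import Any, Callable


def fetch_analytics(campaign_id: str) -> dict[str, Any]:
    # Placeholder data; a real handler would query your analytics store.
    return {"campaign_id": campaign_id, "clicks": 0}


# Hypothetical tool table: name -> (description, JSON Schema for inputs, handler).
TOOLS: dict[str, tuple[str, dict[str, Any], Callable[..., Any]]] = {
    "fetch_analytics": (
        "Return metrics for one campaign.",
        {
            "type": "object",
            "properties": {"campaign_id": {"type": "string"}},
            "required": ["campaign_id"],
        },
        fetch_analytics,
    ),
}

AUDIT_LOG: list[dict[str, Any]] = []  # every tools/call request is recorded here


def _error(request_id: Any, code: int, message: str) -> dict[str, Any]:
    return {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}


def handle(request: dict[str, Any]) -> dict[str, Any]:
    """Answer a single JSON-RPC request for tools/list or tools/call."""
    request_id = request.get("id")
    method = request.get("method")
    params = request.get("params", {})

    if method == "tools/list":
        tools = [
            {"name": name, "description": desc, "inputSchema": schema}
            for name, (desc, schema, _) in TOOLS.items()
        ]
        return {"jsonrpc": "2.0", "id": request_id, "result": {"tools": tools}}

    if method == "tools/call":
        name = params.get("name")
        args = params.get("arguments", {})
        AUDIT_LOG.append({"tool": name, "arguments": args})
        if name not in TOOLS:
            return _error(request_id, -32602, f"Unknown tool: {name}")
        handler = TOOLS[name][2]
        output = handler(**args)
        result = {"content": [{"type": "text", "text": json.dumps(output)}], "isError": False}
        return {"jsonrpc": "2.0", "id": request_id, "result": result}

    return _error(request_id, -32601, "Method not found")
```

Adding another tool here means adding one entry to the table; the listing, the call path, and the audit record stay the same.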

This is how Scalify’s in-app agent serves 23,000+ users: not a clever prompt, but a set of well-bounded tools the model can compose.

The takeaway

If you’re building an AI feature and you find yourself editing a 600-line prompt to fix behavior, stop. The behavior you want is probably a tool you haven’t defined yet. Design the tools, validate their inputs, gate the dangerous ones, and let the model do the one thing it’s genuinely good at: choosing.
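
One way to read "gate the dangerous ones" is an approval step between the model's choice and your handler. The sketch below assumes a hypothetical DANGEROUS set, an ApprovalRequired exception, and an approve callback supplied by the caller (a human confirmation prompt, a policy check, or similar); none of these names come from a real library, and which tools count as dangerous is your own call.

```python
from typing import Any, Callable

# Hypothetical policy: tools with side effects that need a sign-off first.
DANGEROUS = {"create_campaign", "book_appointment"}


class ApprovalRequired(Exception):
    """Raised when a gated tool call was not approved."""


def run_gated(
    name: str,
    args: dict[str, Any],
    handler: Callable[..., Any],
    approve: Callable[[str, dict[str, Any]], bool],
) -> Any:
    """Run handler(**args), asking approve() first if the tool is gated."""
    if name in DANGEROUS and not approve(name, args):
        raise ApprovalRequired(f"{name} was not approved")
    return handler(**args)
```

Read-only tools such as fetch_analytics pass straight through, so the gate adds friction only where a mistake would be costly.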