The Rise of Tools

Aug. 11, 2025

This is a quick post to note some of my thoughts about the evolution of generative AI. With the release of GPT-5 last week, some of the online conversation reflects disappointment in the progress of the foundational model itself (we have been waiting for GPT-5 since last Fall, right?). This has led many to observe that we are going to see a shift from improvements to the foundational models themselves to the growth and impact of tools.

Anthropic only published the MCP spec last Fall. Now, nine months later, I’m really seeing examples everywhere, with measurable benefits for important use cases. While I’ve seen LibreChat occasionally hiccup when invoking tools, in general AI models are getting smarter at knowing when to invoke them. (It’s easy to build tools for any purpose you can imagine; for example, I built a simple MCP search tool to use with LM Studio, adding search to last week’s other OpenAI release, gpt-oss.)
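
To give a flavor of what such a tool can look like, here is a minimal sketch of an MCP server exposing a single search tool, using the FastMCP helper from the official MCP Python SDK. This is not the exact tool I built; the search backend below is just a placeholder, so swap in whatever search API you prefer.

```python
# Minimal MCP server exposing a single "web_search" tool, built with the
# official MCP Python SDK's FastMCP helper. The search backend is a stub;
# replace it with a real search API of your choice.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("search")


@mcp.tool()
def web_search(query: str, max_results: int = 5) -> list[dict]:
    """Return a list of {title, url, snippet} results for a query."""
    # Placeholder results -- wire this up to an actual search API.
    return [
        {
            "title": f"Placeholder result {i} for: {query}",
            "url": "https://example.com",
            "snippet": "Replace this stub with real search output.",
        }
        for i in range(1, max_results + 1)
    ]


if __name__ == "__main__":
    # MCP clients such as LM Studio can launch this server over stdio.
    mcp.run(transport="stdio")
```

Point an MCP-capable client at a script like this and the model decides on its own when a query warrants calling the search tool.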

The most compelling recent example I’ve seen illustrating the power of tools is from Jo Van Eyck. I will share a pairing of two of his videos that, in combination, really illustrate how tools can augment the base AI agent. In the first video, Jo demos refactoring, using recipes from Martin Fowler’s classic, Refactoring. With a timer running, Jo first codes the refactoring himself, before starting from scratch a second time, relying completely on Claude Code. It’s well worth watching, and I won’t spoil it except to note that in this first video, the speed gain from using AI is not a major improvement for him.

When I tuned back in this weekend, I caught up with his latest video: a second one in which he demos what happens when you give Claude Code some purpose-designed MCP tools to use. The results are impressive. Jo connected Serena, a slick tool that enables semantic code analysis across a codebase, along with another tool that acts as a dedicated refactoring engine for large .NET codebases. The token savings versus using Claude Code alone were substantial, not to mention the overall improvements in speed and effectiveness.

Of course, the rise of these kinds of tools that end users can give to AI agents or foundation models parallels the work inside AI companies on the tooling and techniques that scaffold and augment the LLMs (the next-token prediction engines) so they function more usefully.

More on the architecture of modern foundational models in a future post.
