/ Insights/ Commentary/ SME/ Strategy

A model they decided not to ship.

Martin Lulham
The word MYTHOS set in heavy outlined capitals, with a thick red horizontal line running across the frame behind the letters.

The story most businesses missed this month went roughly like this.

In late March, a misconfigured Anthropic CMS quietly exposed a draft blog post, and inside that draft was a model nobody outside Anthropic had heard of: Claude Mythos. A step-change in capability, according to the draft, and strikingly capable at cybersecurity — generating working exploits for 181 Firefox vulnerabilities where Opus 4.6 had managed two. A couple of weeks later, on the 8th of April, Anthropic previewed Mythos officially.

And then they explicitly didn't release it.

There's a version of this story that's just about cybersecurity, and a version of it that's about Anthropic's safety policy, and a version of it that's about the US government reportedly negotiating access despite the default blacklist. Each of those is worth reading. The version that matters for anyone building a business around these vendors is different, and quieter, and nobody's writing about it.

It's this: a frontier lab has, for the first time visibly, decided a model was too dangerous for general release. Which means the capability ladder isn't just going up any more. It's being governed.

For most SMEs that's mostly good news. We do not, at M-Tech, want our clients' casework automatons shipping with the ability to find novel software exploits on a side-channel prompt. The idea that Anthropic is willing to hold capability back is genuinely reassuring, and we'd rather have it than the alternative.

But it introduces a shape of vendor-risk that didn't exist eighteen months ago, and it's worth naming.

Until now, the implicit assumption underneath almost every "let's build something with AI" conversation has been that the model sitting behind the API next year will be better than the one today, and the year after that better still. That assumption held for every release up to the middle of 2025. It was the basis of the shackles-are-off argument — the things that are too small or too fiddly to commission today will be table-stakes next year, because the cost curve and the capability curve both keep moving. And they have.

What Mythos quietly announces is that the capability curve now has a ceiling you don't control. A governed one, not a technical one. Capabilities will be released when the vendor decides they can be, or gated to specific programmes, or released to enterprise-tier customers only, or withheld from some geographies, or held back entirely. The next model — the one your roadmap is implicitly planning around — might arrive, or it might arrive stripped, or it might go to an allied government's workload and not to your law firm.

This is not a reason to panic. It is a reason to adjust.

The practical read is the same one we were already making on the zoo post — default to the middle tier of whichever family you're in, assume the middle tier will stay roughly the middle tier, and don't architect workflows that depend on a future capability you haven't yet seen. If a process only works because GPT-6 or Opus 5 or Gemini Ultra Next will be able to do something the current models can't, that process isn't a plan. It's a hope. Build the version of it that runs on what's shipped today. When the ceiling lifts, take the upgrade. When it doesn't, you still have a working thing.

And — because this is a new shape of risk — keep workloads portable. Not portable in the grand enterprise-architecture sense. Portable in the small, practical sense: the prompt isn't coupled to the vendor, the data path isn't coupled to one API, the switch cost between Anthropic and OpenAI on any given task is days rather than months. That way when a specific capability is gated, or a release slips, or a vendor pulls a model you'd come to rely on, it's a commercial conversation rather than a crisis.

We've known for a while that the frontier labs would eventually have to decide whether to release everything they build. We didn't know what that decision would look like in public. Now we do.

The curve is still going up. But for the first time it has a ceiling that isn't technical, and that's worth folding into the plan.

/ Start a conversation

Let's talk about what you're trying to build.

Book a discovery session and we'll walk through the workflow, the systems and the shape of the solution.