Intelligence is becoming a tiered utility
The era of the single, massive model is giving way to something more structured.
We are seeing a shift from chasing raw scale to perfecting specialized reasoning. The question is no longer just how much a model knows, but how well it can act: specifically, as an agent that can plan, iterate, and coordinate across long-horizon tasks.
OpenAI is testing this through its new GPT-5.6 series. Instead of one flagship, they have introduced three tiers: Sol for deep reasoning, Terra for balanced work, and Luna for speed and low cost. Terra, for instance, aims to match GPT-5.5 performance while costing half as much.
The real push is in agentic capability. With new 'max' reasoning modes and 'ultra' modes that use subagents to accelerate work, these models are targeting high-stakes fields like biology and cybersecurity. We see this in their performance on benchmarks like Terminal-Bench 2.1 and GeneBench v1.
But higher capability brings higher risk. To keep these models from being used in real cyberattacks, OpenAI is layering on heavy safeguards, including real-time classifiers and 700,000 A100-equivalent GPU hours of automated red-teaming.
Because the stakes are so high, OpenAI is also coordinating the rollout with the U.S. government, starting with a limited preview for a small group of trusted partners.
Builders will have to move away from 'one size fits all' prompting and start picking specific model tiers based on the exact balance of reasoning, speed, and cost required for a task.
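To make that concrete, here is a minimal sketch of what tier selection could look like in code. Only the tier names (Sol, Terra, Luna) come from the description above. The `Task` fields, the 1-to-5 reasoning scale, and the thresholds are all hypothetical, chosen for illustration, and the returned strings are labels rather than real API model identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical 1-5 scale: 1 = simple lookup, 5 = multi-step planning.
    reasoning_depth: int
    # True when a user is waiting on the response.
    latency_sensitive: bool
    # True for high-volume calls where per-request cost dominates.
    cost_sensitive: bool


def pick_tier(task: Task) -> str:
    """Return a tier label for a task. Thresholds are illustrative assumptions."""
    # Deep reasoning wins over speed and cost concerns.
    if task.reasoning_depth >= 4:
        return "Sol"
    # Shallow work that must be fast or cheap goes to the lightest tier.
    if task.reasoning_depth <= 2 and (task.latency_sensitive or task.cost_sensitive):
        return "Luna"
    # Everything else lands on the balanced tier.
    return "Terra"


if __name__ == "__main__":
    print(pick_tier(Task(reasoning_depth=5, latency_sensitive=False, cost_sensitive=False)))
    print(pick_tier(Task(reasoning_depth=1, latency_sensitive=True, cost_sensitive=False)))
    print(pick_tier(Task(reasoning_depth=3, latency_sensitive=False, cost_sensitive=True)))
```

In practice the thresholds would come from measuring quality, latency, and spend on your own workload rather than from fixed rules like these.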