Apple Just Subcontracted the Voice

At WWDC last Monday, Apple shipped the Siri it had promised for two years, but instead of running on Apple's own model, it thinks with Google's Gemini. Does that mean the decade-long case for owning the whole stack has quietly ended?

On June 8, the keynote that opened WWDC unveiled "Siri AI", the rebuilt assistant Apple has been promising and delaying since 2024. The demo was really good! And it did all the things that we would EXPECT an AI to do in 2026. I thought it was particularly interesting that the new Siri runs on Google's Gemini. Apple licensed a custom Gemini build of around 1.2 trillion parameters, is reportedly paying something close to a billion dollars a year for it, and quietly retired the ChatGPT hand-off that was the showpiece of the 2024 launch. The most tightly controlled hardware in consumer technology now does its hardest thinking on a competitor's model.

I want to be fair to the engineering, because it IS very good, and it is not the cartoon version where Apple ships your diary to Mountain View. Apple built a three-tier stack: simple requests stay on the device, moderately hard ones go to Apple's Private Cloud Compute, and only the heaviest reasoning routes out to Google Cloud, where the custom Gemini runs on undisclosed hardware (probably some combination of TPUs and Nvidia GPUs). Queries that leave the phone are anonymized and tokenized so that, by Apple's account, neither Apple nor Google can tie a request back to a person. If you are going to rent a brain, this is close to the most careful way to wire it, and the byte-level privacy story mostly survives the announcement. But that is not the interesting part of the announcement.
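
For readers who think in code, the three-tier routing described above can be sketched roughly like this. It is a minimal sketch under my own assumptions, not Apple's implementation: the complexity score, the two thresholds, and every name here (Tier, Request, choose_tier, anonymize, route) are hypothetical and exist only to illustrate the shape of the idea.

```python
import secrets
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    ON_DEVICE = "on_device"
    PRIVATE_CLOUD = "private_cloud_compute"
    EXTERNAL_MODEL = "external_model"


@dataclass(frozen=True)
class Request:
    user_id: str
    text: str
    complexity: float  # hypothetical score in [0, 1]; how it would be computed is an assumption


# Hypothetical cut-offs, chosen only for illustration.
ON_DEVICE_MAX = 0.3
PRIVATE_CLOUD_MAX = 0.7


def choose_tier(req: Request) -> Tier:
    """Pick the cheapest tier assumed able to handle the request."""
    if req.complexity <= ON_DEVICE_MAX:
        return Tier.ON_DEVICE
    if req.complexity <= PRIVATE_CLOUD_MAX:
        return Tier.PRIVATE_CLOUD
    return Tier.EXTERNAL_MODEL


def anonymize(req: Request) -> dict:
    """Drop the user identity and attach a one-time random token instead."""
    return {"token": secrets.token_hex(16), "text": req.text}


def route(req: Request) -> tuple[Tier, dict]:
    """Return the chosen tier and the payload that would be sent to it."""
    tier = choose_tier(req)
    if tier is Tier.ON_DEVICE:
        # Nothing leaves the phone, so the identity can stay attached.
        payload = {"user_id": req.user_id, "text": req.text}
    else:
        # Anything that leaves the device is stripped of identity first.
        payload = anonymize(req)
    return tier, payload
```

Calling route(Request("alice", "set a timer", 0.1)) stays on the device with the identity attached, while a request scored at 0.9 goes to the external tier carrying only a random token and the text. Notice that the privacy logic and the routing logic are separable, and only the last branch depends on somebody else's model.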

What the architecture used to say

In June 2024, Apple staked Apple Intelligence on a specific architectural claim. The premium property of an Apple model was that it ran on the device, your data never left, and the rare query that exceeded on-device capacity went to Private Cloud Compute, Apple's own hardware in Apple-controlled enclaves with cryptographic attestation. Third-party models were a fallback, available when you explicitly chose them. ChatGPT was the named partner; Gemini was discussed but not shipped. The hierarchy was on-device first, Apple's cloud second, somebody else's model last and only by choice.

The bet Apple was making was that its silicon team and its model team, running the same roadmap, would close the gap to frontier capability inside two years. The need for an outside frontier model was supposed to be temporary.

And in many ways Apple did make real progress! But the outside world kept moving even faster, and the gap widened.

Why it widened

The on-device model Apple shipped in late 2024 was not the one the original pitch implied. Its capable cousin, the internal frontier model, slipped twice and landed in restructured form after the WWDC 2025 reorganization. Apple's foundation-model group lost senior people to Meta's superintelligence group and to Anthropic over the same stretch. Google, meanwhile, shipped Gemini 2.5, then 3.0, then 3.1 Pro on roughly a six-month clock, each one clearing a bar the last one missed. By early 2026 Apple's choices on the assistant had narrowed to two: ship a Siri that worked, or ship a Siri whose architecture matched the 2024 marketing. Monday told you which one Apple picked.

What actually changed

The thing that changed on Monday is not where your bytes go, because Apple engineered that fairly well. The thing that changed is who supplies the intelligence. For a decade Apple's entire argument, the one that justified designing its own chips and writing its own frameworks and refusing the easy integration, was that owning every layer of the stack was the only way to keep the promises it made about the device. On Monday Apple kept the assistant promise by renting the most important layer from the one company it competes with most directly across phones, ads, browsers, and now models. Apple's "we own the whole stack" became "we own the stack except the part that does the thinking," and you cannot attest your way out of that sentence.

Lots of folks are calling this a blow to "sovereign AI" and, in the small and specific sense that matters to anyone who builds systems, it kind of is. Apple's most strategic consumer feature now carries a hard dependency on a competitor's model, a competitor's pricing, and a competitor's release schedule, and for the heaviest queries it runs inside a jurisdiction Apple does not control. Most users will never notice and most queries will never matter.

The biggest change here is in Apple's strategy, not in how any individual query is handled. In effect, Apple admitted that the industry (and customer expectations) are moving too fast for it to keep up on its own.

Right in physics, wrong in calendar

The on-device thesis was the architecturally correct answer to the question Apple was asking, where privacy by construction beats privacy by contract, and on-device latency beats a data-center round trip. Apple's silicon division spent ten years building the substrate that should have made on-device frontier intelligence a category.

However, the calendar call missed, because the rest of the world did not wait. Apple bet its model team could reach the frontier as fast as its silicon team and product team could ship, and the frontier moved faster than any single company's roadmap. By the time an on-device path would have reached parity, Google had three more model generations out, OpenAI had four, and Anthropic had the tier jump that produced Mythos. Right on the physics, wrong on the calendar, and in product the calendar wins every time.

There is a pattern here that is going to define the next couple of years. The vertically integrated "own every layer" architecture is the correct answer to the long-horizon question about control. However, for a while anyway, it will lose to the federated "compose across whoever is best this quarter" architecture on the short-horizon question of what ships now.

The part to watch starts about eighteen months out. It might show up as Google's Gemini roadmap shipping on a clock that is inconvenient for Apple's launch calendar, or as the billion-a-year tenancy being renegotiated in a direction that pinches the Services margin Apple's leadership has spent years defending, or as a Google policy change that moves what Siri will and will not say, on a timeline that is not Apple's. None of that has happened yet, but it could, and it would open a real rift between the two companies. These are certainly uncharted waters (or at least waters uncharted for many years) for a company that previously prided itself on owning everything down to the silicon, and that now faces potentially huge decisions on a schedule it does not fully set.

Apple spent a decade telling you that owning the whole stack was the only way to keep a promise. On Monday it kept the promise by leasing the part that thinks.

