When the Incumbents Make Your Argument
Dell Technologies, the company that built the infrastructure for centralization, just made the case for distributed AI. When the incumbents start arguing your position, you are either vindicated or about to be absorbed. Maybe both.
If you've spent any time in the distributed systems community, you know the feeling. You make a technical argument for years, back it with data, point to the architectural inevitability of it, and get polite nods from people who then go buy more centralized infrastructure. Then one day a Fortune 50 company publishes your argument under their own letterhead, and suddenly it's a serious idea.
That's what happened earlier this week when Dell Technologies published "The Power of Small: Edge AI Predictions for 2026." The thesis, written by Dell Fellow Daniel Cummins, is that the future of AI is small, local, distributed, and specialized. Not centralized. Not massive. Not dependent on round-trips to hyperscaler data centers.
This is Dell. The company that has sold more server and storage hardware into centralized data centers than arguably any other company in history. When they start making the case against centralization, it's worth understanding what changed.
Five Predictions, One Pattern
Cummins lays out five predictions for 2026, and every single one points in the same direction:
Small language models will overtake LLMs in enterprise usage. Gartner predicts that by 2027, organizations will use task-specific SLMs three times more often than general-purpose LLMs. Cummins frames these as "Micro LLMs" that require less compute, less power, and live on devices.
Distributed data centers will replace monolithic ones. With 75% of enterprise-managed data now created outside traditional data centers, organizations are building smaller, specialized IT environments near where data is generated.
Computer vision at the edge will go from pilot to production. Lightweight CV models running on edge hardware, doing real-time inference without cloud connectivity.
Agentic AI will move from cloud to edge. This is the prediction that caught my eye. Cummins writes that we'll see "a shift from centralized, cloud-based systems dependent on massive data centers to edge-resident agents that will handle local decisions and closed-loop actions."
Physical AI will require edge deployment. Robots, autonomous systems, and industrial automation can't tolerate the latency of cloud inference for safety-critical decisions.
If you read these predictions without looking at the byline, you'd think they came from ... I don't know ... me? The fact that they came from a Dell Technologies Fellow tells you how far the Overton window has moved.
The Economics Got Too Loud to Ignore
Why is Dell saying this now? It's not because they suddenly discovered the beauty of edge computing. Dell has been selling edge hardware for years. What changed is the economics of AI inference became impossible to hand-wave away.
The numbers are striking. A mid-sized enterprise running 10,000 daily customer queries through GPT-5 APIs pays roughly $4.2 million per month. Deploy a self-hosted 7B-parameter SLM on an A10G GPU? Under $1,000 monthly. That's not a marginal improvement. That's a 99.98% cost reduction. Per-token economics tell the same story: frontier model APIs at $30 per million tokens versus self-hosted SLMs at $0.12 to $0.85. A 79x differential.
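The arithmetic above is easy to sanity-check. A minimal sketch, using the article's cited estimates (the dollar figures are the cited numbers, not measured prices; note the cited 79x falls inside the range the per-token figures imply):

```python
# Back-of-the-envelope check of the cost figures cited above.
# All dollar amounts are the article's cited estimates, not measured prices.

CLOUD_MONTHLY = 4_200_000      # GPT-5 API, ~10,000 daily queries (cited)
SELF_HOSTED_MONTHLY = 1_000    # self-hosted 7B SLM on an A10G (cited)

reduction = 1 - SELF_HOSTED_MONTHLY / CLOUD_MONTHLY
print(f"Monthly cost reduction: {reduction:.2%}")  # ~99.98%

API_PER_MTOK = 30.00            # frontier API, $ per million tokens (cited)
SLM_PER_MTOK = (0.12, 0.85)     # self-hosted range, $ per million tokens (cited)

low = API_PER_MTOK / SLM_PER_MTOK[1]   # cheapest API-to-SLM ratio
high = API_PER_MTOK / SLM_PER_MTOK[0]  # most expensive ratio
print(f"Per-token differential: {low:.0f}x to {high:.0f}x")
```

Depending on where a self-hosted deployment lands in that per-token range, the differential spans roughly 35x to 250x, which is why even a conservative estimate lands near two orders of magnitude.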
At those ratios, the conversation isn't about whether to move inference to the edge. It's about how fast you can get there.
And this is before you factor in the other costs that never appear in the API pricing: data transfer fees, latency penalties, compliance complexity, and the organizational cost of being dependent on a vendor whose pricing and model quality can change without notice. Menlo Ventures' enterprise data shows OpenAI's enterprise market share eroding from 50% to 34% to 27% over the last three years. Anthropic climbed from 12% to 40% in the same period. Google tripled from 7% to 21%. When the provider landscape is this volatile, betting your production workloads on a single centralized API is a strategic liability, not just a cost problem.
What Dell Gets Right
The most interesting part of Cummins' piece isn't any individual prediction. It's the underlying logic connecting them. He's describing a system where models get smaller, infrastructure gets distributed, agents become local, and intelligence moves to where the data already lives.
That's not five separate trends. That's one trend: the center of gravity for AI compute is shifting from centralized clouds to distributed edge environments. Smaller models make edge deployment feasible. Distributed infrastructure makes it manageable. Local agents make it useful. And the economics make it inevitable.
Dell's own CTO John Roese put it directly: "Running models locally, on premises or in controlled AI factories, will become the norm to provide a stable foundation and insulate organizations from external disruptions." That's a remarkable statement from the CTO of a company that sells cloud infrastructure alongside its edge products. He's acknowledging that for AI workloads, the pendulum is swinging back toward local.
The Gartner data backs this up across multiple dimensions. By 2028, 30% of generative AI workloads will shift to domain-specific SLMs running on-premises or on-device, up from less than 1% in 2024. 73% of organizations are actively moving AI inferencing to edge environments for energy efficiency. Domain-specific GenAI models will represent over 50% of enterprise deployments by 2027, up from 1% in 2023.
That's a 50x shift in four years. The word "seismic" gets overused, but this qualifies.
What Dell Gets Wrong
There's a tension in the Dell piece that Cummins doesn't resolve, and it's the same tension that shows up in every incumbent's edge AI strategy.
At the end of the post, Cummins pitches Dell NativeEdge as "a full-stack solution that securely centralizes the deployment, orchestration, and lifecycle management of diverse infrastructure and applications."
The truth is there's nothing directly wrong with this. Centralization has real benefits, a single throat to choke, so to speak. But it also runs headlong into the conductor metaphor I wrote about two weeks ago: the workloads are ants, but the management layer is still an orchestra conductor.
The problem is that centralized orchestration of distributed workloads reintroduces the very dependencies the distributed approach was supposed to eliminate. If your edge agents can't function when the management plane is down, you haven't really moved intelligence to the edge. You've moved the compute to the edge and kept the brains in the cloud. That's better than pure centralization, but it's not the architectural shift the economics are demanding.
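One way to see the distinction concretely: an agent that genuinely lives at the edge should keep making closed-loop decisions when the management plane disappears. A minimal sketch of that pattern, where every name (`fetch_policy`, `ControlPlaneError`, the `threshold` policy field) is illustrative rather than any real product's API:

```python
# Hypothetical sketch: an edge agent that keeps acting when the
# management plane is down, by falling back to last-known-good policy.

class ControlPlaneError(Exception):
    """Raised when the (assumed) management plane is unreachable."""

def fetch_policy():
    # Stand-in for a real control-plane call. Here it always fails,
    # simulating a management-plane outage.
    raise ControlPlaneError("management plane unreachable")

# Safe built-in default so the agent is never without a policy.
_last_known_policy = {"threshold": 0.8}

def current_policy():
    """Prefer fresh policy from the control plane; on outage, keep
    acting on the last-known-good copy cached locally."""
    global _last_known_policy
    try:
        _last_known_policy = fetch_policy()
    except ControlPlaneError:
        pass  # outage: decisions continue on the cached policy
    return _last_known_policy

def act(sensor_value, policy):
    # The closed-loop decision itself happens entirely on-device.
    return "shutdown" if sensor_value > policy["threshold"] else "continue"

print(act(0.9, current_policy()))  # still decides during an outage
```

If, by contrast, `act()` required a round-trip to the control plane, the compute would be at the edge but the brains would still be in the cloud, which is exactly the dependency the paragraph above describes.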
The more interesting approaches emerging right now treat orchestration itself as distributed: agents coordinate through shared context rather than centralized instruction, models update through federated learning rather than centralized push, and infrastructure self-organizes based on local conditions rather than waiting for commands from a control tower. We'll see how long that takes to get going, but I'm optimistic it's sooner rather than later.
Vindicated or Absorbed
The economics of AI inference are structurally incompatible with full centralization. Full stop. The models are getting small enough to run locally. The data already lives at the edge. The latency requirements for agents and physical AI demand local processing. The cost differential between cloud API calls and local inference is almost two orders of magnitude. And regulatory pressure around data sovereignty is only increasing.
None of this is new to anyone who's been paying attention to distributed systems. What's new is that Dell is saying it, with Gartner data to back it up and a Fortune 50 company's credibility behind it.
There's a particular experience that comes with watching incumbents adopt your argument. It's validating, obviously. But it also comes with a catch: incumbents adopt the framing while preserving the architecture. They'll tell you the future is distributed and then sell you a centralized control plane to manage it. The words change. The topology doesn't.
The real question isn't whether AI moves to the edge. Dell just told you it will. The question is what the orchestration layer looks like when it gets there. And on that question, the industry is still reaching for the same centralized patterns that created the problems in the first place.
Want to learn how intelligent data pipelines can reduce your AI costs? Check out Expanso. Or don't. Who am I to tell you what to do.
NOTE: I'm currently writing a book, based on what I've seen firsthand, about the real-world challenges of data preparation for machine learning, focusing on operations, compliance, and cost. I'd love to hear your thoughts!