The History of Expanso (Part 4): The Mismatch
This post is a continuation of the series on the history of Expanso. Today, we're talking about one of the three unchangeable laws of data: its unrelenting growth. Read the whole series starting from Part 1.
There's a moment in every big engineering review where the polite fiction of the architecture diagram gives way to the messy reality of the world. I was in one of those meetings with the team behind a global consumer app, and they were wrestling with physics. Their goal: sub-fifty-millisecond latency for millions of users, from North America to APAC. The app itself was ready to go, but its architecture assumed the network would never be a problem. The user experience was sluggish, and their cloud bills were starting to look like a rounding error in the federal budget.
As we picked at the problem, I suggested, "All you have to do is accelerate the speed of light!" I was joking, but that was the core problem: the playbook they were using was designed for a different era.
A Rocket and a Staircase
We are no longer dealing with a manageable stream of information; we are dealing with a structural mismatch between two fundamental curves.
The first curve is the creation of data, which grows exponentially. It's a line that curves toward the vertical, driven by a relentless explosion of sensors, logs, and AI. The second curve is our ability to move that data. It's a line that slopes upward, but it remains stubbornly linear. We can light more fiber, but we cannot double the world's network capacity every two years.
One curve is a rocket; the other is a staircase. You cannot win that race.
The High Cost of a Broken Playbook
This reality has quietly transformed the "big data" problem into a "moving data" problem. The consequences are not theoretical; they are a cascade of practical, painful symptoms.
- Unsustainable Costs: Egress fees cease to be a line item and become a strategic liability. A single misconfigured job can blow through a month's budget.
- Stalled Developer Velocity: Teams wait hours, or even days, for massive datasets to copy before a single line of their own code can run, grinding momentum to a halt.
- System Fragility: A regional network hiccup can cause backpressure that stalls critical infrastructure half a world away, making the entire system brittle.
- Unmanageable Complexity: Managing consistency, enforcing schemas, and maintaining security across countless network hops introduces a thousand new ways for things to fail.
A New Playbook: Move Compute, Not Data
If moving the data is the core problem, then the only sustainable solution is to move less of it. The fix isn't a bigger pipe; it's a different architecture. The mental shift is from data movement to data locality.
This "compute-over-data" model is built on a few simple principles:
- Filter and Aggregate at the Source: Move only the high-signal, summarized results, not the raw data (see the sketch at the end of this section).
- Push Code to the Data: Instead of pulling petabytes back to a central cluster, push portions of your data pipelines out to where the data lives.
- Maintain Global Control: Use a global control plane for scheduling, policy, and audit, so that distributed execution doesn't descend into chaos.
You don't beat exponential growth; you change what you are forced to move.
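To make the first principle concrete, here's a minimal sketch in Python. The function name, record fields, and summary shape are illustrative assumptions, not an Expanso API: the point is simply that raw events are reduced to a compact summary in the region where they were produced, and only that summary ever crosses the wide-area network.

```python
# A minimal sketch of "filter and aggregate at the source", using
# hypothetical record fields and function names (not an Expanso API).
from collections import Counter
from dataclasses import dataclass


@dataclass
class RegionalSummary:
    """The compact, high-signal result that travels over the WAN."""
    total_events: int
    server_error_counts: Counter
    p99_latency_ms: float


def summarize_at_source(raw_events: list[dict]) -> RegionalSummary:
    """Runs next to the data; the raw events never leave the region."""
    latencies = sorted(e["latency_ms"] for e in raw_events)
    errors = Counter(e["status"] for e in raw_events if e["status"] >= 500)
    if not latencies:
        return RegionalSummary(0, errors, 0.0)
    # Nearest-rank p99: index into the sorted latencies.
    idx = min(len(latencies) - 1, int(0.99 * len(latencies)))
    return RegionalSummary(len(latencies), errors, latencies[idx])


if __name__ == "__main__":
    # Kilobytes of summary per region instead of terabytes of raw logs.
    summary = summarize_at_source([
        {"latency_ms": 42.0, "status": 200},
        {"latency_ms": 48.5, "status": 200},
        {"latency_ms": 510.0, "status": 503},
    ])
    print(summary)
```

The design choice is the one the list above describes: the expensive reduction happens where the data lives, and the network only carries the answer.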
From Theory to Practice
This is precisely what we did for that global consumer app. We re-architected their pipeline so that fast-twitch signals were processed next to the users for a near-instant response, while only slower, asynchronous updates trickled back to the central systems. The results were immediate: fewer long-haul transfers, fewer retries, and latencies that finally hit their targets.
This is the foundational principle behind Expanso. We built a platform to make this architecture a practical reality. Expanso gives you a single control plane to run your data pipelines securely where your data already lives, whether that's on the edge, on-premises, or across multiple clouds. You keep your environments and your security posture; we push the work to them and bring back only what you need. The goal is to eliminate the transfer tax and the latency penalty that come from fighting an unwinnable war against physics.
I'm interested in your perspective—at what point did data transfer costs or latency stop being a tactical issue and become a strategic problem for your organization?