Distributed Thoughts

Data Engineering

A collection of 2 posts
Data Engineering

From Pandas to Upstream Control: The Evolution PyData Needs Next

The Python data stack gave us Pandas, Dask, and Arrow. But we've created a new crisis: the ingest-it-all-first tax that's drowning us in noise.
08 Nov 2025 5 min read
kubeflow

From Kubeflow to Real-World ML: Why Data Locality Matters Just as Much as Compute

When my co-founders, Jeremy Lewi, Vishnu Kannan, and I started Kubeflow back in 2017, we were trying to solve what felt like the biggest problem in machine learning. Brilliant data scientists would craft elegant models on their laptops,
24 Jul 2025 3 min read
Distributed Thoughts © 2025