Distributed Thoughts

Data Engineering

A collection of 2 posts
Data Engineering

From Pandas to Upstream Control: The Evolution PyData Needs Next

The Python data stack gave us Pandas, Dask, and Arrow. But we've created a new crisis: the ingest-it-all-first tax that's drowning us in noise.
08 Nov 2025 5 min read
kubeflow

From Kubeflow to Real-World ML: Why Data Locality Matters Just as Much as Compute

When my co-founders, Jeremy Lewi, Vishnu Kannan, and I started Kubeflow back in 2017, we were trying to solve what felt like the biggest problem in machine learning. Brilliant data scientists would craft elegant models on…
24 Jul 2025 3 min read
Distributed Thoughts © 2026