<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Data-Processing on Asadbek Kurbonov</title><link>https://advenn.github.io/blog/tags/data-processing/</link><description>Recent content in Data-Processing on Asadbek Kurbonov</description><generator>Hugo -- 0.148.0</generator><language>en-us</language><lastBuildDate>Mon, 24 Nov 2025 22:00:00 +0100</lastBuildDate><atom:link href="https://advenn.github.io/blog/tags/data-processing/index.xml" rel="self" type="application/rss+xml"/><item><title>When You Can't Find the Bug: Architecting Around Production Issues</title><link>https://advenn.github.io/blog/posts/go-python-architecture/</link><pubDate>Mon, 24 Nov 2025 22:00:00 +0100</pubDate><guid>https://advenn.github.io/blog/posts/go-python-architecture/</guid><description>&lt;p>&lt;em>This is Part 2 of a series. Read &lt;a href="../pandas-vs-polars-in-production/">Part 1: Pandas vs Polars in Production - Performance Comparison&lt;/a> for the background on the Polars migration.&lt;/em>&lt;/p>
&lt;hr>
&lt;p>After migrating from Pandas to Polars, CPU performance improved dramatically—but a memory problem persisted. Despite extensive debugging, I couldn&amp;rsquo;t identify the root cause. So I made a pragmatic decision: architect around it.&lt;/p>
&lt;p>This is the story of splitting a monolithic Python application into a Go orchestration service with Python workers, not because I fully understood the problem, but because I needed production to be stable.&lt;/p></description></item><item><title>Pandas vs Polars in Production: Performance Comparison</title><link>https://advenn.github.io/blog/posts/pandas-vs-polars-in-production/</link><pubDate>Sun, 23 Nov 2025 23:02:39 +0100</pubDate><guid>https://advenn.github.io/blog/posts/pandas-vs-polars-in-production/</guid><description>&lt;p>When performance bottlenecks started affecting my production data pipeline, I decided to test whether Polars could deliver on its performance promises. This is what I learned from migrating a real production workload from Pandas to Polars.&lt;/p>
&lt;h2 id="the-workload">The Workload&lt;/h2>
&lt;p>The application was a data aggregation service running as a Kubernetes pod with the following constraints:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Resources&lt;/strong>: 2 CPUs, 3 GB RAM&lt;/li>
&lt;li>&lt;strong>Execution frequency&lt;/strong>: Every 2-2.5 minutes&lt;/li>
&lt;li>&lt;strong>Data volume&lt;/strong>: 5,000-7,000 rows × 100-150 columns per run&lt;/li>
&lt;li>&lt;strong>Operations&lt;/strong>: Multiple database calls, API requests, DataFrame merges, arithmetic operations (addition, multiplication), and group-by aggregations&lt;/li>
&lt;li>&lt;strong>Web server&lt;/strong>: FastAPI with Uvicorn handling production traffic&lt;/li>
&lt;/ul>
&lt;p>All operations were fully vectorized, with no row-by-row iteration. The pipeline combined data from various sources into a single DataFrame, transformed it, and output the results.&lt;/p></description></item></channel></rss>