pid-1 | 2 months ago
Everything it does can be done reasonably well with list comprehensions and objects that support type annotations and runtime type checking (if needed).
Pandas code is untestable, unreadable, hard to refactor and impossible to reuse.
Trillions of dollars are wasted every year by people having to rewrite pandas code.
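For readers wondering what the list-comprehension approach above might look like in practice, here is a minimal sketch using only the standard library. The `Order` class, its fields, and `total_by_customer` are hypothetical names made up for illustration, and the `__post_init__` check is just one simple way to get runtime type checking; this is not a claim about how the commenter actually writes their code.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Hypothetical record type standing in for one row of a table."""
    customer: str
    amount: float

    def __post_init__(self) -> None:
        # Minimal hand-rolled runtime type check (an assumption for this
        # sketch; a validation library could do the same job).
        if not isinstance(self.customer, str):
            raise TypeError("customer must be a str")
        if not isinstance(self.amount, (int, float)):
            raise TypeError("amount must be a number")


def total_by_customer(orders: list[Order], min_amount: float = 0.0) -> dict[str, float]:
    """Sum order amounts per customer, ignoring orders below min_amount."""
    totals: defaultdict[str, float] = defaultdict(float)
    # The list comprehension does the filtering; the loop does the grouping.
    for order in [o for o in orders if o.amount >= min_amount]:
        totals[order.customer] += order.amount
    return dict(totals)


orders = [Order("ann", 10.0), Order("bob", 5.0), Order("ann", 2.5)]
totals = total_by_customer(orders, min_amount=3.0)
```

Because `total_by_customer` is a plain function over plain objects, it can be unit tested without any DataFrame fixtures. Whether that scales to large or numeric-heavy data is a separate question.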
mttpgn|2 months ago
The thousand-plus data integrity tests I've written in pandas tell a different story...
mulmboy|2 months ago
I see this take somewhat often, and usually with a similar lack of nuance. How do you come to this? In other cases where I've seen this, it's from people who haven't worked in any context where performance or interoperability with the scientific computing ecosystem matters, which misses a massive part of the picture. I've struggled to get through to them before. Genuine question.
jononor|2 months ago
That said, the polars/narwhals-style API is better than the pandas API for sure. More readable and composable, simpler (no index) and a bit less weird overall.
jmpeax|2 months ago
isolatedsystem|2 months ago
Pandas insists you never use a for loop. So, I feel guilty if I ever need a throwaway variable on the way to creating a new column. Sometimes methods are attached to objects, other times they aren't. And if you need to use a function that isn't vectorised, you've got to do df.apply anyway. You have to remember to change the 'axis' too. Plotting is another thing that I can't get my head around. Am I supposed to use Pandas' helpers like df.plot() all the time? Or ditch it and use the low level matplotlib directly? What is idiomatic? I cannot find answers to much of it, even with ChatGPT. Worse, I can't seem to create a mental model of what Pandas expects me to do in a given situation.
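To make the vectorised-versus-`df.apply` distinction above concrete, here is a small sketch. The column names and the `label` function are hypothetical, chosen only for illustration; it shows one common pattern, not the only idiomatic one.

```python
import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})

# Vectorised: arithmetic on whole columns, no explicit loop, no axis argument.
df["total"] = df["price"] * df["qty"]


# Hypothetical function that is not vectorised: it handles one row at a time.
def label(row: pd.Series) -> str:
    return "big" if row["price"] * row["qty"] > 50 else "small"


# axis=1 passes each row to the function; the default axis=0 passes each column.
df["size"] = df.apply(label, axis=1)
```

Forgetting `axis=1` here would hand `label` whole columns instead of rows, which is the trap the comment describes.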
Pandas has disabused me of the notion that Python syntax is self-explanatory, executable pseudocode. I find it terrible to look at. Matlab was infinitely more enjoyable.
radus|2 months ago
Regarding your plotting question: use seaborn when you can, but you’ll still need to know matplotlib.
kelipso|2 months ago
I pretty much consider anyone who likes it to have Stockholm syndrome.
fifilura|2 months ago
A for loop is a lot about the "how", but apply, join, etc. are much closer to the "what".
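A rough sketch of the "how" versus "what" contrast, using a join as the example. The two tables and their column names are hypothetical, invented for illustration.

```python
import pandas as pd

orders = pd.DataFrame({"customer_id": [1, 2, 1], "amount": [10, 20, 30]})
customers = pd.DataFrame({"customer_id": [1, 2], "name": ["ann", "bob"]})

# "How": build a lookup table and walk the rows by hand.
name_by_id = dict(zip(customers["customer_id"], customers["name"]))
names_loop = []
for customer_id in orders["customer_id"]:
    names_loop.append(name_by_id[customer_id])

# "What": state that the two tables are joined on customer_id.
joined = orders.merge(customers, on="customer_id", how="left")
```

Both versions attach the same names to the orders; the merge just leaves the iteration strategy to the library.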
wesleywt|2 months ago
globular-toast|2 months ago
physicsguy|2 months ago