IKantRead | 2 years ago
I would even add: especially in Python. The main issue I have found is that pandas-heavy code is just not as easy to integrate with other Python tools/features/abstractions as code that uses mostly numpy, dictionaries, and various comprehensions to do the vast majority of the work.
As a heavy pandas user for several years, I decided about a year ago to not import pandas by default and instead treat most data problems like regular Python problems. I've been genuinely surprised at how much easier it is to create useful abstractions with the code I've been writing, and also how much easier it's been to onboard non-DS devs into the codebase.
There are a few obvious cases where pandas is very helpful, and I'll pull it out in those places, but I've been able to do a tremendous amount of data work in the last year while using very little pandas. The result is that I have an actual codebase to work with now rather than a billion broken notebooks.
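A minimal sketch of the "regular Python" style described above, for a simple group-and-average task that might otherwise be a pandas groupby. The `Sale` record, its fields, and `mean_amount_by_region` are hypothetical names made up for illustration, not anything from the comment's actual codebase.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Sale:
    """Hypothetical record type standing in for one row of data."""
    region: str
    amount: float


def mean_amount_by_region(sales):
    """Average the amounts per region using a dict and a comprehension."""
    grouped = defaultdict(list)
    for sale in sales:
        grouped[sale.region].append(sale.amount)
    return {region: mean(amounts) for region, amounts in grouped.items()}


sales = [Sale("north", 10.0), Sale("south", 4.0), Sale("north", 20.0)]
averages = mean_amount_by_region(sales)
```

Because the input is a plain list of small objects and the output is a plain dict, the function can be passed around, typed, and reused like any other Python code.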
kristjansson|2 years ago
This is the biggest part. Giving yourself permission to make real abstractions, rather than forcing yourself to go directly from data-on-disk to pandas (or whatever), makes it that much easier to test, repeat, modify, and extend whatever analysis you're working on.
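One way this could look, as a rough sketch: keep parsing and analysis as separate small functions, so the analysis can be tested on in-memory data without touching disk. The column names `name` and `score` and the helpers `parse_rows` and `top_scorer` are assumptions invented for this example.

```python
import csv
import io


def parse_rows(lines):
    """Turn CSV text (any iterable of lines) into a list of plain dicts."""
    reader = csv.DictReader(lines)
    return [{"name": row["name"], "score": float(row["score"])} for row in reader]


def top_scorer(rows):
    """Analysis step: works on plain dicts and knows nothing about files."""
    return max(rows, key=lambda row: row["score"])["name"]


# In a test, an in-memory buffer stands in for the file on disk.
fake_file = io.StringIO("name,score\nada,3.5\ngrace,4.0\n")
rows = parse_rows(fake_file)
best = top_scorer(rows)
```

Swapping the data source (a real file, an API response, a fixture) only touches the parsing step; the analysis function stays the same.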
franklin_p_dyer|2 years ago
isoprophlex|2 years ago
(Note that in general, I'm the biggest pandas hater I know)
canjobear|2 years ago