hodapp | 17 days ago

The creator of Pandas even bashes it: https://wesmckinney.com/blog/apache-arrow-pandas-internals/

imtringued|17 days ago

He missed talking about the poor extensibility of pandas. It's missing some pretty obvious primitives to implement your own operators without whipping out slow for loops and appending to lists manually.

fud101|17 days ago

Have these 'improvements' been backported to pandas now? I would expect it to close the gap over time.

benrutter|17 days ago

Yes (mostly) is the answer. You can use Arrow as a backend, and I think with v3 (recently released) it's the default, at least for string columns when PyArrow is installed.
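For anyone who wants to try the Arrow backend, here is a minimal sketch. It assumes pandas 2.0 or later with pyarrow installed; the helper name `to_arrow_backed` is just something made up for illustration, not a pandas function.

```python
import pandas as pd


def to_arrow_backed(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df whose columns use pyarrow-backed dtypes.

    Assumes pandas >= 2.0 and an installed pyarrow; the function name
    is hypothetical and exists only for this example.
    """
    # convert_dtypes picks the best nullable dtype per column; with
    # dtype_backend="pyarrow" those dtypes are pd.ArrowDtype instances.
    return df.convert_dtypes(dtype_backend="pyarrow")


if __name__ == "__main__":
    df = pd.DataFrame({"n": [1, 2, 3], "s": ["x", "y", "z"]})
    print(to_arrow_backed(df).dtypes)
```

The readers (`pd.read_csv`, `pd.read_parquet`) also accept `dtype_backend="pyarrow"`, which avoids the extra conversion step.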

The harder thing to overcome is that pandas has historically had a pretty "say yes to things" culture. That's probably a huge part of its success, but it means there are now about 5 ways to add a column to a dataframe.
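To make the "about 5 ways" point concrete, here is a rough sketch of five common spellings that all add the same column. The function name `add_column_five_ways` and the column names are made up for the example, and this is not meant as a complete list.

```python
import pandas as pd


def add_column_five_ways() -> list:
    """Build the same two-column frame five different ways (illustrative only)."""
    base = pd.DataFrame({"a": [1, 2, 3]})
    doubled = base["a"] * 2

    # 1. Item assignment: mutates the frame in place.
    df1 = base.copy()
    df1["b"] = doubled

    # 2. assign(): returns a new frame, leaves the original alone.
    df2 = base.assign(b=doubled)

    # 3. insert(): in place, at an explicit column position.
    df3 = base.copy()
    df3.insert(len(df3.columns), "b", doubled)

    # 4. .loc with a column label that does not exist yet.
    df4 = base.copy()
    df4.loc[:, "b"] = doubled

    # 5. concat along the column axis with a named Series.
    df5 = pd.concat([base, doubled.rename("b")], axis=1)

    return [df1, df2, df3, df4, df5]
```

All five frames end up with columns `a` and `b`; the differences are whether the original is mutated and where the new column lands.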

Adding support for Arrow is a really big achievement, but shrinking an oversized API is even more ambitious.