top | item 25319296

(no title)

identity0 | 5 years ago

I wish more languages gave us a "|>" operator. Too many languages settle with dot notation, which confuses encapsulation/method calling with syntactical convenience.

discuss

order

civilized|5 years ago

This, along with inferior metaprogramming affordances, is the biggest reason pandas will never be as productive an analyst tool as R's dplyr. In R you can pipe anything into anything. In pandas you're stuck with the methods pandas gives you.

This is also why pandas API is so bloated. A lot of pandas special-case functionality, like dropping duplicate rows, can be replicated in R by chaining together more flexible and orthogonal primitives.

kaitai|5 years ago

As someone who is now bouncing back & forth between Python and R on a weekly basis, I've been surprised (after making fun of R sometimes) how much I miss the piping when I leave R for Python. Pandas seems so inflexible by comparison, so nitpicky for little gain. I've been surprised again and again how much dplyr supports near-effortless fluency and productivity.

Never really thought I'd be writing that in a public forum.

smabie|5 years ago

I think the obvious limitations of Python is a big reason, but probably not the main reason why Pandas isn't orthogonal. The reason why Pandas is such an ungodly mess is because it must be, in order to be even halfway efficient. When you do try to compose things, or even have the audacity to use a python lambda or an if statement or whatever, you suddenly suffer a 100x slowdown in performance.

Julia doesn't have these problems, and I've found it so much nicer to use for data analysis. You can even call Python libs, if you really have to.