top | item 31190376

Explorer – A library to bring dataframes to Elixir

124 points | ahamez | 3 years ago | cigrainger.com

24 comments

[+] jasonpbecker|3 years ago|reply
Flow/Broadway + how nice Ecto is already had me thinking, "Elixir is super interesting for data engineering." Explorer and Livebook have me thinking, "Elixir has the best shot at unseating Python."
[+] 3jckd|3 years ago|reply
It will be interesting to see what, if anything, unseats Python, and when. And what the adoption would be like.

Julia was *designed* to unseat Python in the data space, but to no avail.

¯\_(ツ)_/¯

[+] rch|3 years ago|reply
Factor in Rust integration via Rustler and it starts to look like a 'when not if' dynamic.
[+] sergiomattei|3 years ago|reply
Doubt it. Elixir is a leaky abstraction over Erlang, so OTP concepts are a huge barrier to entry for non-software engineers.

E.g., to this day, starting a small library requires learning supervisors, GenServers, etc. Unless you’re doing a one-off .exs, you’re going to be really confused by what “mix new” does.

Saying this as a passionate Elixirist — there’s nothing wrong with the leaky abstraction part, it’s what makes it powerful — but dethroning Python isn’t happening anytime soon.

[+] tfp137|3 years ago|reply
I thought it would be Clojure, but that was 10 years ago and Python is still (inexplicably?) dominant.
[+] ahamez|3 years ago|reply
Added '(Elixir and Polars)' to the title as 'Introducing Explorer' alone is not clear enough in my opinion.
[+] rapnie|3 years ago|reply
> we’ve been using Elixir in production since 2018. We switched over to LiveView in early 2020 and haven’t looked back.

What does "switched to LiveView" mean? I thought LiveView was some collaborative, live documentation / interactive tutorial tool. Or does it provide a whole programming environment and/or development process?

[+] Existenceblinks|3 years ago|reply
I think you mean https://livebook.dev/, which is built on top of LiveView. LV (LiveView) can be used instead of a "Dead View", which is a non-diff-tracking view.
[+] pbowyer|3 years ago|reply
> Second, there are callbacks! The second argument in DataFrame.filter/2 takes a callback function against the dataframe. There’s some cool stuff we can do with that, and I think callbacks against the dataframe are a natural way to work.

I'm not enamoured with how DataFrame.filter/2 reads compared to the dplyr example. I've written very little R but read code by others and find dplyr excellent for that; and I use a JS 'port' of dplyr [1] for projects. I'm in the "hate pandas" camp and find it painful to both read and write - and these callback functions are coming too close to pandas for my liking. I hope an alternative can be found.

1. https://pbeshai.github.io/tidy/
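
To make the readability complaint concrete, here is a rough sketch of the two styles being compared, written in plain Python rather than Elixir or R. It is not Explorer's or dplyr's API: the toy `Frame` type and the `filter_with_callback` / `filter_rows` helpers are hypothetical names made up for illustration, and the "callback returns a boolean mask over the whole frame" behaviour is an assumption based on the quoted description of DataFrame.filter/2.

```python
from typing import Callable, Dict, List

# Hypothetical toy "dataframe": column name -> list of values.
Frame = Dict[str, List]


def filter_with_callback(df: Frame, callback: Callable[[Frame], List[bool]]) -> Frame:
    """Callback style: the callback sees the whole frame and returns a boolean mask."""
    mask = callback(df)
    n_rows = len(next(iter(df.values()))) if df else 0
    if len(mask) != n_rows:
        raise ValueError("mask length must match the number of rows")
    # Keep only the positions where the mask is True, column by column.
    return {name: [v for v, keep in zip(col, mask) if keep] for name, col in df.items()}


def filter_rows(df: Frame, predicate: Callable[[dict], bool]) -> Frame:
    """Row-wise style: the predicate sees one row at a time, closer to how dplyr reads."""
    names = list(df)
    rows = [dict(zip(names, values)) for values in zip(*df.values())]
    kept = [row for row in rows if predicate(row)]
    return {name: [row[name] for row in kept] for name in names}


df = {"country": ["BR", "NZ", "BR"], "total": [10, 25, 40]}

# Callback against the frame: build the mask from a column yourself.
by_mask = filter_with_callback(df, lambda d: [t > 20 for t in d["total"]])

# Row-wise predicate: the condition reads like the sentence you would say out loud.
by_row = filter_rows(df, lambda row: row["total"] > 20)
```

Both calls keep the same rows; the difference is only in how much of the frame's plumbing leaks into the condition, which is roughly the pandas-vs-dplyr distinction above.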

[+] throwamon|3 years ago|reply
For further inspiration, this is a pretty good-looking "dplyr for Python": https://github.com/machow/siuba

There's precedent in Ecto for a "magic" operator (`^` if I'm not mistaken), so it wouldn't be a stretch to implement it here as well.

[+] peoplefromibiza|3 years ago|reply
Amazing!

Rustler precompiled[1] is exactly what I was looking for to start writing NIFs in Rust.

Having to bring in the entire Rust toolchain stopped me every time I thought about it - not to mention that at my current day job I'm not completely at liberty to decide which tools I can and cannot use.

Combined with Livebook[2] this is the perfect combo to lure my fellow data scientists into the Elixir ecosystem.

[1] https://dashbit.co/blog/rustler-precompiled

[2] https://livebook.dev/

[+] pdimitar|3 years ago|reply
> Having to bring in the entire Rust toolchain

I relate to the general sentiment, mind you, but Rust is brainlessly easy to install.

But even if you seriously don't want to even do that (two shell commands) you can still pull a Docker image and work inside it -- but that's a bit more involved, granted.

I applaud the Rustler Precompiled effort because apparently it lowers the barrier to entry for many -- yourself included -- but at least for me installing Rust was never an actual problem.

As for Explorer, I have a few friends who curse at Python and pandas every week; I might be able to "sell" them this project instead!