top | item 45127393

(no title)

ritchie46 | 5 months ago

He means that he wants our Rust library as easy as our Python lib. Which I understand as our focus has been mostly on Python.

It is where most of our userbase is and it is very hard for us to have a stable Rust API as we have a lot of internal moving parts which Rust users typically want access to (as they like to be closer to the metal), but has no stability guarantees from us.

In python, we are able to abstract and provide a stable API.

discuss

order

lucasyvas|5 months ago

I understand the user pool comment but don’t understand why you wouldn’t be able to have a rust layer that’s the same as the Python one API-wise.

I say this as a user of neither - just that I don’t see any inherent validity to that statement.

If you are saying Rust consumers want something lower level than you’re willing to make stable, just give them a higher level one and tell them to be happy with it because it matches your design philosophy.

bobbylarrybobby|5 months ago

The issue with Rust is that as a strict language with no function overloading (except via traits) or keyword arguments, things get very verbose. For instance, in python you can treat a string as a list of columns as in `df.select('date')` whereas in Rust you need to write `df.select([col('date')])`. Let's say you want to map a function over three columns, it's going to look something like this:

``` df.with_column( map_multiple( |columns| { let col1 = columns[0].i32()?; let col2 = columns[1].str()?; let col3 = columns[3].f64()?; col1.into_iter() .zip(col2) .zip(col3) .map(|((x1, x2), x3)| { let (x1, x2, x3) = (x1?, x2?, x3?); Some(func(x1, x2, x3)) }) .collect::<StringChunked>() .into_column() }, [col("a"), col("b"), col("c")], GetOutput::from_type(DataType::String), ) .alias("new_col"), ); ```

Not much polars can do about that in Rust, that's just what the language requires. But in Python it would look something like

``` df.with_columns( pl.struct("a", "b", "c") .map_elements( lambda row: func(row["a"], row["b"], row["c"]), return_dtype=pl.String ) .alias("new_col") ) ```

Obviously the performance is nowhere close to comparable because you're calling a python function for each row, but this should give a sense of how much cleaner Python tends to be.

tomtom1337|5 months ago

Ah, of course. Slightly ambiguous English tricked me there. Thank you Ritchie!

sureglymop|5 months ago

I apologize for that, English isn't my first language. Glad it was explained so well!