(no title)
lucasyvas | 5 months ago
I say this as a user of neither - just that I don’t see any inherent validity to that statement.
If you are saying Rust consumers want something lower level than you’re willing to make stable, just give them a higher level one and tell them to be happy with it because it matches your design philosophy.
bobbylarrybobby|5 months ago
``` df.with_column( map_multiple( |columns| { let col1 = columns[0].i32()?; let col2 = columns[1].str()?; let col3 = columns[3].f64()?; col1.into_iter() .zip(col2) .zip(col3) .map(|((x1, x2), x3)| { let (x1, x2, x3) = (x1?, x2?, x3?); Some(func(x1, x2, x3)) }) .collect::<StringChunked>() .into_column() }, [col("a"), col("b"), col("c")], GetOutput::from_type(DataType::String), ) .alias("new_col"), ); ```
Not much polars can do about that in Rust, that's just what the language requires. But in Python it would look something like
``` df.with_columns( pl.struct("a", "b", "c") .map_elements( lambda row: func(row["a"], row["b"], row["c"]), return_dtype=pl.String ) .alias("new_col") ) ```
Obviously the performance is nowhere close to comparable because you're calling a python function for each row, but this should give a sense of how much cleaner Python tends to be.
quodlibetor|5 months ago
I'm ignorant about the exact situation in Polars, but it seems like this is the same problem that web frameworks have to handle to enable registering arbitrary functions, and they generally do it with a FromRequest trait and macros that implement it for functions of up to N arguments. I'm curious if there are were attempts that failed for something like FromDataframe to enable at least |c: Col<i32>("a"), c2: Col<f64>("b")| {...}
https://github.com/tokio-rs/axum/blob/86868de80e0b3716d9ef39...
https://github.com/tokio-rs/axum/blob/86868de80e0b3716d9ef39...