sampo | 1 month ago
For better or worse, like Excel and like the simpler programming languages of old, Pandas lets you overwrite data in place.
Prepare some data:
import pandas as pd
import polars as pl
df_pandas = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [10, 20, 30, 40, 50]})
df_polars = pl.from_pandas(df_pandas)
And then:
df_pandas.loc[1:3, 'b'] += 1
df_pandas
   a   b
0  1  10
1  2  21
2  3  31
3  4  41
4  5  50
Polars comes from a more modern data engineering philosophy, where data is immutable. In Polars, if you ever wanted to do such a thing, you'd write a pipeline to process and replace the whole column:
df_polars = df_polars.with_columns(
    pl.when(pl.int_range(0, pl.len()).is_between(1, 3))
    .then(pl.col("b") + 1)
    .otherwise(pl.col("b"))
    .alias("b")
)
If you are just interactively playing around with your data, and want to do it in Python and not in Excel or R, Pandas might still hit the spot. Or use Polars and, if need be, temporarily convert the data to Pandas or even to a NumPy array, manipulate it, and then convert back.
P.S. Polars has an optimization to overwrite a single value:
df_polars[4, 'b'] += 5
df_polars
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 10  │
│ 2   ┆ 21  │
│ 3   ┆ 31  │
│ 4   ┆ 41  │
│ 5   ┆ 55  │
└─────┴─────┘
But as far as I know, it doesn't allow slicing or anything.
richardbachman|1 month ago
I believe it is just "syntax sugar" for calling `Series.scatter()`[1]
> it doesn't allow slicing
I believe you are correct:
You can do:
Perhaps nobody has requested slice syntax? It seems like it would be easy to add.
[1]: https://github.com/pola-rs/polars/blob/9079e20ae59f8c75dcce8...
goatlover|1 month ago
thijsn|1 month ago
pandas is write-optimized, so you can quickly and powerfully transform your data. Once you're used to it, it allows you to quickly get your work done. But figuring out what is happening in that code after returning to it a while later is a lot harder compared to Polars, which is more read-optimized. This read-optimized API coincidentally allows the engine to perform more optimizations because all implicit knowledge about data must be typed out instead of kept in your head.
unknown|1 month ago
[deleted]
thereisnospork|1 month ago