Mortiffer | 9 months ago

The R community has been hard at work on small data. I still much prefer working with in-memory data in R; dplyr and data.table are elegant and fast.

The CRAN packages are all high quality; if the maintainer stops responding to emails for 2 months, your package is automatically removed. Most packages come from university professors who have been doing this their whole career.

wodenokoto|9 months ago

A really big part of an in-memory, dataframe-centric workflow is how easy it is to do one step at a time and inspect the result.

With a database it is difficult to run a query, look at the result, and then run a query on that result. To me, that is what is missing when replacing pandas/dplyr/polars with DuckDB.

IanCal|9 months ago

I'm not sure I really follow. You can create new tables for any step if you want to do it entirely within the db, but you can also just run DuckDB against your dataframes in memory.
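
To make the "new table for each step" idea concrete, here is a rough sketch. It uses Python's built-in sqlite3 module as a stand-in for DuckDB so that it runs without installing anything; the `CREATE TEMP TABLE ... AS SELECT` statement should carry over to DuckDB as far as I know. The `sales` table, its columns, and the values are made up for illustration.

```python
import sqlite3

# Stand-in for DuckDB: sqlite3 ships with Python. Table and column
# names here are hypothetical, not taken from the thread.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, amount REAL)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10.0), ("north", 30.0), ("south", 5.0), ("south", 15.0)],
)

# Step 1: materialize an intermediate result as its own table, then inspect it.
con.execute(
    "CREATE TEMP TABLE by_region AS "
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
)
step1 = con.execute("SELECT * FROM by_region ORDER BY region").fetchall()
print(step1)

# Step 2: run a new query against the result of step 1.
step2 = con.execute("SELECT region FROM by_region WHERE total > 25").fetchall()
print(step2)
```

Each step leaves behind a named table you can look at before writing the next query, which is roughly the dataframe habit of assigning every intermediate result to a variable.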