vhhn | 1 year ago

Unlike many other languages, R has a native, built-in tabular data structure. So when your data are tabular, R is by far the best glue for building pipes between external libraries. If the data fit in RAM, they literally never have to leave the data.table object throughout the whole pipeline (including all the cleaning and transformations).

The only meaningful alternative I see is Python with maybe Polars or DuckDB.

stevenae|1 year ago

Fellow R / Polars user here. How (in what applications) do you use DuckDB?

tylermw|1 year ago

DuckDB is great for medium data: too big for memory, but small enough to fit on local storage. It's also extremely performant at loading data and supports a wide range of storage backends. On top of that, it's really well integrated with R and can really speed up certain queries, as long as they can be translated to valid SQL for the DuckDB engine to run.