(no title)
sandblast2 | 2 months ago
The mongodb instance is ephemeral, the database itself is ephemeral, both only exist while the script is running which can be measured in seconds. The structure is changing from month to month. All this plays to the strengths of mongodb while avoiding the usual problems. For eg one stage of the aggregate pipeline can only be 100MB? A source csv is a few megabytes at most.
Ps.: no, Excel can't do it, I got involved with this when the complexity to do it in Excel has become unbearable.
solatic|2 months ago
https://duckdb.org/docs/stable/data/csv/overview
https://duckdb.org/docs/stable/sql/functions/aggregates
cpursley|2 months ago
unknown|2 months ago
[deleted]