top | item 30337628

(no title)

1 points| kvnkho | 4 years ago


kvnkho|4 years ago

Hi HN,

I am one of the contributors to Fugue. Fugue is an open-source abstraction layer that ports Python/Pandas/SQL code to Spark or Dask. This article covers the programming interface and the benefits Fugue provides, specifically:

* Handling inconsistent behavior between different compute frameworks (Pandas, Spark, and Dask)
* Allowing reusability of code across Pandas-sized and Spark-sized data
* Dramatically speeding up testing and iteration cycles
* Enabling new users to be productive with Spark much faster
* Providing a SQL interface capable of handling end-to-end workflows
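To make the "write once, run on different backends" idea concrete, here is a minimal standard-library sketch. It is not Fugue's API: `run_on`, `LocalEngine`, and `PartitionedEngine` are hypothetical names made up for illustration, and the "distributed" engine is just a thread pool over partitions. The point is only that the business logic (`add_total`) never mentions the engine.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional

Row = Dict[str, int]
Logic = Callable[[List[Row]], List[Row]]


def add_total(rows: List[Row]) -> List[Row]:
    """Business logic written once, with no knowledge of the engine."""
    return [{**r, "total": r["a"] + r["b"]} for r in rows]


class LocalEngine:
    """Hypothetical stand-in for a single-machine backend."""

    def run(self, fn: Logic, rows: List[Row]) -> List[Row]:
        return fn(rows)


class PartitionedEngine:
    """Hypothetical stand-in for a distributed backend.

    Splits the rows into contiguous partitions and applies the same
    function to each partition in parallel.
    """

    def __init__(self, partitions: int = 2) -> None:
        self.partitions = partitions

    def run(self, fn: Logic, rows: List[Row]) -> List[Row]:
        # Ceiling division so that every row lands in some partition.
        size = max(1, -(-len(rows) // self.partitions))
        chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
        with ThreadPoolExecutor(max_workers=self.partitions) as pool:
            # pool.map yields results in input order, so row order is kept.
            results = pool.map(fn, chunks)
            return [row for part in results for row in part]


def run_on(rows: List[Row], fn: Logic, engine: Optional[object] = None) -> List[Row]:
    """Hypothetical entry point: same logic, engine chosen at call time."""
    engine = engine or LocalEngine()
    return engine.run(fn, rows)


data = [{"a": 1, "b": 2}, {"a": 3, "b": 4}, {"a": 5, "b": 6}]
local_result = run_on(data, add_total)
parallel_result = run_on(data, add_total, engine=PartitionedEngine(2))
```

Because the logic is a plain function, it can be unit tested on small local data and then handed to a different engine unchanged, which is the testing and iteration benefit described in the list above.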

A previous post here covered our SQL interface, which lets you use SQL on top of Pandas, Spark, and Dask: https://news.ycombinator.com/item?id=28830243. This post covers the broader project.

Our repo can be found here: https://github.com/fugue-project/fugue

Happy to answer any questions!