top | item 38779988

hoelle | 2 years ago

Intriguing. My feedback for whoever is writing this, the main site, the GitHub README, etc.: please put some work into a clarity/simplification pass.

factormeta|2 years ago

Likewise. Specifically, regarding this passage: > In practice, when used as a data warehouse, SPL does show different performance compared with traditional solutions. For example, in an e-commerce funnel analysis scenario, SPL is nearly 20 times faster than Snowflake even if running on a server with lower configuration; in a computing scenario of NAOC on clustering celestial bodies, the speed of SPL running on a single server is 2000 times faster than that of a cluster composed of a certain top distributed database. There are many similar scenarios, basically, SPL can speed up several times to dozens of times, showing very outstanding performance.

Wow, that really sounds amazing. I just wonder how a Java-based DB can outperform Snowflake (a columnar DB). Maybe the original implementation in Snowflake was not optimal? Then again, from personal experience, H2 in embedded mode is significantly faster than plain Postgres.

Judyrabbit|2 years ago

This post (https://blog.scudata.com/a-major-culprit-in-the-slow-running...) explains why Java-based SPL can run much faster than a C++-based database. BTW, SPL also supports columnar storage, and it can implement columnar storage in a single file. Here is a test report: https://blog.scudata.com/spl-computing-performance-test-seri.... It sounds amazing, but it is not mysterious. Many low-complexity algorithms cannot be implemented in SQL, so programmers can only rely on the database's optimizer. However, when the SQL is complex, the optimizer gets lost.
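
A rough illustration of what a "low-complexity algorithm" can look like, using top-N selection as an assumed example. It is not taken from the linked post and is written in Python, not SPL. When a query is written as sort-then-limit, avoiding the full sort is left to the optimizer, whereas in procedural code you can ask for the cheaper single-pass version directly. The function names here are hypothetical.

```python
import heapq
import random


def top_n_full_sort(values, n):
    """Sort everything, then keep the first n: O(m log m) for m values."""
    return sorted(values, reverse=True)[:n]


def top_n_heap(values, n):
    """Single pass with a size-n min-heap: O(m log n), no full sort."""
    if n <= 0:
        return []
    heap = []
    for v in values:
        if len(heap) < n:
            heapq.heappush(heap, v)
        elif v > heap[0]:
            # v beats the smallest of the current top n, so swap it in.
            heapq.heapreplace(heap, v)
    return sorted(heap, reverse=True)


if __name__ == "__main__":
    random.seed(0)
    data = [random.randint(0, 1_000_000) for _ in range(100_000)]
    # Both approaches give the same answer; only the amount of work differs.
    assert top_n_heap(data, 10) == top_n_full_sort(data, 10)
```

Whether a given database already performs this kind of rewrite for a particular query depends on its optimizer, so this only sketches the general idea behind the claim and says nothing about the benchmark numbers quoted above.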

Judyrabbit|2 years ago

This is a little difficult for SPL, because SPL is quite versatile. For example, it can be used as middleware for mixed computing over multiple data sources, to implement hot-swappable microservices, to replace stored procedures, as a high-performance data warehouse, as a computing engine for implementing a true lakehouse, alongside an OLTP database to achieve low-risk HTAP, and so on. It can even be used as an Excel plugin to help with desktop analysis. Computing is everywhere! Everyone cares only about their own problems, and those are a fraction of what SPL is used for. We can't predict what people will care about when they come to the homepage, so we have to list a little bit of everything, and the home page ends up somewhat cluttered. Simply skip the items you are not interested in and follow the links for the items you are. Thanks very much.