veritas3241 | 5 years ago
There's so much they can do from a user experience perspective to make it even better. The integration with Numeracy was a trainwreck, but the fundamentals of the DB are there.
Interesting to see they lose so much money, but I bet their margins have to be so thin running on the cloud. I wonder if they'll ever have to go bare metal to make it work.
FridgeSeal|5 years ago
Working with it was fraught with issues. Performance was mediocre at best, it was horribly expensive, and the Python and JS client libs had recurring issues with disconnecting and reconnecting. The advice given to us around scaling concurrent connections was bizarre at best. Teammates had numerous issues where it was clear corners had been cut in handling edge cases around certain Unicode characters. Their Snowpipe "streaming" implementation was...not good. The idea of having compute workers that "spun up and down" sounded good in theory, but in practice led to more bottlenecks and delays than anything else.
The AWS outage last year that prevented you from provisioning new instances essentially crippled our Snowflake DB.
I almost go out of my way to recommend people _not_ use it. I keep seeing it pop up, but mostly because it seems they're doing what MongoDB did in the early days and just throwing marketing money at capturing mindshare, as opposed to being an actually good product.
We changed to ClickHouse and the difference was night and day. The performance especially was far superior.
veritas3241|5 years ago
dataminded|5 years ago
They are always going to be less integrated and less infrastructure-cost-efficient than the native options (Redshift and BigQuery), without the R&D budgets and with incremental friction (sales) and risk (data privacy and cybersecurity).
AWS really should get around to buying them, like they should have bought Looker or Tableau or Mode or Fivetran or DBT, etc.
bpodgursky|5 years ago
Like, in a sane world I agree with you -- Redshift SHOULD have a crazy competitive advantage. But somehow they've been unable to execute on that goal for half a decade, and I don't see that changing quickly, given Snowflake's mindshare and growth.
hodgesrm|5 years ago
Example: you can play inside ball on storage infrastructure costs to get a 2x cost benefit at the expense of a lot of extra engineering. Better DBMS storage organization, which is available to any implementation, gets you 10x (or greater) improvement. Which would you rather have?
In fact, products like Redshift don't even really game the infrastructure prices. Costs to customers are comparable with Snowflake for equivalent resources as far as I can tell. They both charge what the market will bear.
manigandham|5 years ago
manigandham|5 years ago
deepGem|5 years ago
I fail to understand this network effect. Is there some conflation here? How does data sharing equate to a network effect? Something is fundamentally not adding up. If I share my data with 10 other customers, that should inherently enhance my own experience for it to count as a network effect. How does this happen with Snowflake?
suhel|5 years ago
1) Building a common platform where anyone can upload datasets, e.g. weather data, retail data, govt data, other open data, or closed data (copyright etc). They gave the example of COVID cases in their S-1 doc.
2) Providing a mechanism for others to find data through a marketplace; some data is free, some only via payment (with different monetisation models, e.g. per consumption, per month). Other customers can consume it as and when needed. Note, based on their S-1 doc, data is never copied when shared with others, so the cost of sharing with a wide audience is limited.
3) The more data on the platform, the more data is shareable in the 'marketplace' and the more data is used by everyone. This increases the value of the whole platform through network effects.
4) It also opens up alternative revenue streams, e.g. more revenue through storage (more data on the platform from different people) and revenue from shared data that is consumed (maybe).
Here is a company that is doing something similar in Australia. https://www.datarepublic.com/solutions/use-cases/data-collab...
veritas3241|5 years ago
tyingq|5 years ago
afpx|5 years ago
jwatte|5 years ago
If you have 80-100% utilization for a month, perhaps, but the beauty of Snowflake is that you can spin up a 3XL warehouse for a few MINUTES to get answers fast, and then shut it down again and pay nothing while it's off.
Saying "you could run it on self-managed Spark/Oracle/Hive/SQLite" is approximately the same argument as saying "I can run a web server cheaper myself than paying Amazon for an EC2 instance" -- there are cases where that is true, but there are many, many cases where the "on demand capacity" is the bigger benefit.
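To put rough numbers on the utilization argument, here is a back-of-the-envelope sketch. Every figure in it is a placeholder I made up for illustration (the credit burn rate, the price per credit, and the monthly cost of the always-on alternative are all assumptions, not quoted Snowflake or AWS pricing), and the function names are hypothetical. It only shows the shape of the comparison: short bursts cost a small fraction of an always-on cluster, and the two only meet at high utilization.

```python
# All constants are illustrative assumptions, not real pricing.
CREDITS_PER_HOUR_3XL = 64        # assumed credit burn rate of a 3XL warehouse
PRICE_PER_CREDIT = 3.00          # assumed dollars per credit
ALWAYS_ON_MONTHLY = 110_000.00   # assumed monthly cost of a comparable self-managed cluster


def on_demand_monthly_cost(minutes_per_day: float, days: int = 30) -> float:
    """Cost of running the warehouse only for the given minutes each day."""
    hours = minutes_per_day * days / 60
    return hours * CREDITS_PER_HOUR_3XL * PRICE_PER_CREDIT


def break_even_utilization(days: int = 30) -> float:
    """Fraction of the month the warehouse must run before always-on is cheaper."""
    full_month = on_demand_monthly_cost(24 * 60, days)
    return ALWAYS_ON_MONTHLY / full_month


if __name__ == "__main__":
    burst = on_demand_monthly_cost(minutes_per_day=10)
    print(f"10 min/day on demand: ${burst:,.2f} per month")
    print(f"always-on alternative: ${ALWAYS_ON_MONTHLY:,.2f} per month")
    print(f"break-even utilization: {break_even_utilization():.0%}")
```

Plug in your own contract rate and hardware costs; the break-even point moves a lot depending on those inputs, which is really the whole argument.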
unknown|5 years ago
[deleted]