aptxkid's comments

uvdn7 | 4 years ago | on: Snowflake’s response to Databricks’ TPC-DS post

I genuinely think the DeWitt clause is good for users (and bad for researchers). Without it, especially in the context of corporate competition, the company with the most marketing power will win. Users can always compare different products themselves. I am likely wrong, but please help me understand.

uvdn7 | 4 years ago | on: Snowflake’s response to Databricks’ TPC-DS post

I disagree. It makes sense for Snowflake to respond to what they think is an unreasonably bad result published by Databricks. They focused mostly on Snowflake’s own result and only compared dollar cost against Databricks. That’s consistent with their philosophy that a public benchmark war is beside the point and mostly a distraction.

uvdn7 | 4 years ago | on: Snowflake’s response to Databricks’ TPC-DS post

It’s probably just me, but the distinction between a data lake and a data warehouse seems like splitting hairs. Unstructured data can always be stored in structured databases. What’s the main reason for both to coexist?

uvdn7 | 4 years ago | on: Snowflake’s response to Databricks’ TPC-DS post

Whose result can be trusted is beside the point. I actually believe both experiments were likely conducted in good faith but with incomplete context. The point is there’s no good reason to start a benchmark war to begin with.

aptxkid | 4 years ago | on: Snowflake’s response to Databricks’ TPC-DS post

Personally, I think it’s a great response and very well written. I didn’t jump on the congrats-Databricks wagon when the result first came out because of the weird front-page comparison against Snowflake. Both companies are doing great work. Focusing on building a better product for your customers is much more meaningful than making your competitor look bad.

aptxkid | 4 years ago | on: Ask HN: Why are relational DBs are the standard instead of graph-based DBs?

It does, though one can argue that’s an implementation detail. The differences I mentioned above (e.g. sharding, transaction boundaries, secondary indices) should be generally applicable to non-relational DBs and not unique to TAO. I could be wrong though, as I know little about other graph DBs.

It would probably be more productive to compare RDB and Graph with certain workload examples (OLTP, OLAP, joins, scans, etc.).

aptxkid | 4 years ago | on: Ask HN: Why are relational DBs are the standard instead of graph-based DBs?

This is a great topic. I have been working on TAO for many years. I am not very familiar with other graph databases; I assume fundamentally they are more or less the same. Here are some differences between an RDB and a graph DB, IMHO:

1. Sharding and transaction boundaries
2. Secondary index support
3. How “join” works (e.g. give me a list of my friends who follow Justin Bieber)
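
To make the “join” example concrete, here is a rough sketch of how that query might look as edge traversal rather than a relational join. This is not TAO's API; `Graph`, `NodeId`, and `friends_who_follow` are hypothetical names, and the graph is a toy in-memory map with no sharding.

```cpp
#include <algorithm>
#include <unordered_map>
#include <unordered_set>
#include <vector>

using NodeId = int;

// Hypothetical toy graph: one adjacency map per edge type.
struct Graph {
    std::unordered_map<NodeId, std::unordered_set<NodeId>> friends;
    std::unordered_map<NodeId, std::unordered_set<NodeId>> follows;
};

// Friends of `me` who follow `target`: walk the "friends" edges of one node,
// then probe each friend's "follows" edges. Result is sorted for stable output.
std::vector<NodeId> friends_who_follow(const Graph& g, NodeId me, NodeId target) {
    std::vector<NodeId> result;
    auto mine = g.friends.find(me);
    if (mine == g.friends.end()) return result;
    for (NodeId f : mine->second) {
        auto it = g.follows.find(f);
        if (it != g.follows.end() && it->second.count(target) > 0) {
            result.push_back(f);
        }
    }
    std::sort(result.begin(), result.end());
    return result;
}
```

In a sharded store each friend's edge list may live on a different shard, which is presumably where the sharding and transaction-boundary differences start to matter.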

You’re right that a graph DB is very easy to use a lot of the time.

aptxkid | 4 years ago | on: Why C++ Is Terrifying

The tweet is certainly loaded, hence the point of this HN post lol.

First of all, there's std::function, which uses type erasure (https://blog.the-pans.com/type-erasure/). It means std::function<void(int)> can hold anything callable that takes an int and returns void (a lambda, a function pointer, an object with an operator() overload, etc.). Notice that these are all different types! Hence type erasure.
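
A minimal sketch of that point, with made-up names (`free_function`, `Functor`, `call_all_erased`): three callables of unrelated types all fit in the same std::function<void(int)>.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// A plain free function with signature void(int).
void free_function(int x) { (void)x; }

// A function object; it records its argument through a pointer.
struct Functor {
    int* sink;
    void operator()(int x) const { *sink += x; }
};

// Stores three different callable types behind one erased type,
// calls each with 1, and returns the total they accumulated.
int call_all_erased() {
    int total = 0;
    auto lambda = [&total](int x) { total += x; };

    std::vector<std::function<void(int)>> callbacks;
    callbacks.push_back(&free_function);   // function pointer
    callbacks.push_back(lambda);           // closure type
    callbacks.push_back(Functor{&total});  // class with operator()
    assert(callbacks.size() == 3);

    for (const auto& cb : callbacks) cb(1);
    return total;  // the lambda and the Functor each added 1
}
```

The vector's element type says nothing about which of the three it holds; that information only exists at run time.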

How std::function manages its memory is poorly specified. But cppreference at least notes that if it's initialized from a function pointer (a free function, no captures), it's guaranteed not to allocate: https://en.cppreference.com/w/cpp/utility/functional/functio...

> When the target is a function pointer or a std::reference_wrapper, small object optimization is guaranteed, that is, these targets are always directly stored inside the std::function object, no dynamic allocation takes place.
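
One rough way to check this on a given toolchain is to count calls to the global operator new around the construction. This is only a sketch: replacing operator new affects the whole program, and `g_allocations`, `handler`, and `allocations_for_function_pointer` are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>

// Counts every call to the (replaced) global operator new.
static std::size_t g_allocations = 0;

void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

void handler(int) {}

// Number of heap allocations made while wrapping a function pointer
// in std::function and calling it once.
std::size_t allocations_for_function_pointer() {
    const std::size_t before = g_allocations;
    std::function<void(int)> f = &handler;
    f(0);
    return g_allocations - before;
}
```

If the quoted guarantee holds for your standard library, the function returns 0. What happens for a lambda with large captures is left to the implementation.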

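In this case, copying the std::function is as cheap as copying a raw pointer, since no allocation is involved. However, there's no way to know this at compile time, precisely because the type is erased in std::function. The std::function type itself never satisfies std::is_trivially_copyable, whatever it happens to hold.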
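
A small sketch of the compile-time vs. run-time gap (C++17; `Callback`, `RawPointer`, `noop`, and `holds_function_pointer` are names made up here). The type traits only see the erased wrapper, so whether the target is a plain function pointer can only be asked at run time, via target<T>().

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

using Callback = std::function<void(int)>;
using RawPointer = void (*)(int);

// The compiler knows a raw function pointer is trivially copyable...
static_assert(std::is_trivially_copyable_v<RawPointer>);
// ...but the erased wrapper has a user-provided copy constructor and
// destructor, so the trait is false no matter what it stores.
static_assert(!std::is_trivially_copyable_v<Callback>);

void noop(int) {}

// Run-time check: target<T>() returns non-null only if the stored
// callable has exactly type T (this relies on RTTI being enabled).
bool holds_function_pointer(const Callback& cb) {
    return cb.target<RawPointer>() != nullptr;
}
```

For example, `holds_function_pointer(Callback(&noop))` is true, while a Callback built from a lambda reports false, because the stored target is then the closure type rather than a function pointer.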
