mattlord | 3 years ago

This is at least the second time you've made this exact same comment on an article. So in an attempt to respond to what I will assume are arguments made in good faith...

A database stores and serves requested data. Pricing is based on your usage of this service: in this case, how much data you are requesting it to serve. It does not matter whether a piece of data is cached or not. It's a database service, not a block device service, and certainly not one with an independent cache in front of it that is somehow magically free to operate. Are you insinuating that e.g. Redis should always be free because its keys/records are in memory? I'm sorry, but I fail to see the logic in your argument here.

Of course ELT jobs consume resources. You're using the service, and you're paying based on your usage. This is simply how serverless / usage-based models work. Again, I fail to follow the logic here.

innodb_rows_read is a count of the rows that MySQL (the query execution layer) reads from InnoDB (via the storage engine API). It does not count reads from internal / system / data dictionary tables, as those are covered by innodb_system_rows_read.

It's not "a bug" that it is what it is. InnoDB internally reads (B+ tree) pages and index records, as it uses index-organized tables. That's what happens with Index Condition Pushdown: InnoDB can apply query predicates directly as it's examining the index leaf nodes. It's only when pushing results back up to MySQL through the storage engine interface that these InnoDB records get converted to MySQL's generic row format (which includes going from InnoDB's arch-independent big-endian format to the system / little-endian format). This metric is telling you exactly what it should (not a bug): how many rows MySQL reads from InnoDB. There are other handler and InnoDB-specific internal metrics that you can use if you want other numbers (e.g. information_schema.innodb_metrics).

JFG's blog post was (correctly) noting that if you as a system operator are trying to calculate the full cost of a query, then you cannot rely solely on how many rows MySQL reads from InnoDB. You instead have to look at how much work InnoDB does (which is much more challenging, as you have B-tree traversal and maintenance, MVCC overhead, prefetching, etc. involved), and ultimately at how many system resources are used in total (I/O amplification being one factor with InnoDB's index-organized tables and update-in-place model). That means parsing and optimizing costs, how many bytes are read from disk, how many pages are read from memory, whether a temp table was used, whether a sort file was used, etc.

Again, this innodb_rows_read metric's intended behavior/meaning is NOT a bug, and this metric will always favor the user for usage-based billing. As for your insinuation that this is somehow buggy and cannot be trusted as a metric, so beware: here I would say that you are simply mistaken. This was an intentional decision made to offer simplicity and transparency, and to benefit users (we also go to great lengths to subtract rows read due to internal operations, just as MySQL itself does).
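
If you want to see what the counter does for one of your own queries, here is a rough sketch in Python of diffing it before and after the query. The helper names (`rows_read_delta`, `measure_rows_read`, `fetch_status`, `run_query`) are made up for illustration and are not part of MySQL or of any billing pipeline. The only real piece is the `SHOW GLOBAL STATUS` statement, which you would run through whatever client you use.

```python
from typing import Callable, Dict

# Statement a real fetch_status() would run against MySQL. The status
# variable is exposed by the server as Innodb_rows_read.
STATUS_QUERY = "SHOW GLOBAL STATUS LIKE 'Innodb_rows_read'"


def rows_read_delta(before: Dict[str, int], after: Dict[str, int],
                    key: str = "Innodb_rows_read") -> int:
    """Difference in the counter between two status snapshots."""
    return after[key] - before[key]


def measure_rows_read(fetch_status: Callable[[], Dict[str, int]],
                      run_query: Callable[[], None]) -> int:
    """Snapshot the counter, run the query, snapshot again, return the delta.

    fetch_status and run_query are hypothetical callables supplied by the
    caller. The counter is server-wide, so on a busy server the delta also
    includes rows read by other sessions.
    """
    before = fetch_status()
    run_query()
    after = fetch_status()
    return rows_read_delta(before, after)
```

On an otherwise idle test server, the delta is the number of rows MySQL read from InnoDB for that query, which is the quantity described above. It does not capture the extra work InnoDB did internally.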

Hopefully this helps to allay your concerns.
