natnat | 2 years ago

We use a kind of funky BigQuery setup at my shop to reduce our spend, and I think it's based on a quirk in BigQuery pricing that Google doesn't explicitly recommend.

BigQuery has two separate pricing models, on-demand and slot-based, and they bill on entirely different things:

* On-demand pricing bills per GB of data processed by your query.

* Slot-based (or editions) pricing allocates your project a number of CPU/memory slots, and you pay per slot-second.

You can find what each query would cost under either model by looking at the total_bytes_billed and total_slot_ms columns in the INFORMATION_SCHEMA.JOBS_BY_ORGANIZATION view, and multiplying those values by the slot cost (total_slot_ms / 1e6 * 0.01111, i.e. 0.01111 per million slot-ms, roughly 0.04 per slot-hour) and the bytes-billed cost (total_bytes_billed / 1048576 * 0.0000059, i.e. 0.0000059 per MiB, roughly 6.19 per TiB). Check those rates against your own region and edition. Then you can go through your queries and allocate them to either on-demand or slot-based pricing, depending on which is cheaper.

Usually slot-based is cheaper, but queries that do a lot of internal joins can rack up really huge CPU costs while costing very little on-demand, as long as they're not reading a lot of bytes.
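
As a rough sketch of that comparison in Python: the two rates below are the multipliers quoted above, and I'm assuming they are meant per million slot-ms and per MiB billed (that reading is mine, not something stated explicitly). The function names are made up for illustration, so swap in your own rates and naming.

```python
# Assumed rates, taken from the multipliers quoted above. Replace with your own.
RATE_PER_MILLION_SLOT_MS = 0.01111  # assumption: about 0.04 per slot-hour
RATE_PER_MIB_BILLED = 0.0000059     # assumption: about 6.19 per TiB

MIB = 1024 * 1024


def slot_cost(total_slot_ms: int) -> float:
    """Estimated cost of a job under slot-based pricing."""
    return total_slot_ms / 1_000_000 * RATE_PER_MILLION_SLOT_MS


def on_demand_cost(total_bytes_billed: int) -> float:
    """Estimated cost of a job under on-demand pricing."""
    return total_bytes_billed / MIB * RATE_PER_MIB_BILLED


def cheaper_model(total_bytes_billed: int, total_slot_ms: int) -> str:
    """Return 'on-demand' if it is strictly cheaper, else 'slot-based'."""
    if on_demand_cost(total_bytes_billed) < slot_cost(total_slot_ms):
        return "on-demand"
    return "slot-based"


# A join-heavy job: only 10 GiB billed, but 10 slot-hours of compute.
join_heavy = cheaper_model(10 * 1024**3, 36_000_000)

# A scan-heavy job: 1 TiB billed, but only 1 slot-hour of compute.
scan_heavy = cheaper_model(1024**4, 3_600_000)
```

With these assumed rates the join-heavy job comes out cheaper on-demand and the scan-heavy job cheaper on slots, which is the pattern described above.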

Somewhat annoyingly, these billing models are configured at a per-project level; you can't switch between the two of them in a single project. Fortunately, you can query tables from other projects easily.
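
A minimal sketch of what that routing could look like, assuming one project per billing model plus a third project that holds the data. All three project IDs and both helper names are hypothetical. The point is just that the job runs (and is billed) in one project while the SQL references a fully qualified table in another.

```python
# Hypothetical project IDs, for illustration only.
ON_DEMAND_PROJECT = "acme-on-demand"    # project left on on-demand pricing
SLOTS_PROJECT = "acme-reserved-slots"   # project with a slot reservation
DATA_PROJECT = "acme-warehouse"         # project that owns the tables


def billing_project(model: str) -> str:
    """Map a pricing model name to the project the job should run in."""
    projects = {"on-demand": ON_DEMAND_PROJECT, "slot-based": SLOTS_PROJECT}
    if model not in projects:
        raise ValueError(f"unknown pricing model: {model!r}")
    return projects[model]


def qualified_table(dataset: str, table: str, project: str = DATA_PROJECT) -> str:
    """Build a backtick-quoted `project.dataset.table` reference for use in SQL."""
    return f"`{project}.{dataset}.{table}`"


# The query text is the same whichever project ends up paying for it.
sql = f"SELECT COUNT(*) FROM {qualified_table('sales', 'orders')}"
run_in = billing_project("on-demand")

# With the google-cloud-bigquery client, the job would then be submitted from
# the chosen project, e.g. bigquery.Client(project=run_in).query(sql)
```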
