top | item 41323130

(no title)

roanakb | 1 year ago

Nice, seems like ML Productivity Goodput is a pretty well thought-out metric to understand the overall efficiency of your cluster. I'll consider adding this into our cluster management platform. Only potential drawbacks I'd guess are it being somewhat difficult to compute since it relies on metrics like MFUs, and not something we can observe layer-by-layer to understand inefficient kernels, but I'll take a deeper look. Thanks!

discuss

order

No comments yet.