Great benchmark, very interesting. However, I am not sure about extrapolating H200 numbers from the Lambda benchmark. From my understanding, Lambda's benchmark and theirs used different models (Llama 405B and Mistral 123B), with different benchmark setups and inference libraries. Since the study focuses on memory-hungry scenarios, I am really curious why they chose the H100 instead of the H200.
bihan_rana | 1 year ago
mufasachan | 1 year ago