scoresmoke | 7 months ago

You might also consider a fast implementation of Elo and Bradley–Terry that I have been developing for some time: https://github.com/dustalov/evalica (Rust core, Python bindings, 100% test coverage, and a nice API).

swyx|7 months ago

would you consider JS bindings? should be easy to vibe code given what you have. bonus points if it runs in the browser (eg export the wasm binary). thank you!

scoresmoke|7 months ago

I have been thinking about this for a while, and I think I'll vibecode them. I'm not sure about WASM, though: the underlying libraries would all need to support it too, and I'm not sure that every one of them does.

npip99|7 months ago

In our case, training the models and running inference takes days, while calculating all of the Elo scores takes about 1 min, haha. So we didn't need to optimize the calculation.

But, we did need to work on numeric stability!

I have our calculations here: https://hackmd.io/@-Gjw1zWMSH6lMPRlziQFEw/B15B4Rsleg

tl;dr: Wikipedia iterates on <e^elo>, but that can underflow to zero or overflow to infinity. Iterating on <elo> directly stayed between -4 and 4 in all of our observed pairwise matrices, so it's very well-bounded.
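To make the idea concrete, here is a minimal sketch of a Bradley–Terry fixed-point iteration carried out on log-strengths instead of raw strengths. It is not necessarily the derivation in the linked note: the function names (`bradley_terry_log`, `log_add_exp`, `logsumexp`), the simultaneous MM-style update, and the mean-zero normalization are all assumptions made for illustration. It also assumes every item has at least one win and at least one comparison, since otherwise `log(0)` appears.

```python
import math


def logsumexp(values):
    """Stable log(sum(exp(v))) for a non-empty list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_add_exp(a, b):
    """Stable log(exp(a) + exp(b))."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def bradley_terry_log(wins, iterations=200):
    """Estimate log-strengths from a win matrix (hypothetical helper).

    wins[i][j] is the number of times item i beat item j.
    Assumes every item has at least one win and one comparison.
    Returns log-strengths shifted so that their mean is zero.
    """
    n = len(wins)
    theta = [0.0] * n
    for _ in range(iterations):
        new_theta = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # log of each term n_ij / (p_i + p_j), with p = exp(theta)
            terms = [
                math.log(wins[i][j] + wins[j][i]) - log_add_exp(theta[i], theta[j])
                for j in range(n)
                if j != i and wins[i][j] + wins[j][i] > 0
            ]
            # log of p_i' = W_i / sum_j n_ij / (p_i + p_j)
            new_theta.append(math.log(total_wins) - logsumexp(terms))
        # Subtracting the mean fixes the scale (geometric-mean normalization).
        mean = sum(new_theta) / n
        theta = [t - mean for t in new_theta]
    return theta


if __name__ == "__main__":
    # Two items, the first winning 3 of 4 games.
    print(bradley_terry_log([[0, 3], [1, 0]]))
```

The update is the same one you would write for the strengths themselves; only the representation changes. Sums of exponentials go through `logsumexp`, so no intermediate value has to be exponentiated on its own.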

scoresmoke|7 months ago

I mostly work on post-training and evaluation tasks, and I built Evalica as a convenient tool for my own use cases. The computation is fast enough that the user never has to wait on it, and the library does not get in my way during analysis.