top | item 45388919

snake_doc | 5 months ago

Um... the model is tiny: https://github.com/thinking-machines-lab/manifolds/blob/main...

jasonjmcghee | 5 months ago

Yeah, it's just the wrong architecture for the job, so I found it to be a strange example.

Here's the top model on DAWNBench - https://github.com/apple/ml-cifar-10-faster/blob/main/fast_c...

It trains for 15 epochs and, like all the others, is a 9-layer ResNet.
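
For anyone curious about the shape of these fast CIFAR-10 models, here's a rough sketch of the commonly used "ResNet9" layout (8 convs + 1 linear = 9 weight layers). This is a hedged approximation, not the actual apple/ml-cifar-10-faster code; the channel widths and pooling placement are assumptions based on the typical DAWNBench-style architecture:

```python
# Hypothetical sketch of a 9-layer ResNet for CIFAR-10 (32x32 RGB inputs).
# Details (channels, pooling, head) are assumptions, not the linked repo's code.
import torch
import torch.nn as nn

def conv_bn(c_in, c_out, pool=False):
    # One "layer": 3x3 conv + batch norm + ReLU, with optional 2x2 max-pool.
    layers = [nn.Conv2d(c_in, c_out, 3, padding=1, bias=False),
              nn.BatchNorm2d(c_out),
              nn.ReLU(inplace=True)]
    if pool:
        layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)

class Residual(nn.Module):
    # Identity skip around two conv_bn layers.
    def __init__(self, c):
        super().__init__()
        self.inner = nn.Sequential(conv_bn(c, c), conv_bn(c, c))

    def forward(self, x):
        return x + self.inner(x)

def resnet9(num_classes=10):
    # 8 conv layers + 1 linear layer = 9 weight layers total.
    return nn.Sequential(
        conv_bn(3, 64),                 # prep          (conv 1)
        conv_bn(64, 128, pool=True),    # 32x32 -> 16x16 (conv 2)
        Residual(128),                  #               (convs 3-4)
        conv_bn(128, 256, pool=True),   # 16x16 -> 8x8   (conv 5)
        conv_bn(256, 512, pool=True),   # 8x8 -> 4x4     (conv 6)
        Residual(512),                  #               (convs 7-8)
        nn.AdaptiveMaxPool2d(1),        # 4x4 -> 1x1
        nn.Flatten(),
        nn.Linear(512, num_classes),    # classifier    (layer 9)
    )

model = resnet9()
out = model(torch.randn(2, 3, 32, 32))
print(out.shape)  # torch.Size([2, 10])
```

Tiny by modern standards, but with the right training schedule these reach ~94% test accuracy in minutes on a single GPU, which is the whole point of the DAWNBench entries.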

srean | 5 months ago

Usually there's more to an ML or data-science idea (one that's not a fully fleshed-out journal paper) than beating a SOTA benchmark.

In fact, beating SOTA is often the least interesting part of an interesting paper, and SOTA-blinded reviewers often use it as a gatekeeping device.