item 45388919
snake_doc | 5 months ago
Um.. the model is tiny: https://github.com/thinking-machines-lab/manifolds/blob/main...
jasonjmcghee | 5 months ago
Yeah, it's just the wrong architecture for the job, so I found it to be a strange example.
Here's the top model on DAWNBench - https://github.com/apple/ml-cifar-10-faster/blob/main/fast_c...
It trains for 15 epochs and, like all the others, is a 9-layer ResNet.
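(For context on what a "9-layer ResNet" means here: the fast CIFAR-10 entries on DAWNBench typically count weight layers — a prep convolution, three convolutional stages where two stages carry a residual pair, and a final linear layer. Below is a minimal PyTorch sketch of that shape; the channel widths and stage layout are assumptions for illustration, not read from the linked file.)

```python
# Hypothetical sketch of a 9-weight-layer CIFAR-10 ResNet in the
# DAWNBench style. Channel widths (64/128/256/512) are assumed, not
# taken from the linked repo.
import torch
import torch.nn as nn


def conv_bn(c_in, c_out):
    """One 3x3 conv + batch norm + ReLU (counts as one weight layer)."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )


class Residual(nn.Module):
    """Identity shortcut around two conv_bn blocks (two weight layers)."""

    def __init__(self, c):
        super().__init__()
        self.res = nn.Sequential(conv_bn(c, c), conv_bn(c, c))

    def forward(self, x):
        return x + self.res(x)


class Net9(nn.Module):
    """Prep conv + 3 stages (two with a residual pair) + linear head:
    1 + 1 + 2 + 1 + 1 + 2 + 1 = 9 weight layers."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.body = nn.Sequential(
            conv_bn(3, 64),                                   # prep
            conv_bn(64, 128), nn.MaxPool2d(2), Residual(128),  # stage 1
            conv_bn(128, 256), nn.MaxPool2d(2),                # stage 2
            conv_bn(256, 512), nn.MaxPool2d(2), Residual(512), # stage 3
            nn.AdaptiveMaxPool2d(1), nn.Flatten(),
        )
        self.fc = nn.Linear(512, num_classes)

    def forward(self, x):
        return self.fc(self.body(x))


model = Net9()
out = model(torch.randn(2, 3, 32, 32))  # a batch of two 32x32 RGB images
print(out.shape)  # torch.Size([2, 10])
```

Training such a net for ~15 epochs with a one-cycle learning-rate schedule is what gets the fast DAWNBench times; the sketch above only shows the architecture, not the training loop.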
srean | 5 months ago
Usually there's more to an ML or data-science idea (one that isn't a fully fledged journal paper) than beating a SOTA benchmark.
In fact, beating SOTA is often the least interesting part of an interesting paper, and SOTA-blind reviewers often use it as a gatekeeping device.