It's often viable to implement only prediction (the inference path) without implementing training, especially if there are published weights or a reference implementation to check against. That doesn't work for papers where the key contribution is a change to the pre-training objective, training methodology, or optimizer, but it is useful for papers where the key contribution is architectural.
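To make the idea concrete, here is a rough sketch of what "prediction only" can look like: a forward pass written from scratch and checked against a known reference output. Everything in it is an assumption for illustration, not taken from any particular paper: the tiny two-layer MLP, the `WEIGHTS` values standing in for published weights, and the helper names `predict` and `matches_reference`.

```python
import math

# Hypothetical stand-in for published weights of a tiny 2-layer MLP.
# The values are made up for illustration only.
WEIGHTS = {
    "w1": [[0.5, -1.0], [1.5, 2.0]],  # 2 inputs -> 2 hidden units
    "b1": [0.0, 0.5],
    "w2": [[1.0], [-2.0]],            # 2 hidden units -> 1 output
    "b2": [0.25],
}

def linear(x, w, b):
    """Compute x @ w + b for a single input vector x."""
    return [
        sum(xi * w[i][j] for i, xi in enumerate(x)) + b[j]
        for j in range(len(b))
    ]

def relu(v):
    """Elementwise max(0, a)."""
    return [max(0.0, a) for a in v]

def predict(x, weights=WEIGHTS):
    """Forward pass only: no loss, no gradients, no optimizer."""
    hidden = relu(linear(x, weights["w1"], weights["b1"]))
    return linear(hidden, weights["w2"], weights["b2"])

def matches_reference(x, expected, tol=1e-6):
    """Compare our output with an output from a reference implementation."""
    actual = predict(x, weights=WEIGHTS)
    return len(actual) == len(expected) and all(
        math.isclose(a, e, abs_tol=tol) for a, e in zip(actual, expected)
    )

if __name__ == "__main__":
    print(predict([1.0, 2.0]))
```

In practice the weights would be loaded from the released checkpoint and the expected outputs would come from running the reference implementation on the same inputs; matching them within a tolerance is decent evidence that the architecture was reimplemented correctly, even though no training code exists.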
