Often it might be viable to implement prediction w/o necessarily implementing training (especially if there are published weights or a reference implementation). Not viable for papers where the key contribution is a change to the pre-training objective / training methodology / optimizer, but useful for papers where the key contribution is architectural.
No comments yet.