top | item 42054896


chessgecko | 1 year ago

For decode steps it depends on how many requests you batch at a time. If your batch size is 1 it runs in line with the active params. As you get to around batch size 8, different tokens route to different experts, so you end up reading all the params and it runs in line with total params. As you increase to 128ish, the full weight read is amortized across the batch and it runs like the active params again.
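A rough sketch of why those three batch-size regimes show up. All the numbers here are assumptions for illustration (an 8-expert, top-2 MoE, roughly Mixtral-8x7B-shaped, with uniform independent routing, which real routers don't guarantee):

```python
# Expected distinct experts touched per MoE layer as batch size grows.
# Decode is memory-bandwidth bound at small batches, so "experts read"
# is a proxy for weight traffic per decode step.

E = 8   # experts per MoE layer (assumed)
K = 2   # experts activated per token (assumed)

def expected_experts_hit(batch):
    # Expected number of distinct experts hit by `batch` tokens,
    # assuming each token picks K of E experts uniformly at random
    # (a simplification of learned routing).
    return E * (1 - (1 - K / E) ** batch)

for batch in (1, 8, 128):
    hit = expected_experts_hit(batch)
    print(f"batch={batch:4d}: ~{hit:4.1f}/{E} experts read per layer, "
          f"~{hit / batch:5.2f} experts' weights per token")
```

At batch 1 you read only the active experts (runs like active params); by batch 8 almost every expert is hit somewhere in the batch (runs like total params); at batch 128 the full read is shared across many tokens, so per-token weight traffic is tiny and you're compute-bound again.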

For the context encode (prefill) it’s always close to as fast as a model with a similar number of active params, since the big batch of prompt tokens amortizes reading all the weights anyway.

For running it on your own, the issue is going to be fitting all the params on your GPU. If you’re already loading weights off disk anyway, this will be faster; but if the total param count is what forces you to put stuff on disk, it will be much slower.
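A back-of-the-envelope check of that fitting problem. Decode only touches the active experts, but every expert must be resident somewhere; the model sizes and VRAM figure below are assumptions for illustration (roughly Mixtral-8x7B total/active counts, a 24 GB consumer GPU):

```python
# Does the *total* param count fit in VRAM at fp16?
total_params    = 47e9  # all experts + shared weights (assumed)
active_params   = 13e9  # weights touched per token (assumed)
bytes_per_param = 2     # fp16/bf16
vram_gb         = 24    # single consumer GPU (assumed)

need_gb = total_params * bytes_per_param / 1e9
print(f"need ~{need_gb:.0f} GB for all params, have {vram_gb} GB")
if need_gb > vram_gb:
    # Spilling experts to CPU RAM or disk makes every decode step wait
    # on a much slower link, so tokens/sec drops far below the
    # "active params only" estimate.
    print("won't fit: weights spill off-GPU and decode slows way down")
```

So even though only ~13B params are active per token in this example, you still pay the memory footprint of all ~47B.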
