top | item 36091887

emikulic | 2 years ago

Last time I looked into this, the answer was "because huggingface transformers and torch.load are written to do it this way."

You could absolutely stream the weights, or mmap them instead of loading them all into system RAM. The default interfaces just don't do that.
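To make the mmap idea concrete, here is a minimal sketch using only Python's standard library. It is not how torch.load or transformers work internally. The headerless float32 file format and the helper names `write_weights` and `read_slice_mmap` are made up for illustration. The point is just that a memory-mapped file lets you pull out one slice of the weights, with the OS paging in only the parts you touch.

```python
import mmap
import os
import struct
import tempfile


def write_weights(path, values):
    """Write values as raw little-endian float32 (hypothetical headerless format)."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}f", *values))


def read_slice_mmap(path, start, count):
    """Return `count` float32 values starting at index `start`.

    The file is memory-mapped read-only, so only the pages backing the
    requested slice need to be brought into RAM, not the whole file.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            offset = start * 4  # 4 bytes per float32
            return list(struct.unpack_from(f"<{count}f", mm, offset))


# Stand-in for a large checkpoint: 1000 "weights" on disk.
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
write_weights(path, [float(i) for i in range(1000)])

chunk = read_slice_mmap(path, 500, 3)
print(chunk)
```

A real checkpoint would need a header describing tensor names, dtypes, and offsets, but the access pattern would be the same: map the file, then read only the ranges you need.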
