
streamer45 | 1 month ago

All good, cheers!

schopra909 | 1 month ago

Per the RAM comment, you may be able to get it to run locally with two tweaks (rough sketch after the list):

https://github.com/Linum-AI/linum-v2/blob/298b1bb9186b5b9ff6...

1) Free up the T5 as soon as the text is encoded, so you reclaim GPU RAM

2) Manual layer offloading: move layers off the GPU once they're done being used, to free up space for the remaining layers + activations
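
Roughly, that could look like the following. This is a minimal sketch assuming a PyTorch pipeline; the names `pipe`, `text_encoder`, and `blocks` are illustrative, not the repo's actual API.

    import gc
    import torch

    def encode_prompt(pipe, prompt):
        # Tweak 1: run the T5 encoder once, keep only the embeddings,
        # then drop the encoder weights and reclaim GPU memory.
        with torch.no_grad():
            text_emb = pipe.text_encoder(prompt)  # illustrative call
        pipe.text_encoder = None
        gc.collect()
        torch.cuda.empty_cache()
        return text_emb

    def forward_with_offload(blocks, hidden, device="cuda"):
        # Tweak 2: manual layer offloading -- keep weights on CPU and
        # move each block onto the GPU only while it is running.
        for block in blocks:
            block.to(device)
            hidden = block(hidden)
            block.to("cpu")  # evict the finished layer
            torch.cuda.empty_cache()
        return hidden

The offloading trades speed for memory, since every block is copied to the GPU on each forward pass, but peak VRAM drops to roughly one block's weights plus activations.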

dsrtslnd23 | 1 month ago

Any idea on the minimum VRAM footprint with those tweaks? 20GB seems high for a 2B model. I guess the T5 encoder is responsible for that.