item 42867346

monobot12 | 1 year ago
If you don't mind a speed of 1 token per second, you can run the largest R1 model on a 2021 iMac, as I just did.

btbuildem | 1 year ago
Largest R1, as in the 671B? How do you accomplish that feat?

oynqr | 1 year ago
Just do it? Llama.cpp doesn't load the entire thing into ram. It mmaps the file and the kernel takes care of the rest.

jeffbee | 1 year ago
Are we speaking of a 2020-edition Intel 27" iMac or a 2021 M1?
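
For anyone wondering what "it mmaps the file and the kernel takes care of the rest" looks like in practice, here is a minimal Python sketch of the general technique. It is not llama.cpp's code (that is C/C++), and the helper names `make_demo_file` and `read_slice_via_mmap`, the file name `fake_weights.bin`, and the sizes are all made up for illustration. The idea is that mapping a file does not read it; the OS pages in only the regions you actually touch, and can evict them again under memory pressure, which is presumably why a model much larger than RAM can still run, just slowly.

```python
import mmap
import os
import tempfile


def make_demo_file(path, size_bytes, marker, marker_offset):
    """Create a zero-filled file of size_bytes with `marker` written at marker_offset.

    Hypothetical stand-in for a large weights file.
    """
    with open(path, "wb") as f:
        f.truncate(size_bytes)   # extend the file to the target size
        f.seek(marker_offset)
        f.write(marker)          # overwrite a few bytes in the middle


def read_slice_via_mmap(path, offset, length):
    """Return `length` bytes at `offset` without read()-ing the whole file."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file. Mapping alone does not load the data;
        # pages are faulted in by the kernel when they are first accessed.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "fake_weights.bin")
        size = 64 * 1024 * 1024       # 64 MiB demo file
        offset = 48 * 1024 * 1024     # pretend a tensor lives here
        make_demo_file(path, size, b"tensor", offset)
        print(read_slice_via_mmap(path, offset, 6))
```

The slice access only needs the page(s) around `offset`, not the preceding 48 MiB. Scale the same idea up to a file of several hundred GB and you get the trade-off from the top comment: it works, but every token may involve pulling weights back in from disk.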