top | item 42852300

MyFirstSass | 1 year ago

Is this akin to the quants already being done to various models when you download a GGUF at 4 bits, for example, or is this variable layer compression something new that can also make existing smaller models even smaller, so we can fit more into, say, 12 or 16 GB of VRAM?
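For anyone unfamiliar with what a 4-bit quant actually does: the core idea is mapping float weights to small integers plus a per-group scale. A minimal sketch below (simplified illustration only, not llama.cpp's actual GGUF kernels, which use more elaborate block formats):

```python
import numpy as np

def quantize_4bit(w, group_size=32):
    # Symmetric 4-bit quantization per group: floats -> ints in [-8, 7]
    # with one float scale per group (the main storage cost besides the ints).
    w = w.reshape(-1, group_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scale[scale == 0] = 1.0  # avoid division by zero for all-zero groups
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Reconstruct approximate float weights from ints and scales.
    return (q.astype(np.float32) * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(256).astype(np.float32)
q, scale = quantize_4bit(w)
w_hat = dequantize(q, scale)
print(np.abs(w - w_hat).max())  # worst-case rounding error, bounded by scale/2
```

"Variable" per-layer schemes differ mainly in that sensitive layers keep more bits (or stay in float) while others get quantized harder, instead of using one bit width everywhere.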


No comments yet.