rolleiflex | 3 years ago
I'm currently running the 65B model just fine. It is a rather surreal experience, a ghost in my shell indeed.
As an aside, I'm seeing an interesting behaviour with the `-t` threads flag. I originally expected it to work like make's `-j` flag, where it controls the number of parallel workers but the total computation done stays the same. What I'm seeing instead is that it seems to change the fidelity of the output. At `-t 8` output is fastest, presumably because that is the number of performance cores my M2 Max has. But up to `-t 12` the output fidelity keeps increasing, even though generation slows down drastically. I have 8 performance and 4 efficiency cores, so that makes superficial sense. From `-t 13` onwards, performance degrades so sharply that I effectively get no output at all.
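The slowdown past the physical core count is classic oversubscription: llama.cpp splits each matmul across `-t` threads and synchronizes between them, so once threads outnumber usable cores (or land on slow efficiency cores), every step waits on the slowest worker. A minimal Python sketch (not llama.cpp code; the work function and numbers are made up for illustration) of splitting a fixed amount of CPU-bound work across N workers:

```python
import time
from multiprocessing import Pool


def spin(n):
    # CPU-bound busy work standing in for one worker's slice of compute.
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_run(workers, total_work=8_000_000):
    # Split a *fixed* amount of work across `workers` processes,
    # analogous to llama.cpp dividing each layer's matmuls across -t
    # threads. Past the physical core count, adding workers only adds
    # scheduling and synchronization overhead.
    chunk = total_work // workers
    start = time.perf_counter()
    with Pool(workers) as pool:
        pool.map(spin, [chunk] * workers)
    return time.perf_counter() - start


if __name__ == "__main__":
    for w in (1, 2, 4, 8, 16):
        print(f"workers={w:>2}: {timed_run(w):.2f}s")
```

On a machine with 8 fast cores you'd expect the timings to improve up to ~8 workers and then flatten or regress. Note this toy model can't explain the "fidelity" effect — thread count shouldn't change what the model computes, only how fast, so that part of the observation is surprising.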
rolleiflex|3 years ago
I'm sure there are potential uses, but training your own LLM would probably be more meaningfully useful than running someone else's trained model, which is what this is.