JonChesterfield | 1 month ago

I see a lot of references to `device_map="cuda:0"` but no CUDA in the GitHub repo. Is the complete stack flash attention plus this Python plus the weights file, or does one need vLLM running as well?

No comments yet.