Show HN: Pure CUDA C Inference for Qwen3 0.6B in One File, No Dependencies

1 point | yb0000 | 7 months ago | github.com

yb0000 | 7 months ago

[deleted]