TinyChat: Large Language Model on the Edge

2 points | enduku | 2 years ago | hanlab.mit.edu

1 comment

enduku | 2 years ago

TinyChat is an efficient, lightweight, Python-native serving framework for 4-bit LLMs quantized with AWQ. It delivers a 2.3x generation speedup on an RTX 4090.

Code: https://github.com/mit-han-lab/llm-awq/tree/main/tinychat