TinyChat: Large Language Model on the Edge (hanlab.mit.edu) | 2 points | enduku | 2 years ago | 1 comment
enduku | 2 years ago
TinyChat is an efficient, lightweight, Python-native serving framework for 4-bit LLMs quantized with AWQ. It delivers a 2.3x generation speedup on an RTX 4090.
Code: https://github.com/mit-han-lab/llm-awq/tree/main/tinychat
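The core idea behind serving 4-bit LLMs can be illustrated with a generic group-wise quantization scheme. This is a minimal sketch, not AWQ's actual method (AWQ additionally rescales salient channels based on activation statistics before quantizing, and its kernels pack weights rather than storing one byte per value); the function names here are hypothetical:

```python
import numpy as np

def quantize_4bit(w, group_size=128):
    """Group-wise asymmetric 4-bit quantization (generic sketch).
    Each group of `group_size` weights shares one float scale and
    one zero-point, so storage is ~4 bits/weight plus small overhead."""
    w = w.reshape(-1, group_size)
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 15.0            # 4 bits -> 16 levels (0..15)
    scale = np.where(scale == 0, 1.0, scale)  # guard constant groups
    zero = np.round(-w_min / scale)
    q = np.clip(np.round(w / scale + zero), 0, 15).astype(np.uint8)
    return q, scale, zero

def dequantize_4bit(q, scale, zero):
    """Recover approximate float weights from 4-bit codes."""
    return (q.astype(np.float32) - zero) * scale

# Round-trip a small weight matrix through 4-bit codes.
w = np.random.randn(2, 512).astype(np.float32)
q, s, z = quantize_4bit(w.flatten())
w_hat = dequantize_4bit(q, s, z).reshape(w.shape)
```

The per-element reconstruction error is bounded by half a quantization step, i.e. at most `scale / 2` for the group a weight belongs to; real serving frameworks fuse the dequantization into the matrix-multiply kernel so the weights never materialize in fp16.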