
Show HN: Tabby back end in 20 Python lines (self-hosted AI coding assistant)

3 points | vsolina | 2 years ago | github.com

Today I made a deliberately simple re-implementation of the backend for Tabby, a self-hosted AI coding assistant. (It seems fully functional, but some issues are to be expected.)

Motivation: I had to move the instance I use to another server, but did not want to reinstall the entire toolchain and all its dependencies.

Additionally, I like simple stuff that I can understand.
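For a sense of what a minimal backend like this involves, here is a hypothetical sketch of the core logic: turn the editor's prefix/suffix context into a fill-in-the-middle prompt, run it through whatever inference function you have, and wrap the result in a completions-style response. The endpoint shape, the `segments` field names, and the FIM template are assumptions for illustration, not Tabby's exact spec, and `generate` stands in for a real llama.cpp binding.

```python
import uuid

# CodeLlama-style fill-in-the-middle template (an assumption for this sketch;
# the real backend would use whatever template its model expects).
FIM_TEMPLATE = "<PRE> {prefix} <SUF>{suffix} <MID>"


def build_prompt(segments: dict) -> str:
    """Turn the request's prefix/suffix segments into a FIM prompt."""
    return FIM_TEMPLATE.format(
        prefix=segments.get("prefix", ""),
        suffix=segments.get("suffix", ""),
    )


def complete(request: dict, generate) -> dict:
    """Handle one completions-style request.

    `generate` is whatever inference callable you have (e.g. a llama.cpp
    binding); it is injected here so the request/response logic stays
    independent of the model runtime.
    """
    prompt = build_prompt(request.get("segments", {}))
    text = generate(prompt)
    return {
        "id": f"cmpl-{uuid.uuid4()}",
        "choices": [{"index": 0, "text": text}],
    }
```

Wrapping `complete` in a few lines of any HTTP framework is all that remains, which is roughly why the whole thing can fit in ~20 lines.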

2 comments


wsxiaoys | 2 years ago

Nice implementation! It should serve as a great reference for a minimal Tabby backend API. Thank you for sharing it!

Yeah - ultimately, it won't be as performant or feature-rich as https://github.com/TabbyML/tabby, but it's still perfect for educational purposes!

vsolina | 2 years ago

Thank you Meng for building Tabby and providing us with a self-hosted alternative to Copilot! I absolutely love it! Keep up the amazing work.

You're definitely right about the feature richness, but the truth is I just want completions :D

Performance is a funny thing; it mostly scales with the slowest part of the system. Since both servers use the same inference library (llama.cpp), which does all the heavy lifting, there's essentially no completion performance difference in single-user mode according to my tests. But because I use a smaller model by default (Q5_K_M instead of Tabby's Q8, ~30% difference in size), and LLM inference is essentially memory-bandwidth bound, my new deployment is around 30% faster on identical hardware, with no noticeable quality difference.

p.s. I'd highly recommend providing additional quantization methods in your model repository to make it easier for novice users.

Thank you