ach9l | 10 months ago
so i gave roo code a try, set up a few test cases, and proceeded to declutter, refactor, and rewrite the whole thing. i've never really written long apps in javascript, nor typescript for that matter, but i think 3k lines of code in a single file is just bad code, and i've been proven right. 3k lines wrecks your context really good. you can't use cline to code cline because it will ruin you financially one way or another. jesus, the old cline.ts file was responsible for basically the whole extension, over 3k lines, the kind of code i would have written 10 years ago as an intern.

anyway, i've added (and learned in the process) react.js components to build an interface for easily collecting the data for my own loras. honestly, if you are looking to integrate large local models into kilo, i'd love to collaborate. my forks mostly provide data analysis for fine-tuning on my own personal repositories, using years of commit history as training data, even bash history. i've benchmarked several tasks. i can basically fork roo code or cline, declutter it, and refactor it with a gemma or qwq running on a mac studio for a few watts.

i've been logging everything i do ever since we were granted api access to gpt-3 at a lab i coordinated about 5 years ago, so i've mastered filtering the completions api and reconstructing streams, all with airflow and python scripts. i added a couple of buttons, such as the download-task one you've also added, but more along the lines of "send this to the batch in the datacenter so we train a new gemma". filtering good solutions from not-so-good ones, the old thumbs-up/thumbs-down situation, helps a lot. i'm also adding a couple of mcp integrations for applying quick loras locally, plus test-driven development, aiming for reinforcement-learning-based loras. i built myself a very nice toy, or should i say, i bootstrapped a very nice tool that creates itself? anyway, thanks for sharing this.
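to make the stream-reconstruction and thumbs-up/down filtering bit concrete, here is a rough sketch of what that kind of pipeline step can look like. the log format and the `rating` field are my own assumptions for illustration, not the actual scripts:

```python
import json

def reconstruct_stream(chunk_lines):
    """Rebuild the full completion text from logged streaming chunks.

    Assumes each logged line is a JSON object in the OpenAI-style
    streaming format: {"choices": [{"delta": {"content": "..."}}]}.
    """
    parts = []
    for line in chunk_lines:
        chunk = json.loads(line)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content", ""))
    return "".join(parts)

def filter_for_training(records, min_rating=1):
    """Keep only thumbs-up completions (rating >= min_rating) for the
    next fine-tuning batch; drop the thumbs-down ones."""
    return [r for r in records if r.get("rating", 0) >= min_rating]

# example: two logged chunks joined back into one completion
chunks = [
    '{"choices": [{"delta": {"content": "hel"}}]}',
    '{"choices": [{"delta": {"content": "lo"}}]}',
]
text = reconstruct_stream(chunks)  # "hello"
```

in a real setup each of these would be a task in an airflow dag, with the reconstructed completions written out before the filtering step runs.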
i think the next major thing that's going to happen with these tools is that they get free at home as new chips become cheaper. llama 4 running on mac studios or dgx stations is as fast as you can get today, and it is already good enough (if prepared correctly) to build any pre-covid, or even pre-chatgpt, yc startup codebase in a weekend. it will definitely happen. i'm wrapping up fixing llama 4 scout; allow me to mention that it has a tendency to fix bugs by commenting out code and adding TODOs. great architecture though, just what we needed, i mean for optimal local development. i'll try to publish results soon enough, optimized for the top mac studio though, since i haven't got a dgx yet. i'll prepare macbook versions too. the world needs more of this: a cline that fixes itself on battery power alone...
jawon | 10 months ago
ach9l | 10 months ago
for mac studios i'd found the sweet spot to be the largest gemma, up until llama scout was released, which fits the mac studio best. scout, although faster at generation, takes a while longer to fill in the long context, so you end up with basically the same usability speeds as with qwq or gemma 27b.
the refactoring is a test-driven task that i've programmed to run by itself, think deep research, until it passes the tests or exhausts the imposed trial limits. i wrote it by instructing gemini, r1, and claude. in short, i made gemini read the codebase and document proposals for refactoring, based on the way i code and the strict architectural patterns i find optimal for projects that combine an engine with views, such as the react.js views present in these vscode extensions.
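the "run until tests pass or trials are exhausted" loop is simple at its core. a minimal sketch, where `apply_proposal` and `proposals` stand in for however the model-generated edits are produced and applied (those names are my invention, not the actual code):

```python
import subprocess

def refactor_until_green(apply_proposal, proposals,
                         test_cmd="npm test", max_trials=5):
    """Self-driving refactor loop: apply one proposed change, run the
    test suite, and stop on the first green run or when the trial
    budget is exhausted."""
    for trial, proposal in enumerate(proposals[:max_trials], start=1):
        apply_proposal(proposal)
        result = subprocess.run(test_cmd, shell=True, capture_output=True)
        if result.returncode == 0:
            return {"status": "passed", "trials": trial}
    return {"status": "exhausted",
            "trials": min(len(proposals), max_trials)}
```

the real task does more (reverting failed proposals, logging every tool call for later training), but the control flow is just this: edit, test, repeat within a budget.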
gemini pro gets it really well and has enough context capacity to maintain several different branches of the same codebase, with these crazy long files, without losing context. once this task is completed, training a smaller model on the executed actions (by that i mean all the tool use: diff, insert, replace and, most importantly, testing) to perform the refactoring instructions is fairly easy.
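turning those executed actions into training data for the smaller model is mostly a mapping step. a sketch of the idea, with a made-up log schema (the `context`/`tool`/`args` keys are assumptions for illustration):

```python
def actions_to_examples(task_log):
    """Turn a completed refactoring task's action log into supervised
    pairs: (context before the action) -> (tool call emitted).

    Each action is assumed to look like
    {"context": "...", "tool": "diff"|"insert"|"replace"|"test",
     "args": {...}}. Only tasks whose final test run passed are kept,
    so the small model only sees trajectories that worked.
    """
    if not task_log.get("tests_passed"):
        return []
    return [
        {"prompt": step["context"],
         "completion": {"tool": step["tool"], "args": step["args"]}}
        for step in task_log["actions"]
    ]
```

filtering on `tests_passed` is the same thumbs-up/down idea as before, just automated: the test suite decides which trajectories are good enough to train on.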