letaem77 | 6 months ago
1. Pack the whole repository into a single text file with Repomix (formerly Repopack): https://github.com/yamadashy/repomix
2. To reduce the token count, compress the file with LLMLingua-2: https://github.com/microsoft/LLMLingua
(fewer tokens = more context can be given to the LLM = the LLM understands your repository better)
3. Copy and paste the compressed archive text as-is into ChatGPT’s input field, or into a local LLM, as context.
4. Ask the LLM to generate the documentation. For example: “This is a repository’s source code. Given the context, generate a table of contents.” You will get a ToC. If it looks good, ask it to generate the first chapter, and keep going until the whole documentation is finished.
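Step 4 can be sketched in Python roughly like this. It is only an illustration: `ask_llm` is a hypothetical stand-in for whatever chat API or local model you use, `parse_toc` assumes the model returns one ToC entry per list line, and the manual "does the ToC look good?" check is left out.

```python
import re
from typing import Callable, Dict, List

# Matches list lines such as "1. Overview", "2) Setup", "- API" or "* FAQ".
_TOC_LINE = re.compile(r"^\s*(?:\d+[.)]|[-*])\s+(.*\S)\s*$")


def parse_toc(toc_text: str) -> List[str]:
    """Return chapter titles from a ToC that has one list item per line."""
    titles = []
    for line in toc_text.splitlines():
        match = _TOC_LINE.match(line)
        if match:  # lines that are not list items are ignored
            titles.append(match.group(1))
    return titles


def generate_docs(context: str, ask_llm: Callable[[str], str]) -> Dict[str, str]:
    """Ask for a ToC first, then request each chapter one at a time."""
    toc = ask_llm(
        "This is a repository's source code. Given the context, "
        "generate a table of contents, one chapter per line.\n\n" + context
    )
    chapters = {}
    for title in parse_toc(toc):
        chapters[title] = ask_llm(
            f"Given the same repository context, write the chapter "
            f"titled '{title}'.\n\n{context}"
        )
    return chapters


def fake_llm(prompt: str) -> str:
    """Hypothetical stub so the sketch runs without a real model."""
    if "table of contents" in prompt:
        return "1. Overview\n2. Installation\n3. API"
    return "(chapter text)"


if __name__ == "__main__":
    docs = generate_docs("<compressed repository text>", fake_llm)
    for title, body in docs.items():
        print(title, "->", body)
```

To use it for real, replace `fake_llm` with a function that sends the prompt to your model and returns its reply.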
If you are documenting a TypeScript/JavaScript codebase, you can also use a bundler like esbuild for step 2, which helps reduce the token count.
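A minimal sketch of the esbuild idea, assuming Node.js and `npx` are available. The entry and output paths are placeholders, and the choice of `--minify-whitespace` (rather than full `--minify`, which also renames identifiers) is my assumption about what keeps the bundle readable for an LLM, not something the workflow above prescribes.

```python
import shutil
import subprocess
from typing import List


def esbuild_command(entry: str, outfile: str) -> List[str]:
    """Build the esbuild CLI invocation that bundles `entry` into one file."""
    return [
        "npx", "esbuild", entry,
        "--bundle",              # follow imports and emit a single file
        "--minify-whitespace",   # drop whitespace, keep identifier names
        f"--outfile={outfile}",
    ]


def bundle(entry: str, outfile: str) -> None:
    """Run esbuild; raises if npx is missing or the build fails."""
    if shutil.which("npx") is None:
        raise RuntimeError("npx not found; install Node.js first")
    subprocess.run(esbuild_command(entry, outfile), check=True)


if __name__ == "__main__":
    # Placeholder paths; point these at your own project.
    print(" ".join(esbuild_command("src/index.ts", "bundle.js")))
```

The bundled file can then be pasted as context directly, or compressed further as in step 2.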
If you are interested in step 2’s LLMLingua-2, check out my TypeScript port, which can be run without any installation at: https://atjsh.github.io/llmlingua-2-js/