letaem77 | 6 months ago

This is my way to do it:

1. Archive the whole repository into a single text file with Repomix (formerly Repopack): https://github.com/yamadashy/repomix

2. To reduce the token count, compress the file with LLMLingua-2: https://github.com/microsoft/LLMLingua

(fewer tokens = more context fits in the LLM's window = the LLM understands your repository better)

3. Copy and paste the compressed archive text as-is into ChatGPT’s input field, or into a local LLM, as context.

4. Ask the LLM to generate the documentation. For example: “This is a repository’s source code. Given this context, generate a table of contents.” You will get a ToC back. If it looks good, ask it to generate the first chapter, and keep going until the whole documentation is finished.
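The ToC-then-chapters loop from step 4 can be sketched roughly like this in Python. It is only an illustration: `ask_llm` is a hypothetical callable standing in for whatever you use (pasting into ChatGPT by hand, a local LLM, etc.), and the prompt wording and function names are my assumptions, not a fixed recipe.

```python
from typing import Callable, List, Tuple

# Hypothetical prompt wording; adjust to taste.
TOC_PROMPT = (
    "This is a repository's source code. Given this context, generate a "
    "table of contents for its documentation, one chapter title per line."
)


def parse_toc(toc_text: str) -> List[str]:
    """Split the LLM's ToC answer into non-empty chapter titles."""
    return [line.strip() for line in toc_text.splitlines() if line.strip()]


def generate_docs(
    context: str, ask_llm: Callable[[str], str]
) -> List[Tuple[str, str]]:
    """Ask for a ToC first, then request one chapter per ToC entry.

    `context` is the (compressed) repository archive text.
    `ask_llm` is a hypothetical function: prompt in, answer out.
    """
    toc_text = ask_llm(TOC_PROMPT + "\n\n" + context)
    chapters = []
    for title in parse_toc(toc_text):
        chapter_prompt = (
            f"Write the chapter titled '{title}' of the documentation, "
            f"following this table of contents:\n{toc_text}\n\n"
            f"Repository context:\n{context}"
        )
        chapters.append((title, ask_llm(chapter_prompt)))
    return chapters
```

In practice you would still read the ToC yourself before asking for chapters; this sketch skips that review and just runs straight through.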

If you are documenting a TypeScript/JavaScript codebase, you can also use a bundler like esbuild in step 2, which helps reduce the token count.
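To check whether a bundling or compression step actually pays off, a rough before/after comparison might look like the sketch below. It assumes about 4 characters per token, which is only a rule of thumb and not a real tokenizer count; the function names are made up for illustration.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate (assumption: ~4 characters per token)."""
    return math.ceil(len(text) / chars_per_token)


def token_savings(original: str, reduced: str) -> float:
    """Fraction of estimated tokens saved by the reduced text (0.0 to 1.0)."""
    before = estimate_tokens(original)
    if before == 0:
        return 0.0
    return 1.0 - estimate_tokens(reduced) / before
```

For real numbers you would swap `estimate_tokens` for the tokenizer of the model you are actually using.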

If you are interested in step 2’s LLMLingua-2, check out my TypeScript port, which can be run without any installation at: https://atjsh.github.io/llmlingua-2-js/
