Show HN: Watermelon – GPT-powered code contextualizer
100 points | baristaGeek | 3 years ago | watermelontools.com
We're starting with a VS Code extension that indexes information from git (GitHub, GitLab, or Bitbucket integrations available), Slack and Jira to explain the context around a file or block of code. Finally, we summarize such aggregated context using the power of GPT.
As devs we know that it's very annoying to look at a new codebase and start understanding all the nuances, particularly when the person who wrote the code already left the company. With this problem in mind, we decided to build this solution. You'll be able to get into "the ghost" of the person who left the company.
Soon, we will also be building a GitHub Action that does the same thing as the VS Code extension but at the time of creating a PR: Index the most relevant information related to this new PR, and add it as a comment. This way we will provide context at one more moment, and also, we will be making the IDE extension better.
Here's our open source repo if you also want to check it out: https://github.com/watermelontools/watermelon-extension
Please give us your feedback! Thanks.
[+] [-] flufferstutter|3 years ago|reply
[+] [-] baristaGeek|3 years ago|reply
- Go to a file / highlight a block of code you want to understand at depth
- We take the commit hashes for the lines selected (the whole file if no LOC selected)
- We pass those commit hashes as a parameter to the GitHub/GitLab/Bitbucket API to obtain all the associated PRs
- We sort those PRs by relevance (using number of comments as the heuristic)
- With the title of that PR, we search for the Slack threads and Jira tickets most closely associated with that PR title (if you optionally integrated Slack and/or Jira)
- We add the title and body of the most relevant piece of info from each source to a GPT prompt, to finally generate a summary of that code context.
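To make the steps above concrete, here is a minimal Python sketch of the local parts of such a pipeline. It is an illustration under assumptions, not Watermelon's actual implementation: it assumes the commit hashes come from `git blame --porcelain` output, and `ContextItem`, `commit_hashes`, `most_relevant`, and `build_prompt` are hypothetical names. The API calls to GitHub/GitLab/Bitbucket, Slack, Jira, and GPT are left out.

```python
import re
from dataclasses import dataclass

# A `git blame --porcelain` header line looks like:
# <40-hex sha> <orig line> <final line> [<lines in group>]
BLAME_HEADER = re.compile(r"^([0-9a-f]{40}) \d+ \d+")


@dataclass
class ContextItem:
    """Hypothetical record for a PR, Slack thread, or Jira ticket."""
    source: str
    title: str
    body: str
    comments: int = 0


def commit_hashes(blame_porcelain: str) -> list[str]:
    """Collect the unique commit hashes in first-seen order."""
    seen: dict[str, None] = {}
    for line in blame_porcelain.splitlines():
        match = BLAME_HEADER.match(line)
        if match:
            seen.setdefault(match.group(1), None)
    return list(seen)


def most_relevant(prs: list[ContextItem]) -> ContextItem:
    """Pick the PR with the most comments (the heuristic from the list above)."""
    return max(prs, key=lambda pr: pr.comments)


def build_prompt(items: list[ContextItem]) -> str:
    """Join the title and body of each item into one summarization prompt."""
    sections = [f"[{item.source}] {item.title}\n{item.body}" for item in items]
    return "Summarize the context behind this code:\n\n" + "\n\n".join(sections)
```

A caller would fetch the PRs for each hash from the git provider, pass them to `most_relevant`, use the winning title to look up Slack and Jira items, and send the result of `build_prompt` to the model. Note that `most_relevant` raises `ValueError` on an empty list, so the no-PR case needs handling upstream.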
[+] [-] woah|3 years ago|reply
You should really refine what you're asking for. I would like to use this for open source code and it should be able to do a great job, but this is crazy.
[+] [-] baristaGeek|3 years ago|reply
You can't limit which repos in an organization you're gonna give it access to, but you can limit which organizations you are going to give it access to.
Starting with an organization that has public repos is a good way of starting, indeed.
[+] [-] arjunlol|3 years ago|reply
[+] [-] baristaGeek|3 years ago|reply
About the ideal size and security concerns: We've actually seen the most pushback from very small teams (5 or fewer engineers). I'm not sure why exactly, but my best guess is that as companies grow, it simply becomes more normal for them to give access to these tools and they see that nothing bad happens (they end up getting a lot of value actually).
However, we can't yet integrate with the self-hosted/enterprise versions of GitHub, VS Code/Slack, etc., which is what our potentially best future customers use.
Because of that, the ICP is engineering teams sized between 15 and 150.
[+] [-] estebandalelr|3 years ago|reply
[+] [-] madamelic|3 years ago|reply
What are your thoughts on this issue and the future of these kinds of tools, where teams have to hand over the keys to the city on an ongoing basis to something running on dev computers?
Most dev tools are isolated to a certain extent and aren't getting blank checks to the entire engineering department from top to bottom with access to external tools that detail business concerns. These tools seem like extremely ripe pickings for targeting for corporate espionage / hackers to 'pwn' companies.
[+] [-] baristaGeek|3 years ago|reply
In fact, we wrote this blog post where we talk about how we're building this without storing your code or passing it through our server at all: https://www.watermelontools.com/post/building-a-code-archeol...
You still have to give us read access to your GitHub, Slack, Jira, etc., which is still asking for access to corporate info, but you know... people are very used to giving access to these tools via OAuth flows.
Regarding running on dev computers, we have one answer to that: Providing value as a GitHub Action. We still haven't launched it (it's gonna happen very soon), but our hypothesis is that by packaging the product in such a format, we'll be able to address that very valid concern.
Thanks!
[+] [-] estebandalelr|3 years ago|reply
[+] [-] quickthrower2|3 years ago|reply
[+] [-] alfalfasprout|3 years ago|reply
[+] [-] baristaGeek|3 years ago|reply
[+] [-] jeremyis|3 years ago|reply
Looks helpful! What else is on the roadmap?
[+] [-] baristaGeek|3 years ago|reply
Besides the GitHub (and its counterparts) Action, I can tell you that we have ideas for:
- A Discord integration
- Fine-tune GPT (with Git, Slack and Jira data) to be able to ask questions specific to your codebase
- An expansion to IntelliJ
[+] [-] estebandalelr|3 years ago|reply
[+] [-] donpark|3 years ago|reply
[+] [-] baristaGeek|3 years ago|reply
[+] [-] polishdude20|3 years ago|reply
[+] [-] andreshb|3 years ago|reply
Aside from an appliance in some local servers, what vendors do you use that solve this problem well?
[+] [-] estebandalelr|3 years ago|reply
[+] [-] carlosagudelo|3 years ago|reply
[+] [-] kevmo314|3 years ago|reply
git blame with more steps?
[+] [-] baristaGeek|3 years ago|reply
Thanks for your comment. It's something we should be clearer about.
[+] [-] estebandalelr|3 years ago|reply
[+] [-] findnfund|3 years ago|reply
[+] [-] czc|3 years ago|reply
[+] [-] baristaGeek|3 years ago|reply
[+] [-] Fedeconomist|3 years ago|reply
[+] [-] baristaGeek|3 years ago|reply
[+] [-] arnobio|3 years ago|reply
[deleted]