item 40633773

Show HN: Thread – AI-powered Jupyter Notebook built using React

162 points | alishobeiri | 1 year ago | github.com

Hey HN, we're building Thread (https://thread.dev/), an open-source Jupyter Notebook with a bunch of AI features built in. The easiest way to think of Thread is as the chat interface of OpenAI's code interpreter fused into a Jupyter Notebook development environment, where you can still edit code or re-run cells. To check it out, you can see a video demo here: https://www.youtube.com/watch?v=Jq1_eoO6w-c

We initially got the idea when building Vizly (https://vizly.fyi/), a tool that lets non-technical users ask questions of their data. While Vizly is powerful at performing data transformations, as engineers we often felt that natural language didn't give us enough freedom to edit the code that was generated or to explore the data further ourselves. That is what gave us the inspiration to start Thread.

We made Thread a pip package (`pip install thread-dev`) because we wanted to make it as easily accessible as possible. While there are a lot of tools that improve on the notebook development experience, they are often cloud-hosted and hard to access as an individual contributor unless your company has signed an enterprise agreement.

With Thread, we are hoping to bring the power of LLMs to the local notebook development environment while blending in the editing experience you get in a cloud-hosted notebook. We have many ideas on the roadmap, but instead of building in a vacuum (a mistake we have made before) we hoped to get some initial feedback to see if others are as interested in a tool like this as we are.

Would love to hear your feedback and see what you think!

44 comments


RamblingCTO|1 year ago

Uncalled-for landing page roast: The landing page needs a serious overhaul. No one cares if it's written in React. And AI in and of itself is not a feature. Tell me what I can do with it.

The demo is pretty nifty! I have the suspicion that it will stumble on more complex things, but I'll give it a try and fine-tune layout ML with a custom dataset, or something that's more complex than the Titanic survivors dataset.

Oh and the API key/proxy thingy sounds a bit annoying.

jupp0r|1 year ago

Slightly off topic feedback: find a different name. There are too many products named Thread or Threads out there, it's impossible to Google and doesn't convey much information about your particular tool.

spothedog1|1 year ago

I'm very interested in this. I'm a Software Engineer who's been doing some Data Science on the side and been looking for something like this.

My current setup is running Jupyter on an EC2 instance and using it inside PyCharm. One feature I really value is being able to use it directly in PyCharm, as I can have my IDE on one side of a split screen and my browser on the other. Not sure how feasible it is to integrate something like this into an IDE; VS Code would work.

But a real killer feature that could get me to switch to a browser-based tool would be the ability to load custom context about the data I'm working with. I have all my datasets and descriptions of all their columns in my own database, and I would love a way to load that into the LLM so that it has a greater understanding of the data I'm working with in the notebook.

I store all my data in objects called `distributions` [1] and have a `get_context()` function that will return a text blob of things like dataset description, column description, types, etc.

The issue with all these auto-code AI tools is that they don't really have a good grasp of the actual data domain, and I want to inject my pre-made context into an LLM that's also integrated into my notebook.
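For concreteness, here is a minimal sketch of the kind of wrapper described above. `Distribution` and `get_context()` are illustrative names from the comment, not Thread's API; the dataset and column descriptions are made up:

```python
from dataclasses import dataclass, field

@dataclass
class Distribution:
    """A dataset handle plus the metadata an LLM needs to reason about it."""
    name: str
    description: str
    columns: dict[str, str] = field(default_factory=dict)  # column name -> description

    def get_context(self) -> str:
        """Render the metadata as a text blob to prepend to an LLM prompt."""
        lines = [f"Dataset: {self.name}", f"Description: {self.description}", "Columns:"]
        lines += [f"  - {col}: {desc}" for col, desc in self.columns.items()]
        return "\n".join(lines)

# Example: a hypothetical orders table.
orders = Distribution(
    name="orders",
    description="One row per customer order, 2020-2024.",
    columns={"order_id": "unique order key", "amount_usd": "order total in USD"},
)
print(orders.get_context())
```

A notebook assistant that accepted such a blob as system-prompt context would know the column semantics before generating any code.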

[1] https://www.w3.org/TR/vocab-dcat-3/#version-history

spothedog1|1 year ago

Following up: A reason I really like using Jupyter in PyCharm is because Github CoPilot works in it which helps a lot.

mrlinx|1 year ago

How do you use PyCharm with your own Jupyter instance?

namin|1 year ago

This seems cool! Is there a way to try it locally with an open LLM? If you provided a way to set the OpenAI server URL and other parameters, that would be enough. Is the API_URL server documented, so a mock local one can be created?
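For readers unfamiliar with what "set the OpenAI server URL" buys you: most local runtimes (Ollama, llama.cpp server, vLLM) expose an OpenAI-compatible `/v1/chat/completions` endpoint, so a configurable base URL is all a tool needs to support them. A stdlib-only sketch of the request such a setting would redirect; the localhost URL and model name are assumptions for an Ollama-style setup:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request against any base URL."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Pointing at a local Ollama instance instead of api.openai.com:
req = build_chat_request("http://localhost:11434/v1", "llama3", "hello")
```

Swapping `base_url` back to `https://api.openai.com/v1` (plus an `Authorization` header) is the only difference between the hosted and local cases.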

Thanks.

alishobeiri|1 year ago

Definitely, it is something we are super focused on, as it seems to be a use case that is important for folks. Opening up the proxy server and adding local LLM support is my main focus for today, and I will hopefully update this comment when it is done :)

ramesh31|1 year ago

Can we stop calling things Thread? I can't think of a single more overused name. Impossible to Google as well.

alishobeiri|1 year ago

Hahaha, I take the point. I thought I was slick because I got the thread.dev domain, and so didn't consider SEO as much. One of the alternate titles we considered was `Show HN: Thread.dev` instead of just `Show HN: Thread` to minimize overlap, but we opted against it at the last second. Anyways, appreciate the feedback.

mritchie712|1 year ago

Are you thinking Thread would be an open-source alternative to Hex (https://hex.tech)?

I was thinking of doing something like this last year, but I couldn't figure out a good business model. Google Colab is cheap (free, $10 per month) and Hex isn't that expensive (considering the compute cost they need to cover).

If you focus on local, you're going against VS Code and Jupyter. Both are free and very good.

alishobeiri|1 year ago

It's something we are considering. I think Hex provides a lot of features that aren't available in existing local notebooks (SQL, reactive cell execution) that we hope to integrate for sure. Both Jupyter and VS Code are really strong players in the space, so one of our concerns was whether the feature set would be compelling enough to get people to switch. (Which is why we wanted to post to test the initial reaction :))

The reason we wanted to focus on running things locally is that we were both engineers at big companies in the past, where we didn't have access to tools like Hex but could use local tools. Our initial thesis is to bring the best development experience to the local environment, and then see if there is an opportunity to build a business model around collaboration features.

pplonski86|1 year ago

Congrats on launching! Who is your ideal target user? What is the hardest use case that thread.dev solves that can't be solved with existing tools? And what is the architecture of Thread? Is the frontend in React, with a Jupyter server started in the background?
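On the architecture question: a frontend like the one described typically talks to a local Jupyter server over its WebSocket kernel channels, using the documented Jupyter messaging protocol. A stdlib sketch of the `execute_request` message that protocol defines; the session name and code are illustrative, and real clients include a few more optional content fields:

```python
import json
import uuid
from datetime import datetime, timezone

def execute_request(code: str, session: str) -> str:
    """Build a Jupyter 'execute_request' message as sent over the kernel WebSocket."""
    msg = {
        "header": {
            "msg_id": uuid.uuid4().hex,          # unique per message
            "session": session,                  # ties messages to one client session
            "msg_type": "execute_request",
            "version": "5.3",                    # messaging protocol version
            "date": datetime.now(timezone.utc).isoformat(),
        },
        "parent_header": {},                     # empty for a fresh request
        "metadata": {},
        "content": {"code": code, "silent": False, "store_history": True},
    }
    return json.dumps(msg)

raw = execute_request("1 + 1", session="notebook-session")
```

The kernel replies with `execute_reply` and streams output as `stream`/`execute_result` messages, which is what a React frontend would render into cells.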

titodini|1 year ago

What's the benefit of using this instead of GitHub Copilot + notebooks inside of VS Code?

agarwa90|1 year ago

I think being able to use your own LLM is definitely a plus.

hamasho|1 year ago

This is very interesting!

One thing I really want but find missing in Jupyter is straightforward auto-completion integrated with something like Copilot. I'm spoiled by "just-mashing-Tab development", where I type a few words and let auto-complete do the rest.

The lack of auto-completion is the main reason I prefer using VS Code or Neovim recently over Jupyter even for experiments.

itishappy|1 year ago

> Best of all, Thread runs locally, and can be used for free with your own API key.

That doesn't sound very local...

What are the benefits of running the notebook infrastructure locally when your data is being processed in the cloud? Can it be isolated to just code? Can I point this at a local db of customer information to workshop some SQL?

itishappy|1 year ago

Unfortunately, these responses aren't clearing much up for me. I may be dumb, so here's two dumb questions:

1. Am I correct in assuming the "API key" here is an OpenAI API key?

2. Can this tool be pointed at local models?

HumblyTossed|1 year ago

Please consider changing the name to something that is more searchable.

politelemon|1 year ago

I feel this suggestion may get traction if posted on a certain twitter alternative or a certain slack alternative.

collyw|1 year ago

The title sounds like a mish-mash of Hacker News buzzwords.

animanoir|1 year ago

AI-Powered is starting to get boring...

gudzpoz|1 year ago

The GIFs look good and I like how it lets the user diff between original code and AI-generated ones. But still, I would like to quote from an article on Thread (well, this Thread is a network stack) [1]:

> No, seriously. Can we please not name new things using terms that are already widely used? I hate that I have to specify whether I’m talking about sewing, screwing, parallel computing, a social network from Meta, or a networking stack. Stop it.

[1] https://overengineer.dev/blog/2024/05/10/thread/#user-conten...

alishobeiri|1 year ago

Hahahah, point taken. In all fairness, I only knew about Threads (the social media app) when I posted, and so I thought it wouldn't overlap. I was worried the word `thread` might overlap in a parallelism sense, though I wasn't aware of the networking stack. Anyways, I appreciate the feedback :)