top | item 40641405

(no title)

spothedog1 | 1 year ago

I'm very interested in this. I'm a Software Engineer who's been doing some Data Science on the side and been looking for something like this.

My current set up is running Jupyter on an EC2 instance and using inside PyCharm. One feature I actually really value is being able to use it directly in PyCharm as I can have my IDE on one side of split screen and my browser on the other. Not sure how feasible it is to integrate something like this into an IDE, VSCode would work

But a real killer feature that could get me to switch to a browser based would be the ability to load custom context about the data I'm working with. So I have all my datasets and descriptions of all their columns in my own database and would love a way to load that into the LLM so that it has a greater understanding of the data I'm working with in the notebook.

I store all my data in objects called `distributions` [1] and have a `get_context()` function that will return a text blob of things like dataset description, column description, types, etc.

The issue with all these auto-code AI tools is they don't really have a good grasp of the actual data domain and I want to inject my pre-made context into an LLM thats also integrated in my notebook.

[1] https://www.w3.org/TR/vocab-dcat-3/#version-history

discuss

order

spothedog1|1 year ago

Following up: A reason I really like using Jupyter in PyCharm is because Github CoPilot works in it which helps a lot.