top | item 35496558

Show HN: A ChatGPT TUI with custom bots

126 points | jlowin | 3 years ago | askmarvin.ai | reply

Hi HN! We just shipped a full-featured TUI (Text User Interface) for chatting with your Marvin bots, powered by GPT-4 or GPT-3.5. Like all of Marvin, it's fully open-source and we hope you find it useful. To launch it, upgrade and run `marvin chat`.

The TUI is built with Textual (https://github.com/textualize/textual/) and uses some of its newest features including background workers and modals. We've made basic TUIs before but this is the first one that's a true "app" with many screens and coordinated global state. Happy to answer any questions about working with Textual - once it "clicked" it was surprisingly similar to building a traditional front end! Small note: Terminal.app on macOS isn't great for TUIs, so while it'll work, we suggest an alternative terminal.

One of our goals with the TUI was to integrate Marvin's bots into the familiar chat UX. Bots can have distinct personalities, instructions, and use plugins, so each one is like a mini "conversational application." You might know about Marvin because of AI Functions, but at its core Marvin is a library for building and deploying bots (in fact, AI functions are actually a bot!). We started building the TUI as a way to quickly explore and assess our bots' capabilities. It quickly became so useful that we decided to make it a first-class experience.

We've preloaded several bots, including one that can guide you through an RPG and another that is obsessed with explaining regex, and will add many more. You can even create your own bots just by asking the default bot (Marvin) to help you.

We hope the TUI is a fun way to quickly interact with your bots and it was a great way for us to learn Textual. Please check out the code and let us know what enhancements we can add!

51 comments

[+] darkteflon|3 years ago|reply
I hadn’t heard of Marvin before but it looks interesting. At first I thought this TUI was just a convenience interface on top of the ChatGPT web service but it’s quite a bit more than that.

Having dug into the docs for ten minutes, the library as a whole seems to be in the same space as langchain. Some of the starting abstractions are similar but overall it seems to take a higher-level approach with a focus on clarity and convenience. Will definitely try this out. Would also love to hear more about the origins and philosophy if any maintainers are about!

Edit: As an aside: I have found Textual quite difficult to get to grips with in the past. Too much magic, maybe. Does anyone know of any good alternatives at a similar level of abstraction? I don’t want to get down in the weeds to knock up a simple TUI.

[+] jlowin|3 years ago|reply
Sure! We wrote a little bit about the origins in an announce post a couple weeks ago (https://news.ycombinator.com/item?id=35366838).

Marvin (https://www.github.com/prefecthq/marvin) powers our AI efforts at Prefect (https://www.github.com/prefecthq/prefect).

The first version of Marvin was an internal framework that powered our Slackbot. There are close to 30,000 members of our open-source community and we rely heavily on automation to deliver support. Then, as more of our customers started building AI stacks, we began to view Marvin as a platform to experiment with high-level UX for deploying AI. We have a few internal use cases, but it was the diversity of customer objectives that gave us confidence.

Historically, we've always focused on data engineering, but the more we worked with LLMs, the more we saw the same set of issues, basically driven by the need to integrate brittle, non-deterministic APIs that are heavily influenced by external state into well-structured traditional engineering and pipelines. We started using Marvin to codify the high-level patterns we were repeatedly deploying, including getting structured outputs from the LLM and building effective conversational agents for B2B use.

The lightbulb moment was when we designed AI functions, which have no source code and essentially use the LLM as a runtime. It's one of those ideas that feels too simple to actually work... but it actually works incredibly well. It was the first time we felt like we weren't building tools to use AI, but rather using AI to build our tools. We open-sourced with AI functions as the headline and the response has been amazing! Now we're focused on releasing the "core" of Marvin -- the bots, plugins, and knowledge handling -- with a similar focus on usability.
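
The "LLM as runtime" trick is simple enough to sketch in a few lines of Python. To be clear, this is not Marvin's actual implementation - the decorator below and the `fake_llm` stub are invented for illustration, with the model call injected so the example runs offline:

```python
import inspect

def ai_fn(llm_call):
    """Decorator factory: route calls through an LLM instead of executing
    the function body. `llm_call` is any callable mapping prompt -> str."""
    def decorator(fn):
        def wrapper(*args, **kwargs):
            bound = inspect.signature(fn).bind(*args, **kwargs)
            bound.apply_defaults()
            # The function's signature and docstring ARE the program;
            # the model is asked to behave as its runtime.
            prompt = (
                "You are the runtime for this Python function:\n"
                f"def {fn.__name__}{inspect.signature(fn)}:\n"
                f'    """{fn.__doc__}"""\n'
                f"It was called with arguments: {dict(bound.arguments)}\n"
                "Respond with only the return value."
            )
            return llm_call(prompt)
        return wrapper
    return decorator

# A canned "LLM" so the sketch runs offline; swap in a real API call.
def fake_llm(prompt: str) -> str:
    return "['carrot', 'beet']"

@ai_fn(fake_llm)
def list_vegetables(n: int) -> str:
    """Return a Python list of n vegetable names."""

print(list_vegetables(2))  # the stub always answers "['carrot', 'beet']"
```

Note that `list_vegetables` has no body at all beyond its docstring - that's the whole point.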

Hope that's what you were looking for!

[+] jquery|3 years ago|reply
As a developer eagerly awaiting the chance to access the GPT-4 API, I can't help but express my growing frustration with the waitlist system. I understand the need for a gradual rollout to ensure server stability and mitigate misuse, but it feels like it's been an eternity since I signed up.

The potential of GPT-4 is truly game-changing, and seeing all the amazing projects other developers have built is just adding to the anticipation. It's disheartening to be left on the sidelines while others seem to be getting access and capitalizing on these opportunities.

I believe a more transparent approach to the waitlist would go a long way in alleviating some of this frustration. If we had a better idea of where we stand in the queue or an estimated time for access, it would make the waiting game more bearable. As it is, we're left in the dark, wondering if we'll ever get the chance to dive into this powerful tool.

In the meantime, it's back to refreshing my email inbox and cursing my luck. Hoping for a more equitable distribution of access soon, so that all of us excited developers can start bringing our ideas to life with GPT-4.

[+] rashkov|3 years ago|reply
I would contact support and check with them. I was given access to the GPT-4 API with 8K context within a few hours of requesting it. Maybe I got super lucky, but my guess is something may have gone wrong with your request or the email notification. Have you tried using your API keys with GPT-4 API requests? Maybe it already works? Best of luck, hope it gets resolved.
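
One quick way to check is to issue a minimal request against the model and see whether it errors. A small sketch - `has_model_access` is a made-up helper, and it assumes a chat-completion callable like `openai.ChatCompletion.create` from the openai Python library of that era:

```python
def has_model_access(create, model="gpt-4"):
    """Probe whether an API key can use `model` by issuing a minimal
    one-token request. `create` is any chat-completion callable with the
    (model=..., messages=..., max_tokens=...) keyword interface."""
    try:
        create(
            model=model,
            messages=[{"role": "user", "content": "ping"}],
            max_tokens=1,
        )
        return True
    except Exception:
        # Keys without access typically raise a "model does not exist" error.
        return False
```

Calling `has_model_access(openai.ChatCompletion.create)` with your key configured should tell you immediately whether the rollout has reached you.
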
[+] halfjoking|3 years ago|reply
Yeah, me too - all I can do for now is build around gpt-3.5-turbo and assume the responses will be similar to what I get from GPT-4 with my Plus membership.

You’d think since I pay for Plus and API credits I’d get access to GPT-4 but nope.

[+] replwoacause|3 years ago|reply
Sorry this has been your experience. I signed up and was given access the next day and I've barely even used it. Hopefully you get in soon.
[+] pmoriarty|3 years ago|reply
You should be able to just buy GPT4 access through poe.com's iOS app (and even get limited free access to GPT4 through the poe.com website)
[+] jaimehrubiks|3 years ago|reply
Does it work with chatgpt plus credentials, or does it need an API token?
[+] rpastuszak|3 years ago|reply
I just got a ChatGPT subscription and quickly realised that it might be much cheaper for me to use the API with an alternative client. Is there any downside to using the API besides not being able to access the default UI?

(when DALL-E 2 came out I saved 7-8x by writing a custom front-end and using the API keys instead of buying tokens, I'm a cheap bastard)
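
The arithmetic behind that is easy to check. Assuming the early-2023 list price of roughly $0.002 per 1K tokens for gpt-3.5-turbo against the $20/month Plus subscription (both figures approximate):

```python
# Back-of-envelope: when does pay-per-token beat the flat subscription?
PLUS_MONTHLY_USD = 20.00     # ChatGPT Plus, flat fee
USD_PER_1K_TOKENS = 0.002    # assumed gpt-3.5-turbo price at the time

def api_cost_usd(tokens_per_month: int) -> float:
    """API bill for a month's worth of tokens at the assumed rate."""
    return tokens_per_month / 1000 * USD_PER_1K_TOKENS

# Tokens per month at which the API bill equals the subscription.
break_even = PLUS_MONTHLY_USD / USD_PER_1K_TOKENS * 1000
print(f"break-even: {break_even:,.0f} tokens/month")  # 10,000,000
```

So unless you're pushing on the order of ten million tokens a month, pay-per-use comes out cheaper.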

[+] jlowin|3 years ago|reply
It does need an API token - it will ask you for it when you start if you haven’t stored one already.
[+] avereveard|3 years ago|reply
I would love something with this sophistication for the web. Currently most offer just a chat on top of API keys and, if you're very lucky, multiple threads; I haven't found one yet with personalities, agents, and plugins.
[+] jlowin|3 years ago|reply
We’ve done some work on a full UI for Marvin! Some things work great in a terminal, others really need the flexibility of the web.
[+] aymeric|3 years ago|reply
Personally I find the idea of having a Bot Manager quite interesting. I imagine future APIs could be bots that talk to each other, and this Bot Manager would allow humans to join the conversation.
[+] jlowin|3 years ago|reply
Exactly! Threads in Marvin are designed to support multiple bots and users. Two key user stories:

- multiple users in a Slack thread talking to the same bot. This is something we want to deliver soon, as Marvin powers our existing Slack bots

- one user addressing multiple bots, each of which is designed for a specific purpose (because bots do way better with reduced scope than when you have one bot try to do everything)
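
One way to picture that thread model - an illustrative sketch, not Marvin's actual internals; every name here is invented:

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    sender: str   # a user id or a bot name
    content: str

@dataclass
class Thread:
    """A conversation that any number of users and bots can join."""
    messages: list = field(default_factory=list)

    def post(self, sender: str, content: str) -> None:
        self.messages.append(Message(sender, content))

    def history_for(self, bot_name: str) -> list:
        """Render the thread from one bot's point of view: its own turns
        become 'assistant' messages, everyone else's become 'user'."""
        return [
            {
                "role": "assistant" if m.sender == bot_name else "user",
                "content": f"{m.sender}: {m.content}",
            }
            for m in self.messages
        ]
```

The key design point is that "assistant" vs "user" isn't stored on the message - it's computed per bot, so the same thread can be replayed to any number of bots.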

[+] evo_9|3 years ago|reply
I’m surprised that nobody (that I’ve seen) has yet made a talking UI for ChatGPT.

I.e., you speak your question to it and it speaks its answer back while writing it to the screen. Or maybe I’ve missed something like that?

[+] refulgentis|3 years ago|reply
Yeah, quite a few out there. As long as you can write an OpenAI API integration and integrate with browser APIs for TTS & transcription, you're set. Probably 20-30 hours total for an implementation.
[+] tacone|3 years ago|reply
Bing (based on ChatGPT) does that, at least on mobile, if you talk to it then it will talk back to you.
[+] aymeric|3 years ago|reply
How would you use it?

I built such a voice assistant for myself, but found that audio is a limiting medium.

[+] hyperfuturism|3 years ago|reply
I think TUIs tend to be overrated. They're not very practical, and mostly end up as novelty applications.

Regardless, I respect the effort in building this - great work.

[+] breakfastduck|3 years ago|reply
I disagree. I'm not a huge user myself, but there are many people I work with who want to work as exclusively as possible with the keyboard / inside a terminal pane.

So there are definitely plenty of people out there for whom this stuff isn't just a novelty.

It also simplifies the workflow for those who are SSHing into a machine they have control over, etc.

[+] dror|3 years ago|reply
Well, if you're interested in something more lightweight, I wrote

https://github.com/drorm/gish

which is a shell command that lets you interact with GPT with flags, pipes, etc. in a much more unixy way.

This TUI has some impressive features, like the bots and plugins, but I feel gish covers most of the use cases, specifically for software development.

[+] alecfreudenberg|3 years ago|reply
Using GPT to define agents like this is such an exciting opportunity.

I'm looking forward to the stacked layers of agent + environment definitions that will really explode the ways we can interact with AI.

I'm working on a project to use GPT agents/scenarios for smart contract arbitration, i.e. judging contests, civil disputes, etc.

[+] user-|3 years ago|reply
I can't seem to paste code into it; otherwise this is my favourite UI so far.
[+] jlowin|3 years ago|reply
Unfortunately multiline inputs are still tricky but when Textual supports them, we’ll add them.
[+] marcusbuffett|3 years ago|reply
As someone who loves to stay in the terminal when possible, this looks awesome. The lack of multi-line input is a bit of a bummer but I guess I can c+p from my editor or something
[+] phat3|3 years ago|reply
I played around with the idea of creating a TUI for ChatGPT as well, but I gave up because of the lack of multi-line support in Textual. I created a REPL instead using Rich. If you wanna give it a shot you can find it at https://github.com/Phat3/LLM-Repl
[+] jlowin|3 years ago|reply
Thanks! And I agree - as soon as Textual has multi-line inputs, we'll include them.
[+] mymountaingoat|3 years ago|reply
What I really want to find is a dead simple program with a simple ChatGPT API interface that lets you store conversations on the filesystem such that they can be easily resumed.

Like, it just reads from stdin, parses optional header data, and then parses alternating user and assistant messages, such that you can pipe the whole shebang in from any text editor and get the same thing back with the output appended.
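
That file format is simple enough to sketch. Everything below - the `key: value` header, the `---` turn delimiter, treating replies as "assistant" turns - is invented for illustration, not an existing tool:

```python
def parse_transcript(text: str):
    """Parse a plain-text conversation into (header, messages).

    Hypothetical format: leading 'key: value' lines form a header,
    a blank line ends it, and '---' on its own line separates turns,
    which alternate user / assistant.
    """
    header, _, body = text.partition("\n\n")
    meta = dict(
        line.split(": ", 1) for line in header.splitlines() if ": " in line
    )
    roles = ("user", "assistant")
    blocks = [b.strip() for b in body.split("\n---\n")]
    return meta, [
        {"role": roles[i % 2], "content": b}
        for i, b in enumerate(blocks)
        if b
    ]

def append_reply(text: str, reply: str) -> str:
    """Append the model's answer so the file can be piped back in later."""
    return text.rstrip("\n") + "\n---\n" + reply.strip() + "\n"
```

A wrapper script would then read stdin, send `messages` (plus any model named in the header) to the API, and write `append_reply(text, answer)` back to stdout - at which point any editor that can pipe a buffer through a command gets a resumable chat for free.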

[+] kulikalov|3 years ago|reply
Could you describe a scenario where you need to resume a conversation?