top | item 42492287

AI Computer Interface (ACI)

3 points| itstomo | 1 year ago | reply

Hi Hacker News,

What do you think about "AI-Computer Interface"?

Let me explain a bit. The "human-computer interface (HCI)" was first developed around WWII. Since then it has evolved enormously, and today we have very sophisticated and intuitive interfaces.

It's the AI era, and there are many projects and products trying to automate tasks that humans currently do. However, the main approach is to adapt LLM-based products to the existing HCI, which was developed for humans, not LLMs. This approach actually works with some workarounds, such as taking screenshots to understand what actions are available, etc.

Example project: https://github.com/OpenInterpreter/open-interpreter

I wonder if creating a UI specifically for LLMs/AIs, instead of adapting LLMs to human-friendly UIs, could make automation processes more efficient.

Do you know of any such projects or products?

Happy holidays!

2 comments

[+] mindcrime|1 year ago|reply
Not sure I follow you. If you're building a computer system where the "user" is intended to be another computer, why would you need anything besides an API? Or are you talking about a system where both humans and other computers are meant to consume its services? If so, that would be somewhat interesting, I guess.

But it seems to me that if you're going to use an LLM to "use" some other software, the way to go is to use tool-calling support to call an API, and/or something like Anthropic's MCP (Model Context Protocol). There's some existing work, too, around "agent to agent" communications that one could use to integrate one kind of AI system with another computer system (whether or not the other system has any AI abilities). These range from things like FIPA-ACL, KQML, KIF, etc., through all the Semantic Web standards, to some more recent specs that are being worked on. For example, the forthcoming ECMA TC56[1][2] standard for natural-language communication between agents. And a similar-ish effort is mentioned in a recent arXiv paper[3].

[1]: https://ecma-international.org/technical-committees/tc56/

[2]: https://github.com/nlip-project

[3]: https://arxiv.org/abs/2411.05828v1
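For what it's worth, the tool-calling route can be sketched in a few lines. This is a minimal sketch: `get_weather` and the dispatcher are hypothetical, and the schema just follows the JSON-Schema convention that LLM tool-calling APIs commonly accept; no real API is being called here.

```python
import json

# Hypothetical tool schema, in the JSON-Schema style that
# LLM tool-calling APIs commonly accept.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Return the current temperature for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def get_weather(city: str) -> dict:
    # Stubbed out; a real tool would call a weather service here.
    return {"city": city, "temp_c": 21}

TOOLS = {"get_weather": get_weather}

def dispatch(tool_call: dict) -> str:
    """Route a model-emitted tool call to the matching local function."""
    fn = TOOLS[tool_call["name"]]
    result = fn(**json.loads(tool_call["arguments"]))
    return json.dumps(result)

# Simulate the model asking for a tool invocation.
print(dispatch({"name": "get_weather", "arguments": '{"city": "Tokyo"}'}))
```

The point is that the "interface" the model sees is just the schema plus a round-trip of JSON, which is already machine-first rather than human-first.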

[+] itstomo|1 year ago|reply
Hi, thank you for your feedback.

Tool calling is very helpful when humans can manually define those functions. However, it's also quite limited, because a human has to define each function manually unless it's very simple and doesn't touch external resources, such as an external API that requires API keys, etc.
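To illustrate the manual glue being described here, consider what a human still has to write to expose one external API as a tool. Everything below is hypothetical (the endpoint, the `WEATHER_API_KEY` variable, the function names); the point is only that the wrapper and the credential plumbing don't write themselves.

```python
import os

# Human-written plumbing: the agent only ever sees the tool name and
# its arguments, while the credential is injected out of band.
def build_weather_request(city: str, api_key: str) -> str:
    # Hypothetical endpoint; a real wrapper would also handle errors,
    # rate limits, pagination, and so on.
    return f"https://api.example.com/weather?city={city}&key={api_key}"

def get_weather_tool(city: str) -> str:
    # The secret comes from the environment, never from the model.
    api_key = os.environ.get("WEATHER_API_KEY", "dummy")
    return build_weather_request(city, api_key)

print(get_weather_tool("Tokyo"))
```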

Someday, we'll have AI agents that work like our secretaries or employees. But just as a CEO doesn't have to know how to use all the apps their employees use, some apps would be created only for AI agents. In such a scenario, the interface of these apps would look very different from HCI. For example, they most likely wouldn't use vision or sound as part of the main UI, unlike HCI.

Interestingly, most apps can be rewritten as a decision tree: if you push this button you'll be navigated to this page, where a different list of actions is available, etc. I think an AI-Computer Interface, if it is ever invented, might look like a text-based decision tree without vision or sound, unless they are absolutely necessary.
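The decision-tree idea can be sketched concretely. This is a toy model, not any real product: each node names the actions available from that state, so an agent navigates by emitting action names instead of interpreting screenshots.

```python
# Toy "app" as a text-based decision tree: every state lists its
# available actions and where each one leads.
APP = {
    "home": {
        "text": "Inbox (2 unread)",
        "actions": {"open_inbox": "inbox", "settings": "settings"},
    },
    "inbox": {"text": "Messages: hello, meeting", "actions": {"back": "home"}},
    "settings": {"text": "Notifications: on", "actions": {"back": "home"}},
}

def available_actions(state: str) -> list[str]:
    """What the agent 'sees' at this node, instead of a screenshot."""
    return sorted(APP[state]["actions"])

def step(state: str, action: str) -> str:
    """Apply an action; unknown actions leave the state unchanged."""
    return APP[state]["actions"].get(action, state)

state = "home"
print(available_actions(state))
state = step(state, "open_inbox")
print(APP[state]["text"])
```

An agent driving this needs nothing but text in and text out, which is the "no vision or sound" property described above.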