McAtNite | 2 years ago

I’m struggling to understand the point of this. It appears to be a more simplified way of getting a local LLM running on your machine, but I expect less technically inclined users would default to using the AI built into Windows while the more technical users will leverage llama.cpp to run whatever models they are interested in.

Who is the target audience for this solution?

operator-name | 2 years ago

This is a tech demo for TensorRT-LLM, which is meant to greatly improve inference speed for compatible models.
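
To make that concrete: driving a model through TensorRT-LLM from Python looks roughly like the sketch below. It assumes the high-level LLM API that recent tensorrt_llm releases expose; class and parameter names vary by version, so treat it as illustrative rather than exact.

    # Sketch only: assumes tensorrt_llm's high-level LLM API (names vary
    # by version). The TensorRT engine build/load happens behind the
    # constructor.
    from tensorrt_llm import LLM, SamplingParams

    llm = LLM(model="meta-llama/Llama-2-7b-chat-hf")
    params = SamplingParams(temperature=0.8, top_p=0.95)

    for out in llm.generate(["What does TensorRT-LLM optimize?"], params):
        print(out.outputs[0].text)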

brucethemoose2 | 2 years ago

> the more technical users will leverage llama.cpp to run whatever models they are interested in.

Llama.cpp is much slower, and does not have built-in RAG.

TRT-LLM is a finicky, deployment-grade framework, and TBH having it packaged into a one-click install with LlamaIndex is very cool. The RAG in particular is beyond what most local LLM UIs do out of the box.
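
For a sense of what that packaging saves you: a minimal LlamaIndex RAG pipeline looks something like the sketch below, which assumes the pre-0.10 llama_index import paths and a default LLM/embedding model already configured.

    # Minimal RAG over a folder of documents with LlamaIndex (sketch;
    # pre-0.10 import paths, default LLM and embeddings assumed).
    from llama_index import SimpleDirectoryReader, VectorStoreIndex

    documents = SimpleDirectoryReader("./my_docs").load_data()  # parse + chunk files
    index = VectorStoreIndex.from_documents(documents)          # embed into a vector index

    query_engine = index.as_query_engine()                      # retrieval + generation
    print(query_engine.query("What do these documents say about caching?"))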

dkarras | 2 years ago

>It appears to be a more simplified way of getting a local LLM running on your machine

No, it answers questions from the documents you provide. Off-the-shelf local LLMs don't do this by default; you need a RAG stack on top of them, or to fine-tune them on your own content.
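
In other words, the missing piece is roughly the loop below; embed() and generate() are hypothetical stand-ins for an embedding model and a base LLM, not any particular library.

    # The bare-bones "RAG stack on top": retrieve the chunks most similar
    # to the question and prepend them to the prompt.
    import numpy as np

    def answer(question, chunks, embed, generate, k=3):
        doc_vecs = np.array([embed(c) for c in chunks])  # embed once; cache in practice
        q_vec = np.array(embed(question))
        sims = doc_vecs @ q_vec / (
            np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec)
        )  # cosine similarity against every chunk
        context = "\n\n".join(chunks[i] for i in np.argsort(sims)[-k:][::-1])
        prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
        return generate(prompt)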

westurner | 2 years ago

From "Artificial intelligence is ineffective and potentially harmful for fact checking" (2023) https://news.ycombinator.com/item?id=37226233 : pdfgpt, knowledge_gpt, elasticsearch :

> Are LLM tools better or worse than e.g. meilisearch or elasticsearch for searching with snippets over a set of document resources?

> How does search compare to generating things with citations?
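
For comparison, classic search-with-snippets is only a few lines against Elasticsearch. This sketch assumes the official elasticsearch-py 8.x client and an existing "docs" index with a "body" text field.

    # Keyword search with highlighted snippets, for contrast with
    # LLM-generated answers.
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")
    resp = es.search(
        index="docs",
        query={"match": {"body": "fact checking"}},
        highlight={"fields": {"body": {}}},  # matching fragments returned as snippets
        size=5,
    )
    for hit in resp["hits"]["hits"]:
        print(hit["_id"], hit["highlight"]["body"])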

pdfGPT: https://github.com/bhaskatripathi/pdfGPT :

> PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities.

GH "pdfgpt" topic: https://github.com/topics/pdfgpt

knowledge_gpt: https://github.com/mmz-001/knowledge_gpt

From https://news.ycombinator.com/item?id=39112014 : paperai

neuml/paperai: https://github.com/neuml/paperai :

> Semantic search and workflows for medical/scientific papers

RAG: https://news.ycombinator.com/item?id=38370452

Google Desktop (2004-2011): https://en.wikipedia.org/wiki/Google_Desktop :

> Google Desktop was a computer program with desktop search capabilities, created by Google for Linux, Apple Mac OS X, and Microsoft Windows systems. It allowed text searches of a user's email messages, computer files, music, photos, chats, Web pages viewed, and the ability to display "Google Gadgets" on the user's desktop in a Sidebar.

GNOME/tracker-miners: https://gitlab.gnome.org/GNOME/tracker-miners

src/miners/fs: https://gitlab.gnome.org/GNOME/tracker-miners/-/tree/master/...

SPARQL + SQLite: https://gitlab.gnome.org/GNOME/tracker-miners/-/blob/master/...

https://news.ycombinator.com/item?id=38355385 : LocalAI, braintrust-proxy; promptfoo, chainforge, mixtral

fortran77 | 2 years ago

It seems really clear to me! I downloaded it, pointed it to my documents folder, and started running it. It's nothing like the "AI built into Windows," and it's much easier than rolling my own.

SirMaster | 2 years ago

This lets you run Mistral or Llama 2, so whoever has an RTX card and wants to run either of those models?

And perhaps they will add more models in the future?

pquki4 | 2 years ago

I don't think your comment answers the question. Basically, anyone who bothers to learn the underlying model's name can already run that model without this tool from Nvidia, can't they?

McAtNite | 2 years ago

I suppose I’m just struggling to see the value add. Ollama already makes it dead simple to get a local LLM running, and this appears to be a more limited, vendor-locked equivalent.

From my point of view, the only people likely to use this are the small slice who are willing to purchase an expensive GPU, know enough about LLMs not to want to use Copilot, but don’t know enough to be aware of the existing solutions.
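
For reference, the Ollama bar mentioned above is roughly one HTTP call. This sketch assumes Ollama is running on its default port 11434 with the llama2 model already pulled.

    # Querying a local model through Ollama's REST API.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama2",
            "prompt": "Explain retrieval-augmented generation in one paragraph.",
            "stream": False,  # one JSON object instead of a token stream
        },
    )
    print(resp.json()["response"])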

papichulo2023 | 2 years ago

Does Windows use the PC's GPU, just the CPU, or the cloud?

robotnikman | 2 years ago

If they are talking about the Bing AI, it just uses whatever OpenAI has in the cloud.

joenot443 | 2 years ago

The immediate value prop here is the ability to load up documents to train your model on the fly. Six months ago I was looking for a tool to do exactly this and ended up deciding to wait. It's amazing how fast this wave of innovation is happening.

seydor | 2 years ago

Windows users who haven't bought an Nvidia card yet