fnetisma | 1 year ago

This is really neat! I have questions:

How are the decisions in the “Needs tool usage” and “found the answer” blocks of your infra made?

Looking at the demo, it takes a little time to return results. Of the search, vector storage, and vector DB retrieval steps, which one takes the most time?

nilsherzig|1 year ago

Thanks :)

Die LLM makes these decisions on its own. If it writes a message which contains a tool call (Action: Web search Action Input: weight of a llama) the matching function will be executed and the response returned to the LLM. It's basically chatting with the tool.
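To make that loop concrete, here is a minimal Python sketch of the general pattern. It is not the project's actual code: the regex, the `TOOLS` table, and the `fake_web_search` stub are all hypothetical names made up for illustration, and the only thing taken from the comment above is the `Action: ... Action Input: ...` message shape.

```python
import re
from typing import Callable, Dict, Optional, Tuple

# Assumed message shape: "Action: <tool name> Action Input: <input>".
ACTION_RE = re.compile(
    r"Action:\s*(?P<tool>.+?)\s*Action Input:\s*(?P<arg>.+)",
    re.DOTALL,
)


def parse_tool_call(message: str) -> Optional[Tuple[str, str]]:
    """Return (tool name, tool input) if the message contains a tool call."""
    match = ACTION_RE.search(message)
    if match is None:
        return None
    return match.group("tool").strip().lower(), match.group("arg").strip()


def fake_web_search(query: str) -> str:
    """Hypothetical stand-in for a real search tool."""
    return f"(stub) results for: {query}"


# Hypothetical registry mapping tool names to the functions that run them.
TOOLS: Dict[str, Callable[[str], str]] = {"web search": fake_web_search}


def handle_message(message: str) -> Optional[str]:
    """Run the matching tool and return its output for the next LLM turn.

    Returns None when the message has no tool call, which a caller could
    treat as "the model has found its answer".
    """
    call = parse_tool_call(message)
    if call is None:
        return None
    name, arg = call
    tool = TOOLS.get(name)
    if tool is None:
        return f"Unknown tool: {name}"
    return tool(arg)
```

In a full agent, the string returned by `handle_message` would be appended to the conversation and sent back to the model, which then either calls another tool or writes its final answer.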

You can toggle the log viewer at the top right to get more detail on what it's doing and what is taking time. Timing depends on multiple things:

- the size of the top n articles (generating embeddings for them takes some time)
- the number of matching vector DB responses (reading them takes some time)

dcreater|1 year ago

> Die LLM

You mean "the"? The German is bleeding through, haha