top | item 46898341

(no title)

johaugum | 25 days ago

Skimmed the repo, this is basically the irreducible core of an agent: small loop, provider abstraction, tool dispatch, and chat gateways . The LOC reduction (99%, from 400k to 4k) mostly comes from leaving out RAG pipelines, planners, multi-agent orchestration, UIs, and production ops.

discuss

baby|25 days ago

RAG seems odd when you can just have a coding agent manage memory by managing folders. Multi agent also feels weird when you have subagents.

simonw|25 days ago

Yeah, vector embeddings based RAG has fallen out of fashion somewhat.

It was great when LLMs had 4,000 or 8,000 token context windows and the biggest challenge was efficiently figuring out the most likely chunks of text to feed into that window to answer a question.

These days LLMS all have 100,000+ context windows, which means you don't have to be nearly as selective. They're also exceptionally good at running search tools - give them grep or rg or even `select * from t where body like ...` and they'll almost certainly be able to find the information they need after a few loops.

Vector embeddings give you fuzzy search, so "dog" also matches "puppy" - but a good LLM with a search tool will search for "dog" and then try a second search for "puppy" if the first one doesn't return the results it needs.

rando77|25 days ago

I've been leaning towards multi agent because sub agent relies on the main agent having all the power and using it responsibly.

PlatoIsADisease|25 days ago

Interesting.

I guess RAG is faster? But I'm realizing I'm outdated now.

antirez|25 days ago

Totally useless indeed.

naasking|25 days ago

Unless I'm misunderstanding what they are, planners seem kind of important.

johaugum|25 days ago

As you mentioned, that depends on what you mean by planners.

An LLM will implicitly decompose a prompt into tasks and then sequentially execute them, calling the appropriate tools. The architecture diagram helpfully visualizes this [0]

Here though, planners means autonomous planners that exist as higher level infrastructure, that does external task decomposition, persistent state, tool scheduling, error recovery/replanning, and branching/search. Think a task like “Prompt: “Scan repo for auth bugs, run tests, open PR with fixes, notify Slack.” that just runs continuously 24/7, that would be beyond what nanobot could do. However, something like “find all the receipts in my emails for this year, then zip and email them to my accountant for my tax return” is something nanobot would do.

[0] https://github.com/HKUDS/nanobot/blob/main/nanobot_arch.png

skybrian|25 days ago

I don’t know what these planners do, but I’ve had reasonably good luck asking a coding agent to write a design doc and then reviewing it a few times.

m00dy|25 days ago

RAG is broken when you have too much data.

plingamp|25 days ago

Specifically when the document number reaches around 10k+, a phenomenon called "Semantic Collapse" occurs.

https://dho.stanford.edu/wp-content/uploads/Legal_RAG_Halluc...

PlatoIsADisease|25 days ago

Cant you make thresholds higher?

Hmm... I guess not, you might want all that data.

Super interesting topic. Learning a lot.

thunky|25 days ago

Gemini with Google search is RAG using all public data, and it isn't broken.