top | item 40750034

sc077y | 1 year ago

Damn, I built a RAG agent over the past three and a half months for my internship. And literally everyone in my company was asking me why I wasn't using LangChain or LlamaIndex like I was a lunatic. Everyone else who built a RAG in my company used LangChain; one even went into prod.

I kept telling them that it works well if you have a standard use case, but the second you need to do something a little original you have to go through 5 layers of abstraction just to change a minute detail. Furthermore, you won't really understand every step in the process, so if any issue arises or you need to improve the process you'll start back at square one.

This is honestly such a boost of confidence.


w4|1 year ago

I had a similar experience when LangChain first came out. I spent a good amount of time trying to use it - including making some contributions to add functionality I needed - but ultimately dropped it. It made my head hurt.

Most LLM applications require nothing more than string handling, API calls, loops, and maybe a vector DB if you're doing RAG. You don't need several layers of abstraction and a bucketload of dependencies to manage basic string interpolation, HTTP requests, and for/while loops, especially in Python.
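To make the "nothing more than string handling and API calls" point concrete, here is a minimal sketch using only the standard library. The function names (`build_prompt`, `complete`) and the default model name are illustrative, not from any framework; the endpoint follows the OpenAI-style chat-completions shape.

```python
# A whole "LLM framework" in two functions: f-string interpolation for the
# prompt, and one HTTP POST for the completion. Standard library only.
import json
import urllib.request

def build_prompt(question: str, context: str) -> str:
    # Plain string interpolation -- no template engine or chain class needed.
    return (
        "Use the context below to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

def complete(prompt: str, api_key: str, model: str = "gpt-4o-mini") -> str:
    # One OpenAI-style chat-completions request (model name is a placeholder).
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Everything else a typical app needs (retries, loops over documents) is ordinary Python control flow around these two calls.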

On the prompting side of things, aside from some basic tricks that are trivial to implement (CoT, in-context learning, whatever) prompting is very case-by-case and iterative, and being effective at it primarily relies on understanding how these models work, not cargo-culting the same prompts everyone else is using. LLM applications are not conceptually difficult applications to implement, but they are finicky and tough to corral, and something like LangChain only gets in the way IMO.
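The "basic tricks that are trivial to implement" really are a few lines of string assembly. A sketch of few-shot in-context learning plus a chain-of-thought cue, with made-up example Q/A pairs:

```python
# Few-shot prompting (in-context learning) + a chain-of-thought trigger,
# built with plain string joins. The example pairs are illustrative.
FEW_SHOT = [
    ("Is 17 prime?", "17 has no divisors other than 1 and itself, so yes."),
    ("Is 21 prime?", "21 = 3 * 7, so no."),
]

def cot_prompt(question: str) -> str:
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in FEW_SHOT)
    # The trailing cue nudges the model to reason before answering.
    return f"{shots}\n\nQ: {question}\nA: Let's think step by step."
```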

danenania|1 year ago

I haven't used LangChain, but my sense is that much of what it's really helping people with is stream handling and async control flow. While there are libraries that make it easier, I think doing this stuff right in Python can feel like swimming against the current given its history as a primarily synchronous, single-threaded runtime.
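For a sense of what that async stream handling looks like in plain Python: a sketch where an async generator stands in for a streaming LLM response (a real client would yield chunks read from an HTTP SSE stream instead).

```python
# Consuming a token stream with asyncio only. fake_token_stream is a
# stand-in for a streaming model response.
import asyncio
from typing import AsyncIterator

async def fake_token_stream(text: str) -> AsyncIterator[str]:
    for token in text.split():
        await asyncio.sleep(0)  # yield control, as a network read would
        yield token + " "

async def collect(stream: AsyncIterator[str]) -> str:
    parts = []
    async for tok in stream:
        parts.append(tok)  # a real app would forward each token to the UI here
    return "".join(parts)

result = asyncio.run(collect(fake_token_stream("hello from the model")))
```

The mechanics are simple; the pain the comment describes comes from composing many of these streams and cancellations correctly in a historically synchronous runtime.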

I built an agent-based AI coding tool in Go (https://github.com/plandex-ai/plandex) and I've been very happy with that choice. While there's much less of an ecosystem of LLM-related libraries and frameworks, Go's concurrency primitives make it straightforward to implement whatever I need, and I never have to worry about leaky or awkward abstractions.

jackmpcollins|1 year ago

I completely agree, and built magentic [0] to cover the common needs (structured output, common abstraction across LLM providers, LLM-assisted retries) while leaving all the prompts up to the package user.

[0] https://github.com/jackmpcollins/magentic

hobs|1 year ago

Groupthink is really common among programmers, especially when they have no idea what they are talking about. It shows you don't need a lot of experience to see the emperor has no clothes, but you do need to pay attention.

jacobsimon|1 year ago

I admire what the Langchain team has been building toward even if people don’t agree with some of their design choices.

The OpenAI api and others are quite raw, and it’s hard as a developer to resist building abstractions on top of it.

Some people are comparing libraries like Langchain to ORMs in this conversation, but I think maybe the better comparison would be web frameworks. Like, yeah the web/HTML/JSON are “just text” too, but you probably don’t want to reinvent a bunch of string and header parsing libraries every time you spin up a new project.

Coming from the JS ecosystem, I imagine a lot of people would like a lighter weight library like Express that handles the boring parts but doesn’t get in the way.

siva7|1 year ago

Matches my experience as well. I tried LangChain about a year ago for an app with a pretty standard use case, but even going a little bit off the rails I had to dig through layers of abstractions where it would have been much easier just using the original openai lib. So it might be beneficial if your use case is about offering many different LLM providers in your app, but if you know you won't be swapping out the LLM provider soon, it's usually better not to use such frameworks.

ramoz|1 year ago

Wise perspective from an intern. The type of pragmatism we love.

weakfish|1 year ago

I wish I was this pragmatic as an intern.

ianschmitz|1 year ago

Way to follow your instinct.

I ran into similar limitations for relatively simple tasks. For example I wanted access to the token usage metadata in the response. This seems like such an obvious use case. This wasn’t possible at the time, or it wasn’t well documented anyway.

tkellogg|1 year ago

I've had the same experience. I thought I was the weird one, but, my god, LangChain isn't usable beyond demos. It feels like even proper logging pushes it beyond its capabilities.

felixfbecker|1 year ago

On top of that, if you use the TypeScript version, the abstractions are often... weird. They feel like verbatim ports of the Python implementations. Many things are abstracted in ways that are not very type-safe and you'd design differently with type safety in mind. Some classes feel like they only exist to provide some structure in a language without type safety (Python) and wouldn't really need to exist with structural type checking.

paraph1n|1 year ago

Could someone point me towards a good resource for learning how to build a RAG app without LangChain or LlamaIndex? It's hard to find good information.

turnsout|1 year ago

At a fundamental level, all you need to know is:

- Read in the user's input

- Use that to retrieve data that could be useful to an LLM (typically by doing a pretty basic vector search)

- Stuff that data into the prompt (literally insert it at the beginning of the prompt)

- Add a few lines to the prompt that state "hey, there's some data above. Use it if you can."
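The four steps above, end to end, as a sketch. A toy bag-of-words "embedding" is used so it runs without a model; a real app would swap `embed()` for an embedding API call and the list scan for a vector index, but the shape is identical. The documents and function names are made up for illustration.

```python
# Retrieve-then-stuff RAG in ~25 lines, with a toy word-count embedding.
import math
import re
from collections import Counter

DOCS = [
    "The warranty covers parts and labor for two years.",
    "Returns are accepted within 30 days with a receipt.",
]

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rag_prompt(question: str) -> str:
    # Steps 1-2: read the input, retrieve the most similar document.
    q = embed(question)
    best = max(DOCS, key=lambda d: cosine(q, embed(d)))
    # Steps 3-4: stuff the data into the prompt, tell the model to use it.
    return (
        f"Data:\n{best}\n\n"
        "There's some data above. Use it if you can.\n\n"
        f"Question: {question}"
    )
```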

kolinko|1 year ago

You can start by reading up about how embeddings work, then check out specific rag techniques that people discovered. Not much else is needed really.

verdverm|1 year ago

My strategy has been to implement in / follow along with llamaindex, dig into the details, and then implement that in a less abstracted, easily understandable codebase / workflow.

I was driven to do so because it wasn't as easy as I'd like to override a prompt. You can see how they construct various prompts for the agents; it's pretty basic text/template kind of stuff.

bestcoder69|1 year ago

The OpenAI cookbook! Instructor is a decent library that can help with the annoying parts without abstracting the whole API call -- see its docs for RAG examples.

puppymaster|1 year ago

You're heading in the right direction. It's amazing to see seasoned engineers go through the mental gymnastics of justifying installing all those dependencies and arguing about vector DB choices when the data fits in RAM and the Swiss Army knife is right there: np.array
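Taken literally: when the corpus fits in RAM, "vector search" is one matrix product over a numpy array. The embeddings below are random stand-ins for real model output; the retrieval logic is unchanged with real vectors.

```python
# In-memory top-k vector search with nothing but numpy.
import numpy as np

rng = np.random.default_rng(0)
# 1000 documents, 384-dim embeddings (random placeholders), L2-normalized.
doc_embeddings = rng.normal(size=(1000, 384)).astype(np.float32)
doc_embeddings /= np.linalg.norm(doc_embeddings, axis=1, keepdims=True)

def top_k(query: np.ndarray, k: int = 5) -> np.ndarray:
    # Cosine similarity is just a dot product once vectors are normalized.
    q = query / np.linalg.norm(query)
    scores = doc_embeddings @ q
    return np.argsort(scores)[::-1][:k]  # indices of the k nearest docs

hits = top_k(doc_embeddings[42])  # a doc should be its own nearest neighbor
```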

joseferben|1 year ago

Impressive to decide against something as shiny as LangChain as an intern.

moneywoes|1 year ago

Any tutorials you follow?