holtkam2 | 2 months ago

I agree and disagree. In my day job as an AI engineer I rarely if ever need to use any “classic” deep learning to get things done. However, I’m a firm believer that understanding the internals of an LLM can set you apart as a gen AI engineer, if you’re interested in becoming the top 1% in your field. There can and will be situations where your intuition about the constraints of your model is superior to that of peers who consider the LLM a black box. I had this advice given directly to me years ago, in person, by Clem Delangue of Hugging Face - I took it seriously and really doubled down on understanding the guts of LLMs. I think it’s served me well.

I’d give similar advice to any coding bootcamp grad: yes, you can get far by just knowing Python and React, but to reach the absolute peak of your potential and join the ranks of the very best in the world in your field, you’ll eventually want to dive deep into computer architecture and lower-level languages. Knowing these deeply will help you apply your higher-level code more effectively than your coding bootcamp classmates over the course of a career.

libraryofbabel|2 months ago

I suppose I actually agree with you, and I would give the same advice to junior engineers too. I've spent my career going further down the stack than I really needed to for my job and it has paid off: everything from assembly language to database internals to details of Unix syscalls to distributed consensus algorithms to how garbage collection works inside CPython. It's only useful occasionally, but when it is useful, it's for the most difficult performance problems or nasty bugs that other engineers have had trouble solving. If you're the best technical troubleshooter at your company, people do notice. And going deeper helps with system design too: distributed systems have all kinds of subtleties.

I mostly do it because it's interesting and I don't like mysteries, and that's why I'm relearning transformers, but I hope knowing LLM internals will be useful one day too.

MIA_Alive|2 months ago

Wouldn't you say that people who pursue deep architectural knowledge should just go down the AI Researcher career track? I feel like that's where that sort of knowledge actually matters.