item 38639896

IshanMi | 2 years ago

Focusing on Deep Learning specifically:

- Most LLMs currently use the transformer architecture. You can learn about it visually (https://bbycroft.net/llm), through this blog post (https://jalammar.github.io/illustrated-transformer/), or through any number of Andrej Karpathy's blog posts and materials.
- To stay on top of the papers that get published every week, I read a summary every Sunday: https://github.com/dair-ai/ML-Papers-of-the-Week
- To learn more about the engineering side, you can join Discord servers such as EleutherAI's, or follow the GitHub discussions of projects like llama.cpp.

Personally, I think the best way to improve per unit of time spent is to re-implement some of the big papers in the field. There's a clear goal, there are clear signs of success, and there are many existing implementations out there to check your work against, compare with, and learn from.
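As a taste of what that looks like, here's the transformer's core operation, scaled dot-product attention, which makes a good first re-implementation target. This is a minimal NumPy sketch, not a full or optimized implementation; the shapes and names are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d_k) arrays. Scores are scaled by sqrt(d_k),
    # as in "Attention Is All You Need".
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (seq_len, seq_len) similarity matrix
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # weighted mix of value vectors

# Toy example with random queries, keys, and values.
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Comparing the output of a sketch like this against a reference implementation (e.g. PyTorch's attention) is exactly the kind of "clear sign of success" that makes paper re-implementation so effective.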

Good luck!
