It starts with the fundamentals of how backpropagation works then advances to building a few simple models and ends with building a GPT-2 clone. It won't taech you everything about AI models but it gives you a solid foundation for branching out.
The most valuable tutorial will be translating from the paper itself. The more hand holding you have in the process, the less you'll be learning conceptually. The pure manipulation of matrices is rather boring and uninformative without some context.
I also think the implementation is more helpful for understanding the engineering work to run these models that getting a deeper mathematical understanding of what the model is doing.
jwitthuhn|2 months ago
It starts with the fundamentals of how backpropagation works then advances to building a few simple models and ends with building a GPT-2 clone. It won't taech you everything about AI models but it gives you a solid foundation for branching out.
roadside_picnic|2 months ago
I also think the implementation is more helpful for understanding the engineering work to run these models that getting a deeper mathematical understanding of what the model is doing.