hapanin | 2 years ago
Paper reference / main takeaways / link
instructGPT / main concepts of instruction tuning / https://proceedings.neurips.cc/paper_files/paper/2022/hash/b...
self-instruct / bootstrap off the model's own generations / https://arxiv.org/pdf/2212.10560.pdf
Alpaca / how alpaca was trained / https://crfm.stanford.edu/2023/03/13/alpaca.html
Llama 2 / probably the best chat model we can train on; focus on the training method / https://arxiv.org/abs/2307.09288
LongAlpaca / One of many ways to extend context, and a useful dataset / https://arxiv.org/abs/2309.12307
PPO / important training method / idk just watch a youtube video
Obviously these are specific to my work and are ~3-4 months out of date, but I think they capture the spirit of "how do we train LLMs on a single GPU with no annotation team", and they are frequently referenced simply by what I put in the "paper reference" column.
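Since the "PPO" row points at a video rather than a writeup, here is a minimal sketch of the clipped surrogate objective at PPO's core (the thing RLHF pipelines actually optimize). Function name and the plain-list implementation are mine; real training code would operate on torch tensors:

```python
import math

def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
    """Clipped surrogate loss from PPO.

    logp_new / logp_old: per-action log-probs under the current and the
    frozen "old" policy; advantages: advantage estimates (in RLHF these
    typically come from a reward model, minus a KL penalty to the
    reference model). Returns the negative clipped objective to minimize.
    """
    terms = []
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)                  # pi_new / pi_old
        clipped = max(min(ratio, 1 + eps), 1 - eps)
        # clipping removes the incentive to push the ratio far from 1
        terms.append(min(ratio * adv, clipped * adv))
    return -sum(terms) / len(terms)
```

With identical policies (ratio = 1) and advantage 1 the loss is exactly -1.0; once the ratio drifts past 1 + eps, the clipped branch caps the objective so a single batch can't move the policy too far.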
sa-code | 2 years ago
https://arxiv.org/abs/2203.15556
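For context, that's the Chinchilla paper, and its most-quoted takeaway is the compute-optimal rule of thumb of roughly 20 training tokens per parameter. A back-of-the-envelope helper (name is mine, and the constant is an approximation, not the paper's exact fitted law):

```python
def chinchilla_tokens(n_params, tokens_per_param=20):
    """Approximate compute-optimal token budget: ~20 tokens/parameter,
    the common rule of thumb distilled from the Chinchilla paper."""
    return n_params * tokens_per_param

# e.g. a 7B-parameter model wants on the order of 140B training tokens
budget = chinchilla_tokens(7_000_000_000)
```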
thatguysaguy | 2 years ago
It's extremely zeitgeisty atm