The author seems to think that the creators of DeepSeek either will be unable to optimize, or will not see the value of optimizing, the 'second stage' RL component of the 'new' (post-pre-training RL) way of training frontier foundation models. Every competent programmer in China is now looking for low-level PTX optimizations for EVERY SINGLE STAGE of the pipeline. And now they will likely not publish any of it.
astrange|1 year ago