Yes, the concept is still powerful and in use today.
As I understand it, the RLHF method of training LLMs involves creating an internal "reward model": a secondary model trained to predict the score of an arbitrary generation. This feels very analogous to the "discriminator" half of a GAN, because both critique the generation created by the other half of the network, and this score is fed back in to train the primary network through positive and negative rewards.
I'm sure it's an oversimplification, but RLHF feels like GANs applied to the newest generation of LLMs -- but I rarely hear people talk about it in these terms.
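The structural parallel can be sketched in a few lines of toy Python. To be clear, every function here is an illustrative stand-in (not real RLHF or GAN training code): in both setups, a secondary network scores the primary network's output, and that score becomes the update signal.

```python
# Toy sketch of the analogy; all names and heuristics here are
# illustrative stand-ins, not code from any real RLHF or GAN library.

def discriminator_score(sample: float) -> float:
    """GAN critic: returns how 'real' a sample looks, in [0, 1].
    Toy heuristic: real samples cluster near 1.0."""
    return max(0.0, 1.0 - abs(sample - 1.0))

def reward_model_score(generation: str) -> float:
    """RLHF reward model: predicts a human preference score, in [0, 1].
    Toy heuristic: longer generations are preferred, up to a cap."""
    return min(len(generation), 10) / 10

def training_signal(score: float) -> float:
    """Both setups convert the critic's score into a signal that
    reinforces (positive) or discourages (negative) the output."""
    return score - 0.5

# The shared shape: secondary model scores the primary model's
# output, and the score drives the primary model's next update.
gan_signal = training_signal(discriminator_score(0.9))
rlhf_signal = training_signal(reward_model_score("a helpful answer"))
print(gan_signal, rlhf_signal)
```

The real difference is in how the signal propagates (a GAN backpropagates through the discriminator directly, while RLHF typically uses a policy-gradient method like PPO), but the critic-scores-generator loop is the same shape.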
I think diffusion models are useful too; I'm currently working on a project that uses them to generate medical-style data. Both approaches seem useful since both are targeted at data generation, especially in areas where data is hard to come by. Writing this blog post made me wonder about applications in finance too.
I agree -- I would love to see diffusion models applied to more types of data, and especially more experiments with text generation, since a diffusion model would have an easier time considering the "whole text" rather than suffering the myopia that can come from simple next-token prediction.
rgovostes|1 year ago
The examples they give are for eye and hand tracking -- which not coincidentally are used for navigating the Apple Vision Pro user interface.
https://machinelearning.apple.com/research/gan