item 39977466

milansuk | 1 year ago
This is an implementation of a transformer, and the README presents it as text->text: tokens are just integers going in and out. Is it possible to use it to train other types of LLMs (text->image, image->text, speech->text, etc.)?

_giorgio_ | 1 year ago
Yes, anything can be an input token.
Patch of pixels ---> token
Fragment of input audio ---> token
etc.

bootsmann | 1 year ago
The transformer itself just takes arrays of numbers and turns them into arrays of numbers. What you are interested in is the process that happens before and after the transformer.
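The "patch of pixels ---> token" idea can be sketched concretely. This is a minimal, ViT-style illustration (all names, shapes, and the random projection are made up for the sketch, not taken from the repository under discussion): an image is cut into fixed-size patches, each patch is flattened, and a linear projection turns it into an embedding the transformer can consume exactly like a text-token embedding.

```python
import numpy as np

def patchify(image: np.ndarray, patch_size: int) -> np.ndarray:
    """Split an (H, W, C) image into (num_patches, patch_size*patch_size*C) rows."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    return (
        image.reshape(h // patch_size, patch_size, w // patch_size, patch_size, c)
             .transpose(0, 2, 1, 3, 4)           # group pixels by patch
             .reshape(-1, patch_size * patch_size * c)
    )

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))                    # toy 8x8 RGB "image"
patches = patchify(image, patch_size=4)          # 4 patch "tokens", 48 values each

# Stand-in for a *learned* projection into the model's embedding dimension;
# after this step the transformer cannot tell these apart from text embeddings.
W = rng.random((48, 16))
token_embeddings = patches @ W                   # sequence of 4 embeddings of size 16
print(token_embeddings.shape)                    # (4, 16)
```

The same pattern applies to the other modalities mentioned above: a fragment of audio becomes a token by slicing the waveform (or a spectrogram) into frames and projecting each frame, which is exactly the "process before the transformer" bootsmann refers to.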