top | item 39977466

milansuk | 1 year ago

This is an implementation of a transformer, and in the README it's presented as text->text. Tokens are just integers going in and out.

Is it possible to use it to train other types of LLMs (text->image, image->text, speech->text, etc.)?

_giorgio_ | 1 year ago

Yes, anything can be an input token.

Patch of pixels ---> token
Fragment of input audio ---> token
etc.
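
To make the "patch of pixels ---> token" idea concrete, here is a minimal NumPy sketch (not from the thread; `patchify`, shapes, and the patch size are illustrative) of the ViT-style step that turns an image into a sequence of flat patch vectors, each of which then gets embedded as one token:

```python
import numpy as np

def patchify(image, patch_size=4):
    """Split an image of shape (H, W, C) into flattened,
    non-overlapping patches of shape (patch_size*patch_size*C,)."""
    h, w, c = image.shape
    p = patch_size
    # (H, W, C) -> (H/p, p, W/p, p, C) -> (H/p, W/p, p, p, C)
    patches = image.reshape(h // p, p, w // p, p, c).transpose(0, 2, 1, 3, 4)
    # Flatten each patch: one row per "token" before embedding.
    return patches.reshape(-1, p * p * c)

image = np.random.rand(8, 8, 3)     # toy 8x8 RGB image
tokens = patchify(image)
print(tokens.shape)                  # (4, 48): 4 patches of 4*4*3 values
```

In a real model each row would then be projected by a learned linear layer into the transformer's embedding dimension, exactly like a text embedding lookup.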

bootsmann | 1 year ago

The transformer itself just takes arrays of numbers and turns them into arrays of numbers. What you are interested in is the process that happens before and after the transformer.
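
A minimal NumPy sketch of that split (illustrative names and random weights, with a stand-in for the attention/MLP blocks): the modality lives entirely in the lookup before the core and the projection after it.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model = 100, 16

# BEFORE the transformer: token ids -> vectors (embedding lookup).
# Swap this step for a patch or audio embedding and the core is unchanged.
embedding = rng.standard_normal((vocab_size, d_model))
token_ids = np.array([5, 42, 7])
x = embedding[token_ids]                 # shape (3, 16): what the core sees

# The core itself is modality-agnostic: arrays of numbers in, arrays out.
# (A single tanh layer here as a stand-in for the attention/MLP stack.)
y = np.tanh(x @ rng.standard_normal((d_model, d_model)))

# AFTER the transformer: vectors -> whatever the output modality needs,
# e.g. logits over a text vocabulary (tied to the input embedding here).
logits = y @ embedding.T
print(logits.shape)                      # (3, 100): one score per vocab entry
```

The point of the sketch: nothing in the middle line knows whether the three rows of `x` came from words, pixels, or audio frames.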