(no title)
melony | 1 year ago
- cache the prompt embeddings: unless you are doing something dynamic with the prompts, the language embeddings you generate should be static, so there is no reason to recompute them on every call. Whether this works depends on the architecture of the model you are using; it's only possible with setups where the language processing is a separate stage in the pipeline (see the sketch after this list).
- consider fine-tuning an img2img model on your current outputs instead of using a language-coupled model. My intuition is that the ML side of this is currently significantly over-engineered.
- play around with local hardware acceleration instead of sending everything to the cloud. You also probably don't need particularly high resolution for the images.
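On the first point, here's a minimal sketch of what I mean, assuming a CLIP-style text encoder sitting in front of the image model (the checkpoint name is just illustrative). Since the prompt is a hashable string, a plain `lru_cache` is enough to make the encoder forward pass run once per unique prompt:

```python
from functools import lru_cache

import torch
from transformers import CLIPTextModel, CLIPTokenizer

# Illustrative checkpoint; swap in whatever text encoder your pipeline uses.
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32").eval()

@lru_cache(maxsize=128)
def cached_prompt_embedding(prompt: str) -> torch.Tensor:
    # The expensive encoder forward pass only runs on a cache miss;
    # repeat calls with the same prompt return the stored tensor.
    tokens = tokenizer(
        prompt, padding="max_length", truncation=True, return_tensors="pt"
    )
    with torch.no_grad():
        return text_encoder(**tokens).last_hidden_state
```

Obviously this only helps if the language stage is actually decoupled like this; if the text and image towers are fused, you'd have to restructure the pipeline first.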
cataPhil | 1 year ago