pilooch | 7 months ago

True, but modern models such as gemma3 with pan & scan, plus other tricks such as training on multiple resolutions, do alleviate these issues.

An interesting property of the gemma3 family is that increasing the input image size does not increase processing memory requirements, because a second-stage encoder compresses the image into fixed-size tokens. Very neat in practice.
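
To make the fixed-size idea concrete, here is a rough sketch of how pooling can turn any patch grid into the same number of tokens. This is not gemma3's actual code: `fake_patches` and `adaptive_avg_pool` are hypothetical names, and plain average pooling is assumed as a stand-in for whatever the real second-stage encoder does.

```python
def fake_patches(side, dim=4):
    """Hypothetical stand-in for encoder output: a side x side grid of dim-sized vectors."""
    return [[[float(r * side + c)] * dim for c in range(side)] for r in range(side)]


def adaptive_avg_pool(grid, out_side):
    """Average-pool an H x W grid of vectors down to out_side * out_side tokens."""
    h, w = len(grid), len(grid[0])
    dim = len(grid[0][0])
    tokens = []
    for i in range(out_side):
        # Rows covered by output cell i: floor(i*h/out) up to ceil((i+1)*h/out).
        r0, r1 = (i * h) // out_side, -(-(i + 1) * h // out_side)
        for j in range(out_side):
            c0, c1 = (j * w) // out_side, -(-(j + 1) * w // out_side)
            cells = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            tokens.append([sum(v[k] for v in cells) / len(cells) for k in range(dim)])
    return tokens


# A small grid and a much larger one both end up as the same number of tokens.
small = adaptive_avg_pool(fake_patches(16), out_side=4)
large = adaptive_avg_pool(fake_patches(64), out_side=4)
print(len(small), len(large))
```

Whatever sits downstream of the pooling step only ever sees `out_side * out_side` tokens, so its memory use does not grow with the input resolution; only the pooling input does.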
