Llama 3.2 vision model and noisy images
1 point | epop | 1 year ago
However, the model struggles with noisy images (those with stamps, handwritten annotations, and other artifacts): it fails to produce any output at all. I tried fine-tuning it on 40k images augmented with noise (quantization noise, salt-and-pepper noise, skew, handwritten text, and multiple fonts). Unfortunately, this reduced its accuracy on well-formatted images, and it still doesn't handle noisy images effectively.
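To make the augmentation concrete, here is a minimal sketch of two of the noise types mentioned above (salt-and-pepper and quantization), assuming NumPy and grayscale `uint8` images. The function names, the default parameters, and the `p_noisy` knob that leaves a share of images clean are all illustrative assumptions, not the exact pipeline used for the 40k images.

```python
import numpy as np

def salt_and_pepper(img, amount=0.02, rng=None):
    """Set roughly `amount` of the pixels to pure black or pure white."""
    rng = np.random.default_rng() if rng is None else rng
    out = img.copy()
    mask = rng.random(img.shape)
    out[mask < amount / 2] = 0          # pepper
    out[mask > 1 - amount / 2] = 255    # salt
    return out

def quantize(img, levels=8):
    """Reduce a uint8 image to about `levels` gray levels (bin midpoints)."""
    step = max(256 // levels, 1)
    q = (img.astype(np.int32) // step) * step + step // 2
    return np.clip(q, 0, 255).astype(np.uint8)

def augment(img, p_noisy=0.5, rng=None):
    """Return a noisy copy with probability `p_noisy`, else a clean copy.

    Keeping some images clean is an assumption of this sketch, not
    something the original setup is known to do.
    """
    rng = np.random.default_rng() if rng is None else rng
    if rng.random() >= p_noisy:
        return img.copy()
    return salt_and_pepper(quantize(img), rng=rng)
```

Calling `augment(page, p_noisy=0.5)` on each training image would then give a mix of clean and degraded samples. Skew, stamps, and handwritten overlays are left out to keep the sketch short.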
What might I be missing here?