(no title)
NAR8789 | 1 month ago
Reference image from the article: https://cannoneyed.com/img/projects/isometric-nyc/training_d...
You have to zoom in, but here the inputs on the left are mixed pixel art / photo textures. The outputs on the right are seamless pixel art.
Later on he talks about 2x2 squares of four tiles each as input and having trouble automating input selection to avoid seams. So with his 512x512 tiles, he's actually sending in 1024x1024 inputs. You can avoid seams if every new tile can "see" all its already-generated neighbors.
You get a seam if you generate a new tile next to an old tile but that old tile is not input to the infill agorithm. The new tile can't see that boundary, and the style will probably not match.
cannoneyed|1 month ago
More interestingly, not even the biggest smartest image models can tell if a seam exists or not (likely due to the way they represent image tokens internally)
NAR8789|1 month ago
skerit|1 month ago
Maybe to process the Nano-Banana generated dataset before fine-tuning, and then also to fix the generated Qwen output?
abrookewood|1 month ago