top | item 43487176

(no title)

smahs | 11 months ago

No it won't (most likely). VTracer (which the authors compare with) is fast, runs in browser via wasm, consumes way less resources and can even convert natural images very decently. But the model seems cool for the usecase of prompt to logo or icon (over my current workflow of getting a jpg from flux and passing it through VTracer). I hope someone over at llama.cpp notices this (at least for the text-to-svg usecase, if not multimodal).

discuss

chris-tsang|10 months ago

Author of VTracer here. Finally being able to comment on hackernews before the thread got locked.

Would be interested in learning about your workflow. Is it a logo generation app?

I feel like this is an example of "Machine learning is eating software". Raster to vector conversion is a perfect problem, because we can generate dataset of infinite sizes and can easily validate them with vectorize-rasterize roundtrips.

I did have an idea of performing tracing iteratively. Basically by adjusting the output SVG bit-by-bit until it matches the original image within a certain margin of error. And optimizing the output size of the SVG by simplifying curves if it does not degrade the quality. But VTracer in its current state is oneshot and probably uses 1/100 of the computational resources.

VTracer seems to perform badly on all the examples. I suspect it can be drastically improved simply by upscaling the image (via traditional interpolation, or machine learning based) and picking different parameters. But I am glad that it was cited!

smahs|10 months ago

Thanks for noticing this, and yes I have also noticed what you're pointing out, but workable for many use cases. I use this workflow for making images for marketing or web (so images are more artistic than photo realistic generations to begin with). Think of stuff you can find on undraw, but generated by image models from prompts. Then run them through VTracer. The reproductions are not perfect, but are often good enough (can be slow depending on how sharp you want the curves, and often very large file sizes as you mentioned). Then make any changes in inkscape and convert back to raster for publishing.

> logo generation app

For logo generation, I would actually prefer code gen. I thought of this problem when reading about the diffusion language models recent (if there is lots of training data available in form of text-vector-raster triplets).