With the latest Microsoft Word, if you open a PDF that is a scanned image of a document and convert it to Word format, it does a pretty decent job of not only OCR (optical character recognition) but also picking matching fonts for various sections.
I just tested this with my internet connection disabled and it still worked. Since it's doing local processing, I suspect it uses traditional OCR algorithms rather than LLMs.
As the article concludes, LLMs aren't magic; they're just one more useful tool to include in your toolbox.
It's pretty easy to imagine an evolved mess of an open, ad hoc, but broadly adopted ecosystem where LLMs are surrounded by a bewildering array of Node-like domain-specific extensions.
Security concerns aside (...) that sounds pretty useful.
How close are the wrong guesses? Fonts are fairly incestuous because the shapes of the characters themselves can't be copyrighted (only the code can), so there are sometimes dozens of clones of very similar fonts... especially on a free site like dafont.
OP here. I tried WhatTheFont a bit but didn't mention it in the article; I didn't get good results with it. It's probably still a good idea to ask it for a guess and feed that to the LLM too.
I'd be curious how much better a more expensive LLM would do - gpt-4o-mini and gemini-2.5-flash-preview-05-20 are definitely not the most capable LLMs one could have chosen.
I would say there’s a good chance they could be one-offs created by whoever was doing the ad. If you’re paying an artist, having them do the lettering could certainly be cheaper than licensing a font for the purpose (or developing a font that’ll never be used outside of one, or a series, of ads).
My fellow designer friends would often do this. But they would start from actual fonts and do slight (or more than slight) adjustments to them to match what they wanted as an outcome.
Results here are bad, obviously, but it'll be interesting when LLMs can not just identify fonts but also unredact documents where only a few words are removed, by analyzing the length of the redaction, the combinations of letters that fit into it, and the context.
Why would you even need LLMs for that? Notwithstanding context, finding text that fits into a given bounding box is already perfectly doable via a classical algorithm (in this case e.g. based on dynamic programming).
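A minimal sketch of the classical approach the comment alludes to: given (hypothetical) per-character width metrics for the document's font, enumerate word sequences whose rendered width matches the redaction bar's width within a tolerance, pruning over-wide partial phrases as a simple dynamic program over accumulated widths. The width table and vocabulary here are made-up placeholders; real metrics would come from the actual font file.

```python
# Sketch: find candidate phrases that fit a redaction bar of known pixel width.
# CHAR_WIDTH is a hypothetical width table (in pixels) standing in for real
# font metrics extracted from the document's typeface.

CHAR_WIDTH = {c: 10 for c in "abcdefghijklmnopqrstuvwxyz"}
CHAR_WIDTH.update({"i": 4, "l": 4, "j": 5, "t": 6, "f": 6,
                   "m": 15, "w": 15, " ": 5})

def text_width(text):
    """Rendered width of a string under the width table."""
    return sum(CHAR_WIDTH[c] for c in text)

def fitting_phrases(words, target_width, max_words=3, tolerance=2):
    """Enumerate word sequences (up to max_words long, drawn from a
    candidate vocabulary) whose rendered width matches the redaction
    width within `tolerance` pixels. States are keyed by accumulated
    width, so over-wide branches are pruned early."""
    states = {0: [[]]}  # width so far -> partial phrases reaching it
    results = []
    for _ in range(max_words):
        next_states = {}
        for width, phrases in states.items():
            for w in words:
                # a space precedes every word after the first
                extra = text_width(w) + (CHAR_WIDTH[" "] if width else 0)
                new_width = width + extra
                if new_width > target_width + tolerance:
                    continue  # prune: already too wide to ever fit
                for phrase in phrases:
                    candidate = phrase + [w]
                    if abs(new_width - target_width) <= tolerance:
                        results.append(" ".join(candidate))
                    next_states.setdefault(new_width, []).append(candidate)
        states = next_states
    return results
```

A real unredaction tool would combine this geometric filter with a language model (in the classical n-gram sense) to rank the surviving candidates by plausibility in context.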
I recently tried to identify a font from a screenshot of an ad and used everything I could find, from WhatTheFont to LLMs. The LLMs were hopeless at identifying the font from the screenshot, but ChatGPT eventually led me to the correct result after I threw away the image and started describing the font in plain text: monospacing, a dot in the middle of the 0, and (presumably) wide usage. It turned out to be Ubuntu Mono. It was surprising that so many obscure fonts were suggested, none of which were even a reasonably close match, while Ubuntu Mono was completely overlooked.
What makes us think font information made it into the training set at all, rather than something more along the lines of "all characters that look like this one are to be interpreted as an 'a'"? That doesn't require knowing the font's name.
I suspect that few professional (paid-for) adverts use any fonts from dafont.com, and many fonts would in any case be unavailable to ordinary users. The current font-recognition programs are usually trained on commercially available fonts.
Every time I turn around these days I encounter someone ready to use an infinite amount of energy that is being paid for by other people, to 'simulate' some analog process by temporarily taking the reins of some data center that is burning megawatts of energy. We are being given the reins for 0.5 seconds but very soon the horse will gallop away unless we have a lot of money to spend.
StellarScience | 7 months ago
aaroninsf | 7 months ago
micromacrofoot | 7 months ago

Rastonbury | 7 months ago

she46BiOmUerPVj | 7 months ago
https://www.myfonts.com/pages/whatthefont
pwython | 7 months ago
https://www.dafont.com/forum/read/522670/font-identification
Lemaxoxo | 7 months ago

Doohickey-d | 7 months ago

double051 | 7 months ago
I agree that using the frontier models would be much more interesting.
Workaccount2 | 7 months ago

smallerize | 7 months ago
E.g. https://www.dafont.com/forum/read/569491/taylor-swift-font-p...
rubyn00bie | 7 months ago

k3liutZu | 7 months ago

elicash | 7 months ago

lblume | 7 months ago

mopsi | 7 months ago

larodi | 7 months ago

cormullion | 7 months ago

gdudeman | 7 months ago
It’s quite likely LLMs don’t “know” the fonts in the dataset, but they could figure many of them out.
aaron695 | 7 months ago
[deleted]
HocusLocus | 7 months ago
qezz | 7 months ago