The perfection in those examples makes me suspect that they are cherry-picked or part of the training data. The handwritten text in particular is not always clear and could reasonably be interpreted differently. I'd expect a machine-learning model to get at least some things wrong some of the time.
This is really awesome OP! Thank you for sharing :)
Stupid question, how well would this work on a PDF of a latex document?
I'm curious whether the developers are fans of The Big Bang Theory TV series. The characters were using a smartphone app... which of course was less useful, since it was fiction...
Suggestion: instead of making me download a pdf to see examples of what the results look like, maybe put them on the page directly. You can have a couple. Then put the details in the pdf.
I assume you just got a lot of installs from India, because the large publishing houses contract out many re-typesetting jobs that are basically to take scans of technical texts and convert them back into LaTeX.
Want a math-ish PDF and some LaTeX source for training on possible edge cases? Think I might get someone (or something) to read my dissertation this way...
Any way you could make this available outside the Mac App store? Apple seems to have decided I did something horrible and unforgivable by moving to a different country after creating an account, thus making it impossible for me to use the store.
We used their API to make a simple screenshot2latex tool (select screen region -> puts latex formula in clipboard). From my experience it still fails on a couple of fairly common things like:
I was looking for an API that provides math OCR. Great, going to integrate it into our app soon :-) Let me know if you want to add us to your "trusted by" section.
Mathematicians use operator overloading all the time. It would be nice to have a tool that explains to me what an equation actually means in a given context.
You are talking about the semantics of an equation, whereas this tool is already doing well if it correctly recovers the syntax (in LaTeX).
This is insanely impressive. Great work. Wish tools like this existed when I was still in school...almost makes me want to go back and do some more math :)
[+] [-] brownbat|8 years ago|reply
https://tex.stackexchange.com/questions/1443/what-is-the-sta...
Under API... you're already doing handwriting? This is uh, nontrivial work to say the least. Really impressive.
The endorsements are a nice touch. :)
Made me really curious how far the system goes, what cases break it.
Oh... nevermind. You have a PDF of examples here: https://docs.mathpix.com
It's homing in on equations without getting distracted by nearby Hanzi or Cyrillic, or even pictures of dogs. Wow.
I keep going back to dig through your resources and getting more impressed.
EDIT: I guess my only constructive criticism is that you should brag more. I like a simple landing page, but I think you've earned a short list of examples of corner cases you tackle well, if the whole API is packed into that free app, because they're really impressive.
[+] [-] yorwba|8 years ago|reply
If I wanted to use this in an application, I'd definitely want to see some accuracy figures on validation data as well as a few failure cases to see whether the output remains reasonable even when it is wrong.
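One way such a validation report could look, as a minimal sketch: it assumes you already have (predicted, reference) LaTeX string pairs from your own validation set, and it uses whitespace-insensitive exact match, which is a crude simplification (different LaTeX strings can render identically). The function names are made up for illustration, and nothing here reflects how Mathpix evaluates its models.

```python
from typing import List, Tuple


def normalize(latex: str) -> str:
    """Drop all whitespace so 'x ^ 2' and 'x^2' compare equal.

    Crude on purpose: it would also merge '\\alpha x' and '\\alphax'.
    """
    return "".join(latex.split())


def exact_match_report(pairs: List[Tuple[str, str]]):
    """Return (accuracy, failures) for a list of (predicted, reference) pairs."""
    failures = [(p, r) for p, r in pairs if normalize(p) != normalize(r)]
    accuracy = 1.0 - len(failures) / len(pairs) if pairs else 0.0
    return accuracy, failures
```

Returning the failing pairs alongside the accuracy figure makes it possible to check whether the wrong outputs are still reasonable, which is the second half of the request above.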
[+] [-] sinab|8 years ago|reply
One note I should make: it was not entirely clear (to me), upon a cursory view of the website, that the purpose of mathpix was to convert handwritten text into LaTeX. For some reason (maybe my coffee hasn't kicked in yet) I thought this was strictly intended to take screenshots of equations in an existing PDF document, on a website, etc. and convert them to LaTeX.
My thought at that point was "I wonder if they could do this for handwritten text" and then I looked at the docs and facepalmed..
[+] [-] cup-of-tea|8 years ago|reply
[+] [-] CJefferson|8 years ago|reply
This would be great for blind people, as PDFed LaTeX is extremely inaccessible, and I have to email authors of papers to get the original LaTeX from them, which is often lost.
[+] [-] tgb|8 years ago|reply
[+] [-] froindt|8 years ago|reply
Many years ago I "translated" course materials into a form which was accessible to a blind grad student. It was a really interesting job and taught me a lot about accessibility.
I was effectively doing latex, but without all the leading \ characters. It made learning latex comparatively easy.
What interface do you use to read equations? A screen reader speaking the straight latex, or do you have some middleware to make it more digestible when listened to?
[+] [-] JBorrow|8 years ago|reply
[+] [-] ocrcustomserver|8 years ago|reply
Mathpix only does the equation OCR part.
I've worked on this (for a PDF to HTML application), mail is in profile if you're interested.
[+] [-] jimnotgym|8 years ago|reply
https://www.springfieldspringfield.co.uk/view_episode_script...
[+] [-] saganus|8 years ago|reply
What kind of sorcery is this!?
Is this using deep learning or "regular" OpenCV or similar?
I would assume it's a highly tuned deep learning algo, but I'm not knowledgeable enough to distinguish a deep learning algo from a pile of rocks...
Edit: Aha, someone already asked this and got an answer.
https://news.ycombinator.com/item?id=16535467
[+] [-] typon|8 years ago|reply
Great software otherwise
[+] [-] bagrow|8 years ago|reply
Bug report: it appears that multiline summation subscripts are not recognized correctly. For example, Eq. 8 of [1]. These are often created using \substack as part of amsmath.
Awesome tool!
[1]: https://arxiv.org/pdf/1802.01194.pdf
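For readers unfamiliar with the construct, a multiline summation subscript typically looks like the following. This is an illustrative example with made-up indices, not Eq. 8 from [1].

```latex
% Requires \usepackage{amsmath} in the preamble
\[
  \sum_{\substack{1 \le i \le n \\ j \neq i}} a_{ij}
\]
```

The `\\` inside `\substack` is what stacks the two conditions under the summation sign, and that stacked layout is presumably what trips up the recognition.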
[+] [-] nicodjimenez|8 years ago|reply
[+] [-] sitkack|8 years ago|reply
I strongly suggest you talk to the publishers about integrating your tech into their TeX.
[+] [-] RhysU|8 years ago|reply
[+] [-] skiman10|8 years ago|reply
[+] [-] nicodjimenez|8 years ago|reply
[+] [-] lowglow|8 years ago|reply
[+] [-] srush|8 years ago|reply
http://lstm.seas.harvard.edu/latex/
Here's how to do it with OpenNMT/PyTorch:
http://opennmt.net/OpenNMT-py/im2text.html
[+] [-] vinni2|8 years ago|reply
[+] [-] aashu_dwivedi|8 years ago|reply
[+] [-] Uninen|8 years ago|reply
[+] [-] lliiffee|8 years ago|reply
[+] [-] nicodjimenez|8 years ago|reply
[+] [-] Ninn|8 years ago|reply
Also, it would be nice to have some info on the process. Does it work entirely locally, or are images uploaded to the cloud?
[+] [-] howToLearnSpark|8 years ago|reply
[+] [-] simonramstedt|8 years ago|reply
- \mathcal letters (recognized as non-mathcal)
- long equations (not recognized at all)
- multi-line equations (not recognized at all)
The screenshot2latex tool: https://github.com/rmst/screenshot2latex/blob/master/scripts...
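For anyone wondering what the core of such a tool involves, here is a rough sketch. It is not the code from the linked repository. It assumes an HTTP OCR service that accepts a base64 data URI in a JSON field and returns the LaTeX in a JSON field; the URL, the `src` and `latex` field names, and the function names are all placeholders to be checked against the real API docs. The HTTP call is passed in as a function so the logic can be tried without network access.

```python
import base64
from typing import Callable, Dict

# Placeholder endpoint: an assumption for illustration, not a real service URL.
OCR_URL = "https://ocr.example.invalid/latex"


def build_payload(image_bytes: bytes, mime: str = "image/png") -> Dict[str, str]:
    """Wrap raw image bytes in a base64 data URI (the 'src' field name is assumed)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"src": f"data:{mime};base64,{encoded}"}


def image_to_latex(
    image_bytes: bytes,
    post: Callable[[str, Dict[str, str]], Dict[str, str]],
) -> str:
    """Send the screenshot via `post` and return the 'latex' field, or '' if absent."""
    response = post(OCR_URL, build_payload(image_bytes))
    return response.get("latex", "")
```

Capturing the screen region and copying the result to the clipboard are left out, since both are platform specific.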
[+] [-] nicodjimenez|8 years ago|reply
[+] [-] screye|8 years ago|reply
Coming from a grad student who hates writing equations in latex. I will probably try this out.
[+] [-] _emacsomancer_|8 years ago|reply
[+] [-] xfer|8 years ago|reply
[+] [-] shafyy|8 years ago|reply
[+] [-] nicodjimenez|8 years ago|reply
[+] [-] andreareina|8 years ago|reply
[+] [-] nicodjimenez|8 years ago|reply
[+] [-] amelius|8 years ago|reply
[+] [-] ktpsns|8 years ago|reply
There are actually a number of ongoing research projects to establish standards for semantic representations of mathematics. Probably one of the best-funded running projects (budget ~10MEUR) with a work package on this topic is http://opendreamkit.org/ . As far as I know, work is going on at https://mathhub.info/ . I would like to provide a deep link, but the site seems to be in a broken state. Apparently people are working on it right now.
[+] [-] taeric|8 years ago|reply
The verbalization of most LaTeX commands can help you learn to read the equations. Sometimes.
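A toy sketch of what such verbalization could look like, assuming a small hand-written table of spoken names. The table and the function name are made up here, and real screen-reader tooling has to handle far more (arguments, nesting, grouping).

```python
import re

# Hypothetical, deliberately tiny table of spoken names for LaTeX commands.
SPOKEN = {
    "alpha": "alpha",
    "sum": "sum",
    "leq": "less than or equal to",
    "infty": "infinity",
    "cdot": "times",
}


def verbalize(latex: str) -> str:
    """Replace known backslash commands with spoken words; leave unknown ones as-is."""

    def speak(match):
        name = match.group(1)
        if name in SPOKEN:
            return " " + SPOKEN[name] + " "
        return match.group(0)

    spoken = re.sub(r"\\([A-Za-z]+)", speak, latex)
    return " ".join(spoken.split())  # collapse the padding spaces added above
```

Unknown commands pass through untouched, so the listener still hears the raw LaTeX for anything the table does not cover.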
[+] [-] dcchambers|8 years ago|reply