Awesome!! They're so great; it's been such a joy to be welcomed into their project.
And totally — especially if there is indeed a larger library still waiting to be excavated. Who knows how many completely unique texts are waiting to be read, if we just have the right method to do so?!
The images/animations on this page are fantastic at visually explaining something quite complicated. I would not have been able to understand the difficulty without them.
Credit for those goes to Jonny Hyman (who also does animations for Veritasium), Dejan Gotić, who did the fancy 3D animations, and JP Posma, who directed the entire project!
Man, there are probably a dozen different loosely formed ideas floating in my head right now — kudos on the exciting presentation; you really make the problem seem interesting. I may just have to give this a shot, though I think the odds of me figuring it out are exceptionally low. Still, working on cutting-edge problems is motivating, even if they're above my pay grade :)
You've nerd-sniped me good and proper. May the best team win!
I believe there could be quite a few different ways this could get solved. The potential solution space is huge, so you might just stumble upon something interesting if you wander places where no one else is looking.
Another note: the demo data [0] is behind an HTTP link. Consider serving it over HTTPS; my browser complained about downloading the data over plain HTTP.
More notes: your registration form to access the data requires a Google account. While this isn't an issue for most, I'm no longer comfortable doing anything with Google and prefer to have as little to do with them as possible.
I do a lot of brain image segmentation in my research using multi-atlas image segmentation, which involves diffeomorphic image registration from multiple labeled atlases... but the degree to which these layered sheets curl in on themselves seems like a daunting problem for a fully automated pipeline.
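For context, the multi-atlas approach mentioned above typically ends with a label-fusion step: each atlas, after being registered (deformed) onto the target image, proposes a label for every voxel, and the proposals are combined, commonly by majority vote. A minimal sketch of that fusion step, with toy data standing in for real registered atlases:

```python
from collections import Counter

def majority_vote_fusion(atlas_labels):
    """Fuse per-voxel label proposals from several registered atlases.

    atlas_labels: list of equal-length label lists, one per atlas
    (in practice each list comes from warping a labeled atlas onto
    the target image via diffeomorphic registration).
    Returns the per-voxel majority label.
    """
    fused = []
    for votes in zip(*atlas_labels):  # one tuple of votes per voxel
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused

# Three toy "atlases" voting on five voxels:
proposals = [
    [0, 1, 1, 2, 2],
    [0, 1, 2, 2, 2],
    [0, 0, 1, 2, 1],
]
print(majority_vote_fusion(proposals))  # -> [0, 1, 1, 2, 2]
```

This is only the easy half of the pipeline — the hard part for a curled scroll would be the registration itself, which is exactly where the comment above sees trouble.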
People spent many, many decades laboriously putting tiny little Dead Sea Scroll fragments back together like the world's worst jigsaw puzzle. I think that shows that if there is a way to do this that takes a lot of tedious manual labor over many decades, there are people who will be willing to do it. They just need tools that let them do the work without destroying the scrolls.
It seems quite possible that the solution isn't fully automated. N is in the hundreds. And modern AI does, in fact, involve quite a lot of hand-crafted data...
Reminds me of Kuzushiji recognition with ML, transcribing historical Japanese documents. Both are my favorite applications of ML: deciphering the past. This is really damn cool.
Matt_Cutts | 3 years ago
This is such a fascinating problem, and could have real benefits for society. Imagine uncovering ancient works that would otherwise be lost.
janpaul123 | 3 years ago
ccooffee | 3 years ago
natfriedman | 3 years ago
mkaic | 3 years ago
janpaul123 | 3 years ago
Good luck!!
janpaul123 | 3 years ago
all2 | 3 years ago
[0] https://gist.github.com/janpaul123/280262ebce904f7366fe4cc15...
all2 | 3 years ago
all2 | 3 years ago
jawns | 3 years ago
fortenforge | 3 years ago
HN discussion: https://news.ycombinator.com/item?id=33735503
Some did guess correctly that it was about decoding the Herculaneum papyri.
janpaul123 | 3 years ago
glfharris | 3 years ago
janpaul123 | 3 years ago
SubiculumCode | 3 years ago
irrational | 3 years ago
natfriedman | 3 years ago
thih9 | 3 years ago
SubiculumCode | 3 years ago
But seriously, way cool project.
unknown | 3 years ago
[deleted]
janpaul123 | 3 years ago
shrx | 3 years ago
dreamcompiler | 3 years ago
I'd guess terahertz might not provide sufficient resolution or penetrate deeply enough, or it might not even provide better discrimination than X-rays.
johnnyo | 3 years ago
https://news.ycombinator.com/item?id=33735503
jcuenod | 2 years ago
> Maybe decoding Herculaneum scrolls?
thih9 | 3 years ago
localplume | 3 years ago
janpaul123 | 3 years ago