Awesome!! They're so great; it's been such a joy to be welcomed into their project.
And totally — especially if there is indeed a larger library still waiting to be excavated. Who knows how many completely unique texts are waiting to be read, if we just have the right method to do so?!
The images/animations on this page are fantastic at visually explaining something quite complicated. I would not have been able to understand the difficulty without them.
Credit for those goes to Jonny Hyman (who also does animations for Veritasium), Dejan Gotić, who did the fancy 3D animations, and JP Posma, who directed the entire project!
Man, there are probably a dozen different loosely formed ideas floating in my head right now — kudos on the exciting presentation; you really make the problem seem interesting. I may just have to give this a shot, though I think the odds of me figuring it out are exceptionally low. Still, working on cutting-edge problems is motivating, even if they're above my pay grade :)
You've nerd-sniped me good and proper. May the best team win!
I believe there could be quite a few different ways this could get solved. The potential solution space is huge, so you might just stumble upon something interesting if you wander places where no one else is looking.
Another note: the demo data [0] is behind an HTTP link. Consider serving it over HTTPS; my browser complained about downloading the data over plain HTTP.
More notes: your registration form to access the data requires a Google account. While this isn't an issue for most, I'm no longer comfortable doing anything with Google and prefer to have as little to do with them as possible.
I do a lot of brain image segmentation in my research using multi-atlas image segmentation, which involves diffeomorphic image registration from multiple labeled atlases... but the degree to which these layered sheets curl in on themselves seems like a daunting problem for a fully automated pipeline.
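For context, the multi-atlas approach mentioned above typically ends with a label-fusion step: each atlas, after being registered (deformed) onto the target image, proposes a label for every voxel, and the proposals are combined, commonly by majority vote. A minimal sketch of that fusion step, with toy data standing in for real registered atlases:

```python
from collections import Counter

def majority_vote_fusion(atlas_labels):
    """Fuse per-voxel label proposals from several registered atlases.

    atlas_labels: list of equal-length label lists, one per atlas
    (in practice each list comes from warping a labeled atlas onto
    the target image via diffeomorphic registration).
    Returns the per-voxel majority label.
    """
    fused = []
    for votes in zip(*atlas_labels):  # one tuple of votes per voxel
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused

# Three toy "atlases" voting on five voxels:
proposals = [
    [0, 1, 1, 2, 2],
    [0, 1, 2, 2, 2],
    [0, 0, 1, 2, 1],
]
print(majority_vote_fusion(proposals))  # -> [0, 1, 1, 2, 2]
```

This is only the easy half of the pipeline — the hard part for a curled scroll would be the registration itself, which is exactly where the comment above sees trouble.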
People spent many, many decades laboriously putting tiny little Dead Sea Scroll fragments back together like the world's worst jigsaw puzzle. I think that shows that if there is a way to do this that takes a lot of tedious manual labor over many decades, there are people who will be willing to do it. They just need tools that let them do the work without destroying the scrolls.
It seems quite possible that the solution isn't fully automated. N is in the hundreds. And modern AI does, in fact, involve quite a lot of hand-crafted data...
Reminds me of Kuzushiji recognition with ML, transcribing historical Japanese documents. Both are my favorite applications of ML: deciphering the past. This is really damn cool.
Matt_Cutts | 3 years ago
This is such a fascinating problem, and could have real benefits for society. Imagine uncovering ancient works that would otherwise be lost.
janpaul123 | 3 years ago
ccooffee | 3 years ago
natfriedman | 3 years ago
mkaic | 3 years ago
janpaul123 | 3 years ago
Good luck!!
janpaul123 | 3 years ago
all2 | 3 years ago
[0] https://gist.github.com/janpaul123/280262ebce904f7366fe4cc15...
all2 | 3 years ago
all2 | 3 years ago
jawns | 3 years ago
fortenforge | 3 years ago
HN discussion: https://news.ycombinator.com/item?id=33735503
Some did guess correctly that it was about decoding the Herculaneum papyri.
janpaul123 | 3 years ago
glfharris | 3 years ago
janpaul123 | 3 years ago
SubiculumCode | 3 years ago
irrational | 3 years ago
natfriedman | 3 years ago
thih9 | 3 years ago
SubiculumCode | 3 years ago
But seriously, way cool project.
unknown | 3 years ago
[deleted]
janpaul123 | 3 years ago
shrx | 3 years ago
dreamcompiler | 3 years ago
I'd guess terahertz might not provide sufficient resolution or penetrate deeply enough, or it might not even provide better discrimination than X-rays.
johnnyo | 3 years ago
https://news.ycombinator.com/item?id=33735503
jcuenod | 2 years ago
> Maybe decoding Herculaneum scrolls?
thih9 | 3 years ago
localplume | 3 years ago
janpaul123 | 3 years ago