Show HN: Google Code Jam Archive

103 points | zibada | 2 years ago | zibada.guru

Hi HN,

As many of you already know, Google has discontinued their coding competitions and is shutting down the competitions website. Before it's gone forever, I've scraped everything I could from their website and set up my own archive.

(Disclaimer: I'm not affiliated with Google besides participating in these contests years ago)

While the initial version was ready back in March, today I'm rolling out a big and hopefully final update, so this might be a good time to share it with the HN community.

Everything is static HTML and SQLite archives (with a very minimal backend to serve files directly from SQLite), so you can easily grab your own copy. May these 3.5 million source files be a nice training dataset for some fancy future AI!
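For readers wondering what "serving files directly from SQLite" could look like, here is a minimal sketch. It is not the archive's actual backend: the `files` table, its `path` and `content` columns, and the helper names are all hypothetical, and the real databases may be laid out differently.

```python
import sqlite3
from typing import Optional

# Hypothetical schema, assumed for illustration only.
# The real archive databases may use other table and column names.
SCHEMA = "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, content BLOB)"


def open_archive(db_path: str) -> sqlite3.Connection:
    """Open (or create) a database that follows the assumed schema."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def read_file(conn: sqlite3.Connection, path: str) -> Optional[bytes]:
    """Return the stored bytes for `path`, or None if there is no such row."""
    row = conn.execute(
        "SELECT content FROM files WHERE path = ?", (path,)
    ).fetchone()
    if row is None:
        return None
    content = row[0]
    # SQLite may hand back str for TEXT values, so normalise to bytes.
    return content.encode("utf-8") if isinstance(content, str) else bytes(content)
```

A tiny web handler would then only need to map the request path to `read_file` and return a 404 when it gets `None`.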

Some useless stats:

Tabs or spaces? 28% tabs vs 72% spaces for the whole dataset, but quite unexpectedly, 66% tabs vs 34% spaces if we consider only the final rounds.

Most used languages: 63% C/C++, 20% Python, 12% Java, 2% C#, with others less than 1% each. For the final rounds it's 83% C/C++, 13% Java, 2% Python.
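The post does not say how the tabs-vs-spaces figure was computed. One plausible way, sketched here under the assumption that a file is classified by whichever character starts more of its indented lines, would be the following (the function names are made up for this example):

```python
from typing import Iterable, Optional


def indentation_style(source: str) -> str:
    """Classify one source file as 'tabs', 'spaces', or 'none'.

    Assumption: the style is whichever character begins more lines.
    Ties and files without any indented line are reported as 'none'.
    """
    tabs = spaces = 0
    for line in source.splitlines():
        if line.startswith("\t"):
            tabs += 1
        elif line.startswith(" "):
            spaces += 1
    if tabs > spaces:
        return "tabs"
    if spaces > tabs:
        return "spaces"
    return "none"


def tab_share(sources: Iterable[str]) -> Optional[float]:
    """Fraction of classified files that use tabs, or None if none classify."""
    styles = [indentation_style(s) for s in sources]
    tabs = styles.count("tabs")
    spaces = styles.count("spaces")
    total = tabs + spaces
    return tabs / total if total else None
```

Other definitions (for example, counting indented lines across the whole dataset rather than files) would give different percentages.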

18 comments

srvmshr|2 years ago

I'm surprised so few people are interested in this. It's a very good collection. I think it'd be very useful for people wanting to have a go at it, now that Google has sunsetted it. I am not sure how it is hosted, but maybe a GitHub repo with the source files could ensure they'll be archived even better for posterity. OP, have you posted them anywhere on GitHub/GitLab?

Edit: My bad, I saw the downloadable zip archives for every year. This is nice.

(Also, if we want to mirror this individually to help, can you post a brief how-to? Happy to mirror & share the load.)

zibada|2 years ago

For the problem statements, there is an official GitHub repo (see the link in the next comment). As for my zip archives, you basically unzip them all somewhere, point the webserver at that directory, and you are all set (plus, optionally, a basic Apache + mod_php setup or equivalent to serve user profiles and solutions).
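As a rough illustration of the "unzip and point a webserver at it" part, here is a sketch using only the Python standard library. It swaps in Python's built-in static file server for a real webserver, covers only the static pages (not the optional Apache + mod_php part for profiles and solutions), and the directory names and function names are placeholders.

```python
import zipfile
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path


def extract_archives(zip_dir: str, web_root: str) -> int:
    """Unzip every *.zip found in zip_dir into web_root; return how many."""
    root = Path(web_root)
    root.mkdir(parents=True, exist_ok=True)
    count = 0
    for archive in sorted(Path(zip_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(root)
        count += 1
    return count


def make_server(web_root: str, port: int = 8000) -> ThreadingHTTPServer:
    """Build a local static file server rooted at web_root (not started yet)."""
    handler = partial(SimpleHTTPRequestHandler, directory=web_root)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Usage would be something like `extract_archives("downloads", "mirror")` followed by `make_server("mirror").serve_forever()`. For a public mirror, a proper webserver such as nginx or Apache is the better choice.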

fn-mote|2 years ago

Ah, somehow I missed the part of the announcement where "the site will be fully shutting down on July 1, 2023".

WHY, GOOGLE???

Those problems are beautiful. What would it cost to leave the site up??

Leaving us absolutely nothing for posterity? Scorched earth.

:(

lrem|2 years ago

I'll offer a guess educated by almost a decade of working for Google: the security treadmill. How can you trust something to stay secure for more than a couple of weeks? Borg will outright refuse to run any binary that was compiled a couple of months ago. Couple that with the notorious culture of internal API instability and you get just this: you can't hope to have a thing just stay up by itself. So they did the better thing here and gave a schedule for the turndown, instead of "this will stop serving a couple of months after some binary no longer compiles".

forty|2 years ago

This is really cool, thanks!

I noticed that it's missing results from Distributed Code Jam, which ran for only 2 or 3 years, and I'm partial to it since that's the only Code Jam competition where I have ever managed to win a t-shirt :D

29athrowaway|2 years ago

You should contact the Internet Archive and get these archived there as well.

progbits|2 years ago

I don't have anything to add but since you didn't get any comments so far let me just say very nice job. Much easier and faster than navigating this on archive.org. I'll definitely be mirroring the archives later.

zaptheimpaler|2 years ago

Nice job, looks really cool! I might use this to help me prepare for upcoming interviews.

giuscri|2 years ago

omg, is leetcoding still a thing in a post-FAANG world?

lintim|2 years ago

This is cool! Thanks especially for sharing the db files; that is very helpful work.