Agreed 100%. Grad school is a time to explore all kinds of CS fields (or various topics within a field you're interested in). Right now I wish I had 28-hour days to mess around with GPU computing, compiler design, language theory, and random machine learning projects.
EDIT: it's also worth pointing out that Hadoop is not the 'be all, end all' framework for large-scale data analysis. Many problems do not fit the map-reduce paradigm and are better suited to other frameworks (I'm looking into GraphLab once I get free time, for example).
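For anyone new to the paradigm, here is a rough sketch in plain Python of the kind of problem that does fit map-reduce: counting words across documents. This is not Hadoop's API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are made up for illustration, and everything runs in one process.

```python
from collections import defaultdict


def map_phase(document):
    # Emit one (word, 1) pair per word; each document is handled independently.
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    # Group all values by key, as a framework would between map and reduce.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Combine the values for one key; no other key's data is needed.
    return key, sum(values)


def word_count(documents):
    # Run the three phases in sequence over a list of document strings.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

The fit is good here because each map call and each reduce call is independent of the others. Problems that need many dependent passes over the same data, such as iterative graph algorithms, tend to be the awkward cases, since each pass would typically become its own job.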
I am from Common Crawl. Apologies for the site being down! Too much traffic from HN :) We're working on getting it back up. The Google cache below has all the contents, so please refer to it for the moment. Here's the excerpted beginning:
Learn Hadoop and get a paper published
We’re looking for students who want to try out the Hadoop platform and get a technical report published.
Hadoop’s version of MapReduce will undoubtedly come in handy in your future research, and Hadoop is a fun platform to get to know. Common Crawl, a nonprofit organization with a mission to build and maintain an open crawl of the web that is accessible to everyone, has a huge repository of open data – about 5 billion web pages – and documentation to help you learn these too
It's the start of the summer and I'm in the 2nd year of my MS-CS. I began a simulation last night at 9PM that's still running (usually a 14-hour turnaround). I just got in, sat at my desk and thought "I'm bored, guess I'll check HN" and THIS is what I see.
I've been authoring a post on solving interview questions in map/reduce between experimental runs myself, so I guess I already hit the "bored graduate student" point too...
I apologize for the off-topic comment but I am hoping a few of the folks on here who are familiar with Hadoop can help me with a small career decision.
I'm fortunate enough to be up for two systems positions with companies in my area, and one of them is part of a group maintaining a Hadoop cluster. I've never maintained Hadoop infrastructure before, so I'm wondering if it's worth the "career capital" investment. There don't appear to be too many openings that I know of for Hadoop sysadmins, and while I don't expect to lose any skills working on it, I wonder whether this platform can realistically be expected to grow and become something I could build a valuable niche skill set around...
I would say go for it. Hadoop is becoming more of a norm in industry and I can tell you from personal experience that having prior hands-on Hadoop work makes your resume pop a little more.
Even better is if you're able to muck around in core Hadoop versus abstracted management layers like Cloudera. The things you have to learn in maintaining a Hadoop cluster, like cloud systems management, tuning, and running jobs, make for valuable experience, even if niche.
I don't see exploding demand in this. However, the experience will likely translate over to other domains and make you better for it. I'm not a sysadmin, but I set up a small Hadoop cluster and, wow, talk about a learning experience!
Perhaps I'm just jaded from NSDI a few weeks ago, but I'm tired of papers about Hadoop and systems like it. If you're going to do work with Hadoop, at least start comparing what you're doing with other work that improves Hadoop. There have been a bunch of papers about improving the scheduling or the shuffle phase or whatever, but they all compare against vanilla Hadoop and not each other.
cwhittle | 14 years ago
achompas | 14 years ago
scott_s | 14 years ago
LisaG | 14 years ago
http://webcache.googleusercontent.com/search?q=cache:http://...
groundshop | 14 years ago
malloc47 | 14 years ago
migpwr | 14 years ago
Any help would be appreciated...
oacgnol | 14 years ago
daviddumenil | 14 years ago
http://www.itjobswatch.co.uk/jobs/uk/hadoop.do
http://www.pcworld.com/businesscenter/article/255142/idc_exp...
http://www.mckinsey.com/Insights/MGI/Research/Technology_and...
Although I'd second the point that getting your hands dirty with the Hadoop core is likely more valuable than straight sysadmin work.
throwaway1979 | 14 years ago
RobAtticus | 14 years ago
Bootvis | 14 years ago
brucehart | 14 years ago
denzil_correa | 14 years ago
LisaG | 14 years ago