Monday, May 18, 2009

Hadoop

Everyone is buzzing about Hadoop, so I decided to check out the core source code and try to load it into Eclipse. Talk about fail. I read the wiki page for setting things up. It starts off recommending the exact Eclipse Subversion plugin that I don't use. Ok, that shouldn't be a problem: just check out to disk and import the project like I normally do. Ok, done.

Next problem: you have to use Ant to generate the Eclipse files. Um, what? Why can't you just check the .project/.classpath files in? They don't change that often and really don't need to be customized.

But wait, they do. Hadoop's Eclipse integration is tied directly to running Ant, because Ant needs to generate some files that are then used as part of the classpath so that things build within Eclipse. Nope, that doesn't work either. I tried.

Basically, after about 30 minutes of dicking around with things, I have no idea how to get a clean build of Hadoop within Eclipse. This, of course, makes contributing code to Hadoop nearly impossible.

What really should happen is that someone writes clear directions and provides an easy path to a working development environment. This should be the number one priority of every open source software project. If you can't get the build down, what makes me think you can write good code?

3 comments:

Jeff Hammerbacher said...

Hey,

Not sure if it will help, but try this screencast? http://www.cloudera.com/blog/2009/04/20/configuring-eclipse-for-hadoop-development-a-screencast/

Jon Scott Stevens said...

That screencast definitely shows it is possible to set things up so that it builds. Thanks for the link. I wish the wiki were updated to point to the screencast.

I also question the need to generate the files every time. Just check the files into svn and be done with it.

Wayne said...

I totally agree about the need for a clearly written "how-to-setup-dev-env".

I have even encountered that at the office.