This project is a collection of Test Automation requirements for the popular Apache Hadoop project.
Find out more about Apache Hadoop at wikipedia
A good starting point to get familiar with the topic is this post from Google enginers on testing challenges with distributed file system
What do you need to get started?
- Start by reviewing the testing projects listed at http://wiki.apache.org/hadoop/ProjectSuggestions#test_projects . This list may grow over time. The best way to make progress is pick an issue and read the Jira post to get the context.
- Next get upto speed on required testing tools.
- Create a new issue under this project with details on what you propose to do as a solution and share it with the mentor/owner.
- Once the proposal is reviewed and finalized get kracking
- After completing a logical portion get it reviewed by mentor. This helps avoid surprises later on especially for the first issue that you are handling
- Once complete notify mentor and submit code as a patch (preferrable) with the issue
- Next step is for mentor to review the patch and accept or provide review comments
Of course before you get to the testing requirements you would need a Hadoop setup to test. Best place is to start with the Hadoop wiki
Tools you should be familiar with :
- Junit : Unit testing tool
- Jira : Issue tracking , patch submission etc
Also browse through the developer mailing list to get familiar with the current issues at http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/