
1. What is Hadoop?
- Understanding distributed systems and Hadoop
- Comparing SQL databases and Hadoop
- Understanding MapReduce
- Counting words with Hadoop—running your first program
- History of Hadoop
2. Starting Hadoop
- The building blocks of Hadoop
- Setting up SSH for a Hadoop cluster
- Running Hadoop
- Web-based cluster UI
3. Components of Hadoop
- Working with files in HDFS
- Anatomy of a MapReduce program
- Reading and writing
4. Writing basic MapReduce programs
- Constructing the basic template of a MapReduce program
- Counting things
- Adapting for Hadoop's API changes
- Streaming in Hadoop
- Improving performance with combiners
5. Advanced MapReduce
- Chaining MapReduce jobs
- Joining data from different sources
- Creating a Bloom filter
6. Programming Practices
- Developing MapReduce programs
- Monitoring and debugging on a production cluster
- Tuning for performance
7. Cookbook
- Passing job-specific parameters to your tasks
- Probing for task-specific information
- Partitioning into multiple output files
- Inputting from and outputting to a database
- Keeping all output in sorted order
8. Managing Hadoop
- Setting up parameter values for practical use
- Checking the system’s health
- Setting permissions
- Managing quotas
- Enabling trash
- Removing DataNodes
- Adding DataNodes
- Managing NameNode and Secondary NameNode
- Recovering from a failed NameNode
- Designing network layout and rack awareness
- Scheduling jobs from multiple users
9. Running Hadoop in the cloud
- Introducing Amazon Web Services
- Setting up AWS
- Setting up Hadoop on EC2
- Running MapReduce programs on EC2
- Cleaning up and shutting down your EC2 instances
- Amazon Elastic MapReduce and other AWS services
10. Programming with Pig
- Installing Pig
- Running Pig
- Learning Pig Latin through Grunt
- Speaking Pig Latin
- Working with user-defined functions
- Working with scripts
- Seeing Pig in action—example of computing similar patents
11. Hadoop Related Technologies
- Hive
- Apache Accumulo
- Other Hadoop-related stuff
12. Case studies
- Converting 11 million image documents from the New York Times archive
- Mining data at China Mobile
- Recommending the best websites at StumbleUpon
- Building analytics for enterprise search—IBM’s Project ES2