You will learn how to use Apache Hadoop and write MapReduce programs. You will begin with a quick overview of installing Hadoop, setting it up in a cluster, and then proceed to writing data analytic programs. The course will present the basic concepts of MapReduce applications developed using Hadoop, including a close look at framework components, use of Hadoop for a variety of data analysis tasks, and numerous examples of Hadoop in action.
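As a taste of the programming model the course covers, here is the classic word-count computation expressed in plain Java, with the map and reduce phases written as separate functions. This is only a sketch of the MapReduce idea; it uses no Hadoop APIs and runs on a single machine, whereas the course's examples will use Hadoop's Java framework to run the same pattern across a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {
    // "Map" phase: emit a (word, 1) pair for every word in every input line.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\W+")) {
                if (!word.isEmpty()) {
                    pairs.add(Map.entry(word, 1));
                }
            }
        }
        return pairs;
    }

    // "Reduce" phase: group the pairs by word and sum the counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> input = List.of("to be or not to be");
        System.out.println(reduce(map(input)));
        // prints {be=2, not=1, or=1, to=2}
    }
}
```

In Hadoop proper, the framework handles the grouping between the two phases (the "shuffle") and distributes both functions over HDFS data, but the shape of the program is the same.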
The course will further examine related technologies such as Hive, Pig, and Apache Accumulo. Apache Accumulo is a highly scalable structured store based on Google's BigTable; it is written in Java and operates over the Hadoop Distributed File System (HDFS). Hive is data warehouse software for querying and managing large datasets. Pig is a platform for writing data analysis programs that take advantage of parallel execution.
Finally, you will examine how Hadoop works in and supports cloud computing, exploring examples with Amazon Web Services and case studies.
The course is intended for programmers, architects, and project managers who have to process large amounts of data offline. Students should have a basic familiarity with Linux administration and Java, as most code examples will be written in Java. Familiarity with basic statistical concepts (e.g. histogram, correlation) is helpful to appreciate the more advanced data processing examples.
This course will help prepare students for the CCDH (Cloudera Certified Developer for Apache Hadoop) certification. Students will receive a voucher for the exam and can take the certification exam on site at UMBC Training Centers any time after completion of the course.
Prerequisites: familiarity with Java and Linux
Register for Session
July 29 - August 1, 2013
Monday - Thursday
8:30 a.m. - 4:30 p.m.
- Download and complete the registration form to mail in or fax.
E-mail Heith Hart or call (443) 692-6599 if you have any questions about this course or if you would like to be added to the interest list.