Course description
In this course we look at the necessity of big data in today’s world and how it fits into your organization’s future. We then look at one big data framework in particular, Hadoop, because it is fully open source and community driven. We examine some of the pieces that make up Hadoop and demonstrate some of its functionality. Big data can enhance your organization’s competitive edge in many use cases: analyzing social media, sensor data, clickstream data, geographic data, emails, and the list goes on. By the end of the course, you should have a better understanding not only of what big data and Hadoop are but, more importantly, of where they fit into your organization’s structure and what they bring to the table.
Prerequisites
This course assumes an understanding of databases and database systems. You should also be familiar with Linux command syntax.
Learning Paths
This course is part of the following LearnNowOnline SuccessPaths™:
Hadoop
Meet the expert
Barry Solomon has over 23 years of experience as a consultant. He has developed with Fortran, C, C++, Visual Basic, Java, and Visual C#. His extensive database experience includes working with Microsoft Access, Microsoft SQL Server, MySQL, and Oracle. His expertise now includes big data, Hadoop in particular, and its surrounding ecosystem, as the limits of most modern database systems have been exceeded.
Course outline
What is Big Data
Purpose of Big Data (40:41)
- Introduction (00:22)
- End of the Line (05:16)
- OLTP and OLAP (03:07)
- Storage (02:39)
- Big Data as Supercomputer (05:11)
- Scalability (02:22)
- Hard Drives (03:20)
- Parallelism (02:13)
- Whose Data is it? (04:16)
- Being Competitive and Relevant (03:47)
- What is Big Data (02:06)
- Variety, velocity and volume (01:31)
- Leveraging and ROI (01:42)
- Data Data Everywhere (01:38)
- Throw it in the Lake of Data (00:52)
- Summary (00:10)
Use Cases (13:22)
- Introduction (00:15)
- Use Cases (02:38)
- Real Time vs Batch Processing (01:15)
- What About Databases (02:10)
- OLTP and OLAP (02:15)
- Appliances (00:57)
- Mix and Match (01:03)
- Schema on Write, on Read (00:50)
- NoSQL (01:40)
- Summary (00:12)
Hadoop
Hadoop (37:06)
- Introduction (00:16)
- What do I get (01:01)
- Hadoop (02:49)
- File System (01:56)
- MapReduce (01:18)
- YARN (02:02)
- Ecosystem (03:03)
- Pig (03:30)
- Hive (03:42)
- Mahout and Oozie (03:05)
- NoSQL (00:20)
- Sqoop (01:36)
- Ambari (01:51)
- ZooKeeper (01:20)
- The other pieces (07:16)
- Tez (01:37)
- Summary (00:16)
Hadoop Demo (29:50)
- Introduction (00:20)
- Where do we go? (05:44)
- Demo: Download (02:32)
- Demo: Putty (01:01)
- Demo: Web Interface (03:56)
- Demo: Back to Putty (02:57)
- Demo: Pig (03:00)
- Demo: Hive Table (05:40)
- Demo: Ambari (02:30)
- Demo: Query (01:56)
- Summary (00:09)