Duration : 3 Days
Overview : Processing high-volume, high-velocity datasets remains a challenge for conventional data processing platforms and requires tools and technologies beyond the standard data analytics stack. This course introduces the key tools and architectures used to manage and process such datasets, including Apache Hadoop and Apache Spark. It will equip students to build modern big data solutions that solve real-world problems.
At Course Completion : After completing this course, delegates will understand modern big data tools and architectures and will be able to use them to build solutions that leverage massive data resources and are deployed at scale.
Who Should Attend : Programmers with existing data analytics expertise who want to leverage modern big data technologies to build solutions that scale.
Pre-requisites : Delegates should be confident programmers with experience of programming for data analytics projects.
Outline : The course runs over three days and broadly follows the timetable shown below. It will be delivered through presentations, real-world examples, discussions, and workshops.
| Day   | Session   | Topic                                       |
| ----- | --------- | ------------------------------------------- |
| Day 1 | Morning   | Introducing Big Data Technologies           |
|       | Afternoon | Building and Using Distributed File Systems |
| Day 2 | Morning   | Query Languages & Environments              |
|       | Afternoon | Processing Streaming Data                   |
| Day 3 | Morning   | Machine Learning at Scale                   |
|       | Afternoon | Deploying Big Data Solutions                |