Analyzing Large Data Sets with Apache Spark Video Course

The Analyzing Large Data Sets with Apache Spark video course is developed by Software Engineering Courses professionals to help you pass the Analyzing Large Data Sets with Apache Spark exam.

Was $21.99
Now $19.99

Description

This course builds the knowledge and skills required to pass the Analyzing Large Data Sets with Apache Spark exam.

Curriculum For This Course

  • 1. Getting Started with Spark 5 Videos 00:27:14
    • Introduction 02:16
    • How to Use This Course 01:41
    • [Activity] Getting Set Up: Installing Python, a JDK, Spark, and its Dependencies 14:50
    • [Activity] Installing the MovieLens Movie Rating Dataset 03:35
    • [Activity] Run your first Spark program! Ratings histogram example. 04:52
  • 2. Spark Basics and Simple Examples 11 Videos 01:34:28
    • Introduction to Spark 10:11
    • The Resilient Distributed Dataset (RDD) 12:17
    • Ratings Histogram Walkthrough 13:33
    • Key/Value RDDs, and the Average Friends by Age Example 16:13
    • [Activity] Running the Average Friends by Age Example 05:39
    • Filtering RDDs, and the Minimum Temperature by Location Example 08:10
    • [Activity] Running the Minimum Temperature Example, and Modifying it for Maximums 05:08
    • [Activity] Running the Maximum Temperature by Location Example 03:21
    • [Activity] Counting Word Occurrences using flatMap() 07:28
    • [Activity] Improving the Word Count Script with Regular Expressions 04:44
    • [Activity] Sorting the Word Count Results 07:44
  • 3. Advanced Examples of Spark Programs 10 Videos 01:12:40
    • [Activity] Find the Most Popular Movie 05:52
    • [Activity] Use Broadcast Variables to Display Movie Names Instead of ID Numbers 08:23
    • Find the Most Popular Superhero in a Social Graph 04:29
    • [Activity] Run the Script - Discover Who the Most Popular Superhero is! 06:00
    • Superhero Degrees of Separation: Introducing Breadth-First Search 07:54
    • Superhero Degrees of Separation: Accumulators, and Implementing BFS in Spark 06:44
    • [Activity] Superhero Degrees of Separation: Review the Code and Run it 09:14
    • Item-Based Collaborative Filtering in Spark, cache(), and persist() 10:12
    • [Activity] Running the Similar Movies Script using Spark's Cluster Manager 10:54
    • [Exercise] Improve the Quality of Similar Movies 02:58
  • 4. Running Spark on a Cluster 8 Videos 00:49:01
    • Introducing Elastic MapReduce 05:08
    • [Activity] Setting up your AWS / Elastic MapReduce Account and Setting Up PuTTY 09:55
    • Partitioning 04:21
    • Create Similar Movies from One Million Ratings - Part 1 05:12
    • [Activity] Create Similar Movies from One Million Ratings - Part 2 11:27
    • Create Similar Movies from One Million Ratings - Part 3 03:28
    • Troubleshooting Spark on a Cluster 03:43
    • More Troubleshooting, and Managing Dependencies 05:47
  • 5. SparkSQL, DataFrames, and Datasets 3 Videos 00:20:16
    • Introducing SparkSQL 06:08
    • Executing SQL commands and SQL-style functions on a DataFrame 08:16
    • Using DataFrames instead of RDDs 05:52
  • 6. Other Spark Technologies and Libraries 5 Videos 00:31:06
    • Introducing MLlib 08:10
    • [Activity] Using MLlib to Produce Movie Recommendations 02:56
    • Analyzing the ALS Recommendations Results 04:53
    • Using DataFrames with MLlib 07:31
    • Spark Streaming and GraphX 07:36