This course introduces the fundamentals of modern data processing for data engineers, analysts, and IT professionals. You will learn the basics of Hadoop MapReduce, including how it works, how to compile and run Java MapReduce programs, and how to debug and extend them with other languages. Practical exercises include word counts across multiple files, log-file analysis, and large-scale text processing with datasets such as Wikipedia. You will also explore advanced MapReduce features and use tools such as YARN and the Job Browser. The course then covers higher-level tools such as Apache Pig and HiveQL for managing data workflows and running SQL-like queries. Finally, you will work with Apache Spark and PySpark to gain experience with modern data-analytics platforms. By the end of the course, you will have practical skills for working with big data in a variety of environments.
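To give a flavor of the word-count exercises mentioned above, here is a minimal sketch of the map/shuffle/reduce pattern in plain Python. It is illustrative only; the course's actual exercises use Hadoop's Java MapReduce API, and the sample lines are made up for this sketch.

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in a line of text."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    """Shuffle: group all values by key, as Hadoop does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reducer: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

# Hypothetical input standing in for files in HDFS.
lines = ["Hadoop and Spark", "Spark and PySpark"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle_phase(pairs))
# e.g. "and" and "spark" each appear twice across the two lines
```

In real Hadoop, the map and reduce functions run in parallel across the cluster and the shuffle happens over the network, but the data flow is the same.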



Hadoop and Spark Fundamentals: Unit 2
This course is part of Hadoop and Spark Fundamentals Specialization

Instructor: Pearson
What you'll learn
Understand and implement Hadoop MapReduce for distributed data processing, including compiling, running, and debugging applications.
Apply advanced MapReduce techniques to real-world scenarios such as log analysis and large-scale text processing.
Utilize higher-level tools like Apache Pig and HiveQL to streamline data workflows and perform complex queries.
Gain hands-on experience with Apache Spark and PySpark for modern, scalable data analytics.
Details to know
- Shareable certificate: add to your LinkedIn profile
- August 2025
- 4 assignments
Build your subject-matter expertise
- Learn new concepts from industry experts
- Gain a foundational understanding of a subject or tool
- Develop job-relevant skills with hands-on projects
- Earn a shareable career certificate

There is 1 module in this course
This module introduces the core components of big data processing with Hadoop and Spark. It covers the fundamentals of Hadoop MapReduce, including its operation, programming, and debugging, followed by practical examples such as word count, log analysis, and benchmarking. The module then explores higher-level tools like Apache Pig and Hive for simplified data processing. Finally, it introduces Apache Spark and its Python interface, PySpark, highlighting Spark’s growing role in data analytics.
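The log-analysis example mentioned above follows the same map/reduce pattern as word count: extract a key from each log line, then count occurrences per key. A minimal plain-Python sketch of that idea, using a hypothetical log format (not a dataset from the course):

```python
from collections import Counter

# Hypothetical access-log lines; the final field is the HTTP status code.
log_lines = [
    '10.0.0.1 - - "GET /index.html" 200',
    '10.0.0.2 - - "GET /missing" 404',
    '10.0.0.1 - - "POST /login" 200',
]

def extract_status(line):
    """Mapper step: pull the trailing status code out of one log line."""
    return line.rsplit(" ", 1)[-1]

# Reduce step: count how often each status code occurs.
status_counts = Counter(extract_status(line) for line in log_lines)
```

On a cluster, the same extraction would run as a mapper over log files in HDFS, with the counting done by reducers; in Spark, it collapses to a short chain of `map` and `reduceByKey` transformations.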
What's included
20 videos, 4 assignments
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
Frequently asked questions

Can I preview the course before I enroll?
Yes, you can preview the first video and view the syllabus before you enroll. You must purchase the course to access content not included in the preview.

When will I have access to the lectures and assignments?
If you decide to enroll in the course before the session start date, you will have access to all of the lecture videos and readings for the course. You'll be able to submit assignments once the session starts.

What will I get when I enroll?
Once you enroll and your session begins, you will have access to all videos and other resources, including reading items and the course discussion forum. You'll be able to view and submit practice assessments, and complete required graded assignments to earn a grade and a Course Certificate.