Building the Data Pipeline


Course Details

This course can only be taken as part of the Certificate in Big Data Technologies.


About this Course


This course focuses on how to acquire, store and process data for downstream analysis. You'll analyze and compare available technologies so you can make informed decisions as a data engineer. You'll also learn how to build data processing workflows on several data stack platforms, and how to design and run data pipelines for real-world business use cases.

What You’ll Learn

  • How a data lake can enhance the usability of your organization’s data 
  • Batch and stream processing using Spark, Flink and other tools
  • How to use Kafka to enable low-latency and real-time processing 
  • Data acquisition and modeling techniques 
  • Pipeline design and integration

Get Hands-On Experience

  • Organize and store data in a data lake and handle updates and changes to your data
  • Use Spark to connect to different data sources and process batch and streaming data 
  • Design and build a complete end-to-end data pipeline to support a realistic business case 

Program Overview

This course is part of the Certificate in Big Data Technologies.
