Azure Data Engineering
In this Azure Data Engineering training course, the student will learn how to implement and manage data engineering workloads on Microsoft Azure, using Azure services such as Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Stream Analytics, Azure Databricks, and others.
Course Rating:
4.8 (926)

Course Overview
The course focuses on common data engineering tasks such as orchestrating data transfer and transformation pipelines, working with data files in a data lake, creating and loading relational data warehouses, capturing and aggregating streams of real-time data, and tracking data assets and lineage.
Key Points
In this course, you will learn how to:
- Explore compute and storage options for data engineering workloads in Azure.
- Run interactive queries using serverless SQL pools.
- Perform data exploration and transformation in Azure Databricks.
- Explore, transform, and load data into the data warehouse using Apache Spark.
- Ingest and load data into the data warehouse.
- Transform data with Azure Data Factory or Azure Synapse Pipelines.
- Integrate data from notebooks with Azure Data Factory or Azure Synapse Pipelines.
- Support hybrid transactional/analytical processing (HTAP) with Azure Synapse Link.
- Implement end-to-end security with Azure Synapse Analytics.
- Perform real-time stream processing with Azure Stream Analytics.
- Create a stream processing solution with Event Hubs and Azure Databricks.
Course Curriculum
Microsoft Azure provides a comprehensive platform for data engineering, but what exactly is data engineering?
Complete this module to find out.
In this module you will learn how to:
- Identify common data engineering tasks
- Describe common data engineering concepts
- Identify Azure services for data engineering
Data lakes are a core element of data analytics architectures. Azure Data Lake Storage Gen2 provides a
scalable, secure, cloud-based solution for data lake storage.
In this module you will learn how to:
- Describe the key features and benefits of Azure Data Lake Storage Gen2
- Enable Azure Data Lake Storage Gen2 in an Azure Storage account
- Compare Azure Data Lake Storage Gen2 and Azure Blob storage
- Describe where Azure Data Lake Storage Gen2 fits in the stages of analytical processing
- Describe how Azure Data Lake Storage Gen2 is used in common analytical workloads
Learn about the features and capabilities of Azure Synapse Analytics, a cloud-based platform for big data
processing and analysis.
In this module, you'll learn how to:
- Identify the business problems that Azure Synapse Analytics addresses.
- Describe core capabilities of Azure Synapse Analytics.
- Determine when to use Azure Synapse Analytics.
With Azure Synapse serverless SQL pool, you can leverage your SQL skills to explore and analyze data
in files, without the need to load the data into a relational database.
After completing this module, you'll be able to:
- Identify capabilities and use cases for serverless SQL pools in Azure Synapse Analytics
- Query CSV, JSON, and Parquet files using a serverless SQL pool
- Create external database objects in a serverless SQL pool
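The kind of filter-and-aggregate query a serverless SQL pool runs over a file in the data lake can be sketched in plain Python. This is only an illustration of the concept (the sample city and sales values are made up for the example); in Synapse you would use T-SQL with OPENROWSET instead.

```python
import csv
import io

# A small in-memory "file" standing in for a CSV in a data lake.
raw = io.StringIO(
    "city,sales\n"
    "Austin,120\n"
    "Dallas,80\n"
    "Austin,45\n"
)
rows = list(csv.DictReader(raw))

# Equivalent in spirit to: SELECT SUM(sales) ... WHERE city = 'Austin'
total_austin = sum(int(r["sales"]) for r in rows if r["city"] == "Austin")
print(total_austin)  # 165
```

The point is that a serverless pool lets you run exactly this sort of query directly against files, with no loading step.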
By using a serverless SQL pool in Azure Synapse Analytics, you can use the ubiquitous SQL language to
transform data in files in a data lake.
After completing this module, you'll be able to:
- Use a CREATE EXTERNAL TABLE AS SELECT (CETAS) statement to transform data.
- Encapsulate a CETAS statement in a stored procedure.
- Include a data transformation stored procedure in a pipeline.
Why choose between working with files in a data lake or a relational database schema? With lake
databases in Azure Synapse Analytics, you can combine the benefits of both.
After completing this module, you will be able to:
- Understand lake database concepts and components
- Describe database templates in Azure Synapse Analytics
- Create a lake database
Apache Spark is a core technology for large-scale data analytics. Learn how to use Spark in Azure
Synapse Analytics to analyze and visualize data in a data lake.
After completing this module, you will be able to:
- Identify core features and capabilities of Apache Spark.
- Configure a Spark pool in Azure Synapse Analytics.
- Run code to load, analyze, and visualize data in a Spark notebook.
Data engineers commonly need to transform large volumes of data. Apache Spark pools in Azure Synapse
Analytics provide a distributed processing platform that they can use to accomplish this goal.
In this module, you will learn how to:
- Use Apache Spark to modify and save dataframes
- Partition data files for improved performance and scalability.
- Transform data with SQL
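Partitioning, mentioned above, simply means splitting data by the value of a column so that queries can skip irrelevant files. A minimal pure-Python sketch of the idea (the sales records are invented for the example; in Spark you would call `DataFrameWriter.partitionBy` instead):

```python
from collections import defaultdict

def partition_by(records, key):
    """Group records into partitions by a key field, the way Spark's
    partitionBy splits output into one folder per column value."""
    partitions = defaultdict(list)
    for rec in records:
        partitions[rec[key]].append(rec)
    return dict(partitions)

sales = [
    {"order": 1, "year": 2023, "amount": 120.0},
    {"order": 2, "year": 2024, "amount": 80.0},
    {"order": 3, "year": 2023, "amount": 45.5},
]

by_year = partition_by(sales, "year")
# In a data lake each key would become a folder such as year=2023/.
print(sorted(by_year))     # [2023, 2024]
print(len(by_year[2023]))  # 2
```

A query filtered on `year = 2023` would then only need to read the files under that one partition folder.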
Delta Lake is an open-source relational storage layer for Spark that you can use to implement a data
lakehouse architecture in Azure Synapse Analytics.
In this module, you'll learn how to:
- Describe core features and capabilities of Delta Lake.
- Create and use Delta Lake tables in a Synapse Analytics Spark pool.
- Create Spark catalog tables for Delta Lake data.
- Use Delta Lake tables for streaming data.
- Query Delta Lake tables from a Synapse Analytics SQL pool.
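The core idea behind Delta Lake's versioned tables can be sketched with a toy append-only log. The `VersionedTable` class below is entirely hypothetical and is not the real Delta transaction-log protocol; it only illustrates why earlier versions stay queryable ("time travel"):

```python
class VersionedTable:
    """Toy sketch: every write appends a commit to a log, so any
    earlier version of the table can still be read back."""

    def __init__(self):
        self._commits = []  # each commit holds the rows it added

    def append(self, rows):
        self._commits.append(list(rows))
        return len(self._commits) - 1  # version number of this commit

    def read(self, version=None):
        """Read rows as of a version, akin to `... VERSION AS OF n`."""
        upto = len(self._commits) if version is None else version + 1
        return [r for commit in self._commits[:upto] for r in commit]

table = VersionedTable()
v0 = table.append([{"id": 1}])
table.append([{"id": 2}])
print(len(table.read()))             # 2 rows at the latest version
print(len(table.read(version=v0)))   # 1 row when reading as of v0
```

Real Delta Lake adds ACID guarantees, schema enforcement, and streaming support on top of this log-structured design.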
Relational data warehouses are a core element of most enterprise Business Intelligence (BI) solutions, and
are used as the basis for data models, reports, and analysis.
In this module, you'll learn how to:
- Design a schema for a relational data warehouse.
- Create fact, dimension, and staging tables.
- Use SQL to load data into data warehouse tables.
- Use SQL to query relational data warehouse tables.
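The shape of a typical star-schema query (join a fact table to a dimension, then aggregate) can be illustrated in a few lines of Python. The product and sales rows are invented sample data; in the warehouse itself this would be a SQL JOIN with GROUP BY:

```python
# Dimension table: surrogate key -> descriptive attributes.
dim_product = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Gadget", "category": "Hardware"},
    3: {"name": "Manual", "category": "Docs"},
}

# Fact table: foreign key into the dimension plus a numeric measure.
fact_sales = [
    {"product_key": 1, "amount": 100.0},
    {"product_key": 2, "amount": 250.0},
    {"product_key": 3, "amount": 40.0},
    {"product_key": 1, "amount": 60.0},
]

# Join each fact row to its dimension row and group by category.
totals = {}
for row in fact_sales:
    category = dim_product[row["product_key"]]["category"]
    totals[category] = totals.get(category, 0.0) + row["amount"]

print(totals)  # {'Hardware': 410.0, 'Docs': 40.0}
```

Keeping measures in narrow fact tables and attributes in dimensions is what makes this join-then-aggregate pattern fast and flexible.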
A core responsibility for a data engineer is to implement a data ingestion solution that loads new data into
a relational data warehouse.
In this module, you'll learn how to:
- Load staging tables in a data warehouse
- Load dimension tables in a data warehouse
- Load time dimensions in a data warehouse
- Load slowly changing dimensions in a data warehouse
- Load fact tables in a data warehouse
- Perform post-load optimizations in a data warehouse
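Loading a slowly changing dimension is the trickiest step above. A Type 2 update keeps history by expiring the current row and inserting a new one. The sketch below is illustrative only (the `apply_scd2` helper and the customer rows are invented for the example; production loads do this in SQL MERGE statements):

```python
from datetime import date

def apply_scd2(dimension, key, new_attrs, as_of):
    """Type 2 slowly changing dimension update: expire the current
    row for the business key, then append a new current row."""
    for row in dimension:
        if row["key"] == key and row["current"]:
            if row["attrs"] == new_attrs:
                return dimension       # no change, nothing to do
            row["current"] = False     # close out the old version
            row["end_date"] = as_of
    dimension.append({"key": key, "attrs": new_attrs,
                      "start_date": as_of, "end_date": None,
                      "current": True})
    return dimension

dim = [{"key": "C1", "attrs": {"city": "Austin"},
        "start_date": date(2023, 1, 1), "end_date": None, "current": True}]
apply_scd2(dim, "C1", {"city": "Dallas"}, date(2024, 6, 1))
print(len(dim))                     # 2 versions of customer C1
print([r["current"] for r in dim])  # [False, True]
```

Because the old row is kept with its end date, fact rows loaded before the change still join to the attributes that were true at the time.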
Pipelines are the lifeblood of a data analytics solution. Learn how to use Azure Synapse Analytics
pipelines to build integrated data solutions that extract, transform, and load data across diverse systems.
In this module, you will learn how to:
- Describe core concepts for Azure Synapse Analytics pipelines.
- Create a pipeline in Azure Synapse Studio.
- Implement a data flow activity in a pipeline.
- Initiate and monitor pipeline runs.
Apache Spark provides data engineers with a scalable, distributed data processing platform, which can be
integrated into an Azure Synapse Analytics pipeline.
In this module, you will learn how to:
- Describe notebook and pipeline integration.
- Use a Synapse notebook activity in a pipeline.
- Use parameters with a notebook activity.
Learn how hybrid transactional / analytical processing (HTAP) can help you perform operational analytics
with Azure Synapse Analytics.
After completing this module, you'll be able to:
- Describe Hybrid Transactional / Analytical Processing patterns.
- Identify Azure Synapse Link services for HTAP.
Azure Synapse Link for SQL enables low-latency synchronization of operational data in a relational
database to Azure Synapse Analytics.
In this module, you'll learn how to:
- Understand key concepts and capabilities of Azure Synapse Link for SQL.
- Configure Azure Synapse Link for Azure SQL Database.
- Configure Azure Synapse Link for Microsoft SQL Server.
Azure Stream Analytics enables you to process real-time data streams and integrate the data they contain
into applications and analytical solutions.
In this module, you'll learn how to:
- Understand data streams.
- Understand event processing.
- Understand window functions.
- Get started with Azure Stream Analytics.
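Window functions are the key concept here: they chop an unbounded stream into bounded groups that can be aggregated. A tumbling window (fixed-size, non-overlapping) is the simplest kind, sketched below in plain Python with invented sensor events; in Stream Analytics you would write `GROUP BY TumblingWindow(second, 10)` instead:

```python
from collections import defaultdict

def tumbling_window_counts(events, window_seconds):
    """Count events per fixed-size, non-overlapping (tumbling) window.
    Events are (timestamp_seconds, payload) pairs."""
    counts = defaultdict(int)
    for ts, _payload in events:
        # Every timestamp maps to exactly one window start.
        window_start = (ts // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(counts)

# Readings arriving over 25 seconds, bucketed into 10-second windows.
events = [(1, "a"), (4, "b"), (9, "c"), (12, "d"), (23, "e")]
print(tumbling_window_counts(events, 10))  # {0: 3, 10: 1, 20: 1}
```

Hopping and sliding windows generalize this by letting windows overlap, but the bucketing idea is the same.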
Azure Stream Analytics provides a real-time data processing engine that you can use to ingest streaming
event data into Azure Synapse Analytics for further analysis and reporting.
After completing this module, you'll be able to:
- Describe common stream ingestion scenarios for Azure Synapse Analytics.
- Configure inputs and outputs for an Azure Stream Analytics job.
- Define a query to ingest real-time data into Azure Synapse Analytics.
- Run a job to ingest real-time data, and consume that data in Azure Synapse Analytics.
By combining the stream processing capabilities of Azure Stream Analytics and the data visualization
capabilities of Microsoft Power BI, you can create real-time data dashboards.
In this module, you'll learn how to:
- Configure a Stream Analytics output for Power BI.
- Use a Stream Analytics query to write data to Power BI.
- Create a real-time data visualization in Power BI.
In this module, you'll evaluate whether Microsoft Purview is the right choice for your data discovery and
governance needs.
By the end of this module, you'll be able to:
- Evaluate whether Microsoft Purview is appropriate for your data discovery and governance needs.
- Describe how the features of Microsoft Purview work to provide data discovery and governance.
Learn how to integrate Microsoft Purview with Azure Synapse Analytics to improve data discoverability
and lineage tracking.
After completing this module, you'll be able to:
- Catalog Azure Synapse Analytics database assets in Microsoft Purview.
- Configure Microsoft Purview integration in Azure Synapse Analytics.
- Search the Microsoft Purview catalog from Synapse Studio.
- Track data lineage in Azure Synapse Analytics pipeline activities.
Azure Databricks is a cloud service that provides a scalable platform for data analytics using Apache
Spark.
In this module, you'll learn how to:
- Provision an Azure Databricks workspace.
- Identify core workloads and personas for Azure Databricks.
- Describe key concepts of an Azure Databricks solution.
Azure Databricks is built on Apache Spark and enables data engineers and analysts to run Spark jobs to
transform, analyze, and visualize data at scale.
In this module, you'll learn how to:
- Describe key elements of the Apache Spark architecture.
- Create and configure a Spark cluster.
- Describe use cases for Spark.
- Use Spark to process and analyze data stored in files.
- Use Spark to visualize data.
Using pipelines in Azure Data Factory to run notebooks in Azure Databricks enables you to automate data
engineering processes at cloud scale.
In this module, you'll learn how to:
- Describe how Azure Databricks notebooks can be run in a pipeline.
- Create an Azure Data Factory linked service for Azure Databricks.
- Use a Notebook activity in a pipeline.
- Pass parameters to a notebook.
Learning Outcome
Upon completing the Azure Data Engineering course, participants can expect to achieve the following learning outcomes:
- Understand core data concepts and Azure data services.
- Implement data storage solutions using Azure Blob Storage and Azure Data Lake Storage.
- Design and build ETL pipelines with Azure Data Factory.
- Integrate and orchestrate data from various sources using Azure Synapse Analytics and Azure Stream Analytics.
- Develop and optimize data models for business intelligence with Azure Analysis Services and Power BI.
- Monitor and manage data solutions for performance and reliability using Azure Monitor and Log Analytics.
- Ensure data security and compliance with Azure Policy and Azure Blueprints.
- Deploy and maintain big data solutions using Azure HDInsight and Databricks.
- Prepare for the Microsoft Certified: Azure Data Engineer Associate certification exam.
- Gain hands-on experience through labs, projects, and real-world scenarios in an Azure environment.
Who this course is for?
The following professionals can advance their careers with Azure Data Engineering training:
- Data Analysts
- Data Engineers
- Data Scientists
- Database Architects
- IT professionals and freshers who wish to build a career with advanced data warehousing tools.
FAQs
What does the Azure Data Engineering course cover?
The Azure Data Engineering course is designed to provide comprehensive training on data management, data integration, data transformation, and data analytics using Microsoft Azure's data services.
Who is this course for?
This course is ideal for aspiring data engineers, data architects, data analysts, and IT professionals who want to learn how to design, build, and manage data solutions on the Azure platform.
What are the prerequisites?
Basic knowledge of databases, SQL, and cloud computing concepts is recommended. Some familiarity with data analysis and programming languages like Python or SQL can be beneficial.
What topics are covered?
The course covers core data concepts, data storage solutions, ETL pipeline design, data integration and orchestration, data modeling, performance monitoring, data security and compliance, and big data solutions.
How long does the course take?
The course duration varies depending on the provider and the learning format (self-paced, instructor-led, or blended). Typically, it can range from a few weeks to a couple of months.
Does the course prepare me for certification?
Yes, this course is designed to prepare you for the Microsoft Certified: Azure Data Engineer Associate certification exam by covering all relevant topics and providing practice exams.
Are there hands-on labs and projects?
Yes, the course includes hands-on labs, projects, and real-world scenarios to help you apply theoretical knowledge and gain practical experience in using Azure data services.
What will I gain from completing the course?
Completing this course will equip you with the skills to design and implement data solutions on Azure, improve your employability in the data engineering field, and prepare you for industry-recognized certification.
What learning formats are available?
The course can be delivered in various formats, including online self-paced learning, live instructor-led training, and hybrid models that combine both approaches.
What support is available during the course?
Students typically have access to resources such as online forums, instructor Q&A sessions, technical support, and study materials to assist them throughout the course.
Certifications
Microsoft Certified: Azure Data Engineer Associate
- Exam: DP-203: Data Engineering on Microsoft Azure
- Description: This certification validates your skills in integrating, transforming, and consolidating data from various structured and unstructured data systems into structures suitable for building analytics solutions.
Microsoft Certified: Azure Fundamentals
- Exam: AZ-900: Microsoft Azure Fundamentals
- Description: While not specific to data engineering, this entry-level certification covers the basics of cloud services and how those services are provided with Microsoft Azure, providing a good foundation for further Azure certifications.
Microsoft Certified: Azure Data Fundamentals
- Exam: DP-900: Microsoft Azure Data Fundamentals
- Description: This certification demonstrates foundational knowledge of core data concepts and how they are implemented using Microsoft Azure data services. It is a good starting point before moving on to more advanced data engineering certifications.
Enroll Free Demo Class
Have Any Questions?
- 521 Dyson Rd, Haines City, FL 33844
- info@rbcloudgenx.com
- +1 8043007153
Prerequisites
There are no mandatory prerequisites for learning Azure Data Engineering, but basic knowledge of or experience with data warehousing and SQL is an added advantage.
Our Other Courses
The AWS Data Engineering course is designed to provide in-depth knowledge and practical skills required to build, maintain, and optimize data pipelines.
In this Azure Data Engineering training course, the student will learn how to implement and manage data engineering workloads on Microsoft Azure.
RBCloudGenX’s online Snowflake training is aligned with the latest curriculum of the Snowflake certification exam.
The RBCloudGenX Databricks course is designed to equip learners with the knowledge and skills necessary to work with Apache Spark and Databricks.