DP-203T00: Data Engineering on Microsoft Azure

Course dates: Call for in-class or live virtual dates. Duration: 4 days. Price: $2,380 USD



In this course, you will learn about data engineering as it pertains to working with batch and real-time analytical solutions using Azure data platform technologies. You will begin by understanding the core compute and storage technologies used to build an analytical solution, then learn how to interactively explore data stored in files in a data lake. Finally, you will learn how to build real-time analytical solutions.

Duration: 4 Days

Prerequisites

  • Knowledge of cloud computing and core data concepts
  • Professional experience with data solutions
  • AZ-900 - Azure Fundamentals course
  • DP-900 - Microsoft Azure Data Fundamentals course

Audience

  • Data Professionals, Data Architects, and Business Intelligence Professionals who want to learn about data engineering and building analytical solutions using data platform technologies that exist on Microsoft Azure
  • Data Analysts and Data Scientists who work with analytical solutions built on Microsoft Azure

Learning Objectives

  • Combine streaming and batch processing with a single pipeline
  • Index data lake storage for query and workload acceleration
  • Query Parquet data with serverless SQL pools
  • Create views with serverless SQL pools and external tables for Parquet and CSV files
  • Secure access to data in a data lake when using serverless SQL pools
  • Configure data lake security using Role-Based Access Control (RBAC) and Access Control Lists (ACLs)
  • Cache a DataFrame for faster subsequent queries
  • Aggregate data stored in a DataFrame
  • Perform Data Exploration in Synapse Studio
  • Ingest data with Spark notebooks in Azure Synapse Analytics
  • Perform petabyte-scale ingestion with Azure Synapse Pipelines
  • Import data with PolyBase and COPY using T-SQL
  • Execute code-free transformations at scale with Azure Synapse Pipelines
  • Create a data pipeline to import poorly formatted CSV files
  • Secure the Azure Synapse Analytics supporting infrastructure, managed services, workspace, and workspace data
  • Query Azure Cosmos DB with Apache Spark and serverless SQL pools for Azure Synapse Analytics
  • Use Stream Analytics to process real-time data from Event Hubs
  • Scale the Azure Stream Analytics job to increase throughput through partitioning
  • Stream data from a file and write it out to a distributed file system
  • Use sliding windows to aggregate over chunks of data rather than all data
  • Apply watermarking to remove stale data
  • Connect to Event Hubs to read and write streams
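
The streaming objectives above (sliding windows, watermarking to remove stale data) can be illustrated in miniature. The sketch below is plain Python, not Azure Stream Analytics or Spark Structured Streaming code; the window size, slide interval, lateness bound, and function names are all illustrative assumptions:

```python
from collections import defaultdict

# Conceptual sketch of sliding-window aggregation with a watermark.
# Stream engines such as Azure Stream Analytics do this at scale;
# all sizes and names here are illustrative assumptions.

WINDOW_SIZE = 10       # seconds of event time each window covers
SLIDE = 5              # seconds between window start times
ALLOWED_LATENESS = 15  # watermark lag: events older than this are dropped


def window_starts(ts):
    """All sliding-window start times whose window [start, start+WINDOW_SIZE)
    contains event time `ts`."""
    start = (ts // SLIDE) * SLIDE  # latest window start at or before ts
    starts = []
    while start > ts - WINDOW_SIZE and start >= 0:
        starts.append(start)
        start -= SLIDE
    return starts


def aggregate(events):
    """events: iterable of (event_time_seconds, value) in arrival order.
    Returns per-window sums, discarding events that arrive behind the
    watermark (i.e., stale data)."""
    max_ts = 0  # highest event time seen so far
    sums = defaultdict(int)
    for ts, value in events:
        max_ts = max(max_ts, ts)
        watermark = max_ts - ALLOWED_LATENESS
        if ts < watermark:
            continue  # stale event: behind the watermark, drop it
        for start in window_starts(ts):
            sums[(start, start + WINDOW_SIZE)] += value
    return dict(sums)
```

For example, with events `[(1, 1), (6, 2), (30, 5), (2, 7)]`, the late arrival at event time 2 is dropped because the watermark has advanced to 15 after the event at time 30, while the others are summed into every sliding window that contains them.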

Topics

  • Explore Compute and Storage Options for Data Engineering Workloads
  • Run Interactive Queries Using Azure Synapse Analytics Serverless SQL Pools
  • Data Exploration and Transformation in Azure Databricks
  • Explore, Transform, and Load Data into the Data Warehouse Using Apache Spark
  • Ingest and Load Data into the Data Warehouse
  • Transform Data with Azure Data Factory or Azure Synapse Pipelines
  • Orchestrate Data Movement and Transformation in Azure Synapse Pipelines
  • End-To-End Security with Azure Synapse Analytics
  • Support Hybrid Transactional Analytical Processing with Azure Synapse Link
  • Real-time Stream Processing with Stream Analytics
  • Create a Stream Processing Solution with Event Hubs and Azure Databricks