Microsoft Azure Databricks for Data Engineering

By: Coursera

  • Introduction to Azure Databricks
    • Describe the capabilities of Azure Databricks and the Apache Spark notebook for processing huge files. Describe the Azure Databricks platform and identify the types of tasks well-suited for Apache Spark. Describe the architecture of an Azure Databricks Spark Cluster and Spark Jobs.
  • Read and write data in Azure Databricks
    • Describe how Azure Databricks supports day-to-day data-handling functions, such as reads, writes, and queries.
  • Data processing in Azure Databricks
    • Process data in Azure Databricks by defining DataFrames to read and process the data. Perform data transformations in DataFrames and execute actions to display the transformed data. Explain the difference between a transformation and an action, lazy and eager evaluation, wide and narrow transformations, and other optimizations in Azure Databricks.
  • Work with DataFrames in Azure Databricks
    • Use the DataFrame Column class in Azure Databricks to apply column-level transformations, such as sorts, filters, and aggregations. Use advanced DataFrame functions to manipulate data, apply aggregates, and perform date and time operations in Azure Databricks.
  • Platform architecture, security, and data protection in Azure Databricks
    • Describe the Azure Databricks platform architecture and how it is secured. Use Azure Key Vault to store secrets used by Azure Databricks and other services. Access Azure Storage with Key Vault-based secrets.
  • Delta Lake
    • Describe how to use Delta Lake to create, append, and upsert data to Apache Spark tables, taking advantage of built-in reliability and optimizations. Describe the Azure Databricks Delta Lake architecture.
  • Analyze streaming data and create production workloads
    • Process streaming data with Azure Databricks structured streaming. Create production workloads on Azure Databricks with Azure Data Factory.
  • Create a data architecture
    • Describe how to put Azure Databricks notebooks under version control in an Azure DevOps repo and build deployment pipelines to manage your release process. Describe how to integrate Azure Databricks with Azure Synapse Analytics as part of your data architecture. Describe best practices for workspace administration, security, tools, integration, Databricks Runtime, HA/DR, and clusters in Azure Databricks.
  • Practice Exam on Data engineering with Azure Databricks
    • Prepare for the Microsoft Certified: Azure Data Engineer Associate exam.
