DP-200: Implementing an Azure Data Solution

Notes

Implement data storage solutions (40-45%)

Implement non-relational data stores

  • implement a solution that uses Cosmos DB, Data Lake Storage Gen2, or Blob storage
  • implement data distribution and partitions
  • implement a consistency model in Cosmos DB
  • provision a non-relational data store
  • provide access to data to meet security requirements
  • implement for high availability, disaster recovery, and global distribution
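
The "data distribution and partitions" objective is easier to reason about with a small model of hash partitioning. The sketch below is only an illustration: the SHA-256 hash, the function names, and the sample keys are assumptions made for this example and are not how Cosmos DB or any other Azure service assigns partitions internally. It shows why a low-cardinality or dominant partition key value concentrates load on one partition.

```python
import hashlib
from collections import Counter


def assign_partition(partition_key: str, partition_count: int) -> int:
    """Map a partition key value to a partition index using a stable hash.

    Hypothetical helper: real services use their own hashing schemes.
    """
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def partition_load(keys, partition_count):
    """Count how many items land on each partition index."""
    return Counter(assign_partition(k, partition_count) for k in keys)


# Hypothetical workloads: one dominant key value versus many distinct values.
skewed_keys = ["tenant-1"] * 90 + [f"tenant-{i}" for i in range(2, 12)]
spread_keys = [f"device-{i}" for i in range(100)]

if __name__ == "__main__":
    print("skewed:", dict(partition_load(skewed_keys, 4)))
    print("spread:", dict(partition_load(spread_keys, 4)))
```

With the skewed workload, at least 90 of the 100 items share one partition regardless of the hash used, because identical key values always map to the same partition.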

Implement relational data stores

  • configure elastic pools
  • configure geo-replication
  • provide access to data to meet security requirements
  • implement for high availability, disaster recovery, and global distribution
  • implement data distribution and partitions for SQL Data Warehouse
  • implement PolyBase

Manage data security

  • implement data masking
  • encrypt data at rest and in motion
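
As a study aid for the data masking objective, the following is a minimal sketch of a partial mask: expose a fixed number of leading and trailing characters and replace the middle with a padding string. It is loosely modeled on the idea behind a partial masking function in SQL dynamic data masking, but `partial_mask` is a hypothetical Python helper, and its handling of short values is a simplification, not the database engine's exact behavior.

```python
def partial_mask(value: str, prefix: int, padding: str, suffix: int) -> str:
    """Expose the first `prefix` and last `suffix` characters of `value`.

    Everything in between is replaced by `padding`. If the value is too
    short to hide anything, only the padding is returned (an assumption
    made for this sketch).
    """
    if prefix < 0 or suffix < 0:
        raise ValueError("prefix and suffix must be non-negative")
    if prefix + suffix >= len(value):
        return padding
    tail = value[len(value) - suffix:] if suffix > 0 else ""
    return value[:prefix] + padding + tail


if __name__ == "__main__":
    # Hypothetical sample value.
    print(partial_mask("555-123-4567", 0, "XXX-XXX-", 4))  # XXX-XXX-4567
```

Masking of this kind only changes what a reader of the value sees; the stored value is unchanged, which is why it is listed separately from encryption at rest and in motion.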

Manage and develop data processing (25-30%)

Develop batch processing solutions

  • develop batch processing solutions by using Data Factory and Azure Databricks
  • ingest data by using PolyBase
  • implement the integration runtime for Data Factory
  • create linked services and datasets
  • create pipelines and activities
  • create and schedule triggers
  • implement Azure Databricks clusters, notebooks, jobs, and autoscaling
  • ingest data into Azure Databricks

Develop streaming solutions

  • configure input and output
  • select the appropriate windowing functions
  • implement event processing using Stream Analytics
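
To make the windowing objective concrete, here is a minimal sketch of a tumbling window: fixed-size, non-overlapping time buckets, with each event counted in exactly one bucket. The function name, the integer-second timestamps, and the sample events are assumptions for illustration. The half-open bucket boundaries used here are a simplification and may not match the exact boundary semantics of Stream Analytics.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per fixed, non-overlapping window.

    `events` is an iterable of (timestamp_seconds, payload) tuples.
    Returns a dict mapping each window's start time to its event count.
    """
    counts = defaultdict(int)
    for timestamp, _payload in events:
        # Integer division snaps the timestamp to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical event stream: (timestamp in seconds, payload).
events = [(1, "a"), (4, "b"), (10, "c"), (12, "d"), (25, "e")]

if __name__ == "__main__":
    print(tumbling_window_counts(events, 10))  # {0: 2, 10: 2, 20: 1}
```

Hopping and sliding windows differ in that windows can overlap, so a single event may contribute to more than one result.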

Monitor and optimize data solutions (30-35%)

Monitor data storage

  • monitor relational and non-relational data sources
  • implement Blob storage monitoring
  • implement Data Lake Store monitoring
  • implement SQL Database monitoring
  • implement SQL Data Warehouse monitoring
  • implement Cosmos DB monitoring
  • configure Azure Monitor alerts
  • implement auditing by using Azure Log Analytics

Monitor data processing

  • design and implement Data Factory monitoring
  • monitor Azure Databricks
  • monitor HDInsight processing
  • monitor Stream Analytics

Optimize Azure data solutions

  • troubleshoot data partitioning bottlenecks
  • optimize Data Lake Storage
  • optimize Stream Analytics
  • optimize SQL Data Warehouse
  • optimize SQL Database
  • manage data life cycle

Terminology

  • Non-relational Data Stores
    • Cosmos DB
    • Data Lake Storage Gen2
    • Blob Storage
  • Relational Data Stores
    • Elastic Pools
    • Geo-Replication
  • Data Masking
  • Data Factory
  • Azure Databricks
  • PolyBase
  • Stream Analytics
  • Azure Data Explorer
  • Monitoring Data Storage
    • Blob Storage Monitoring
    • Data Lake Storage Monitoring
    • SQL Database Monitoring
    • Azure Synapse Analytics Monitoring
    • Cosmos DB Monitoring
    • Azure Data Explorer Monitoring
  • Azure Monitor
  • Azure Log Analytics
  • Data Factory Pipelines
  • Azure Synapse Analytics
  • Data Lifecycle

Blogs

Training

Videos

References