
Master Microsoft DP-750: Azure Databricks Data Engineering Prep

Breaking into cloud data engineering demands more than ambition—it requires battle-tested preparation that mirrors real-world Azure Databricks challenges. Our DP-750 practice materials transform exam anxiety into confidence through three flexible formats: downloadable PDFs for offline study sessions, web-based platforms for instant accessibility, and desktop software featuring timed simulations. Each question reflects the intricate scenarios you'll face managing Delta Lake architectures, orchestrating ETL pipelines, and optimizing Spark workloads—skills that hiring managers at Fortune 500 companies actively seek. Thousands of data professionals have already accelerated their journey from database administrators to cloud data engineers using our continuously updated question banks. Whether you're commuting, between meetings, or dedicating focused weekend hours, you'll experience exam conditions that eliminate surprises on test day. Stop second-guessing your readiness and start practicing with materials designed by certified Azure architects who understand exactly what separates passing scores from career-launching excellence.

Question 1

You have an Azure Databricks workspace that is enabled for Unity Catalog.

You have a complex job named Job1 that contains eight tasks. Job1 takes multiple hours to complete.

During the last job run, the final task fails due to a transient issue.

You need to retry the last task without rerunning tasks that have already completed.

What should you do?


Correct : B

CORRECT ANSWER: B - Repair the current job run.

According to Microsoft Learn on Lakeflow Jobs run repair, the 'Repair Run' feature allows you to re-run only the failed task (and any tasks that depend on it) within an existing job run, while skipping all tasks that already completed successfully. This directly satisfies: 'retry the last task without rerunning tasks that have already completed.' For a complex job with eight tasks that takes multiple hours, re-running from scratch (Option C - Restart) would waste significant time and compute resources. Option A (update job parameters) changes parameters for future runs but doesn't re-execute the failed task. Option D (disable and re-enable the schedule) creates a new run from the beginning rather than repairing the existing run.
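
Outside the UI, the same repair can be requested through the Jobs REST API. The sketch below only builds the JSON body for a `POST /api/2.1/jobs/runs/repair` call and does not send it. The run ID `1234` and the task key `task8` are hypothetical placeholders, and the helper name `build_repair_request` is an illustration, not part of any Databricks SDK.

```python
import json


def build_repair_request(run_id, failed_task_keys=None, latest_repair_id=None):
    """Build the JSON body for a Jobs API 2.1 `runs/repair` request.

    Only the listed tasks (or all failed tasks) are re-run; tasks that
    already succeeded in the original run are left untouched.
    """
    body = {"run_id": run_id}
    if failed_task_keys:
        # Re-run specific tasks by their task keys.
        body["rerun_tasks"] = list(failed_task_keys)
    else:
        # No explicit list given: ask the service to re-run every failed task.
        body["rerun_all_failed_tasks"] = True
    if latest_repair_id is not None:
        # Needed when the run has already been repaired at least once.
        body["latest_repair_id"] = latest_repair_id
    return body


if __name__ == "__main__":
    # Hypothetical values: run 1234, whose final task has the key "task8".
    print(json.dumps(build_repair_request(1234, ["task8"]), indent=2))
```

Sending this body with a workspace URL and token is left out on purpose, since those details depend on the environment.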


Question 2

You have a Lakeflow Spark Declarative Pipelines (SDP) pipeline in Azure Databricks. The pipeline ingests transaction data into a table named Table1.

You need to ensure that in the event of an invalid record, the pipeline continues to run. The solution must meet the following requirements:

* Invalid records must NOT be written to Table1.

* Invalid records must be preserved for review.

* Minimize development effort.

What should you do?


Correct : B

CORRECT ANSWER: B - Define a pipeline expectation.

According to Microsoft Learn on Lakeflow Spark Declarative Pipelines (SDP) data quality expectations, defining a pipeline expectation with the @dlt.expect_or_drop decorator is the correct approach when the pipeline must continue running, invalid records must NOT be written to the target table, and invalid records must be preserved for review. SDP automatically tracks dropped records as expectation metrics in the pipeline event log, which satisfies the 'preserve for review' requirement with minimal development effort. Option A (advanced quarantine logic) requires additional development effort to implement custom routing. Option C (WHERE clauses in downstream queries) filters records at query time rather than at ingestion, meaning invalid records would still be written to Table1. Option D (check constraint on Table1) would cause write failures and stop the pipeline, violating the 'pipeline continues to run' requirement.
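
The drop-and-count behavior described above can be illustrated without a pipeline runtime. The following is a plain-Python simulation of the semantics, not the `dlt` API itself: in a real pipeline the rule would be attached with the `@dlt.expect_or_drop` decorator named in the explanation. The expectation name `valid_amount`, the sample rows, and the metric field names are assumptions chosen for illustration.

```python
def expect_or_drop(records, name, predicate):
    """Simulate drop-on-violation: keep passing rows, count the rest.

    Returns the rows that would reach the target table plus a small
    metrics record similar in spirit to what an event log would hold.
    """
    kept = [row for row in records if predicate(row)]
    metrics = {
        "name": name,
        "passed_records": len(kept),
        "failed_records": len(records) - len(kept),
    }
    return kept, metrics


# Hypothetical transaction rows; the second one violates the rule.
rows = [
    {"id": 1, "amount": 25.0},
    {"id": 2, "amount": -4.0},
    {"id": 3, "amount": 10.0},
]

valid_rows, metrics = expect_or_drop(rows, "valid_amount", lambda r: r["amount"] > 0)

# Processing continues with the valid rows; the violation is only counted.
print(valid_rows)
print(metrics)
```

Note that the simulation never raises on a bad row, which mirrors the requirement that the pipeline keeps running.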


Question 3

You have an Azure Databricks workspace that is attached to a Unity Catalog metastore named metastore1. Metastore1 contains a catalog named catalog1.

You need to create a new schema named schema2 that meets the following requirements:

* Is contained in catalog1

* Uses abfss://container@storageaccount.dfs.core.windows.net/data as the managed location

Which SQL statement should you execute?


Correct : A

CORRECT ANSWER: A - CREATE SCHEMA catalog1.schema2 MANAGED LOCATION 'abfss://container@storageaccount.dfs.core.windows.net/data';

According to Microsoft Learn on Unity Catalog schema management, the correct DDL syntax to create a schema within a specific catalog and set a custom managed storage location is: CREATE SCHEMA <catalog>.<schema> MANAGED LOCATION '<location-path>'. The two-level name (catalog.schema) explicitly places the new schema inside catalog1. The MANAGED LOCATION clause specifies where Unity Catalog will store the managed tables and volumes created under this schema. Option B is incorrect because CREATE CATALOG would create a new catalog, not a schema, and schema2 would become a catalog name. Option C is incorrect because LOCATION (without MANAGED) is used for external locations, not managed schema storage paths. Option D is incorrect because WITH DBPROPERTIES is used for custom metadata key-value pairs, not for specifying storage paths.


Question 4

You use Databricks Asset Bundles to manage two jobs and an app.

You need to deploy the bundle to development and production environments. The solution must meet the following requirements:

* Deploy the app to both environments.

* Deploy only one job to development.

* Minimize administrative effort.

What should you use?


Correct : D

CORRECT ANSWER: D - A targets node in a databricks.yml file.

According to Microsoft Learn on Databricks Asset Bundles (DAB), the targets node in databricks.yml defines environment-specific configurations (for example development, staging, and production). Each target can carry its own settings and resource definitions, which are merged with or override the bundle's top-level configuration. A single databricks.yml with a targets node therefore meets all three requirements with minimal administrative effort: the app is deployed to both targets, while the development target is configured to deploy only one of the two jobs. Option A (resources node) defines all resources but doesn't handle environment-specific filtering. Option B (separate databricks.yml files) requires maintaining multiple files and increases administrative effort. Option C (variables node) handles parameterization but not resource inclusion/exclusion.
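
One way to picture the per-target behavior is to mirror a databricks.yml as a Python dictionary and compute which resources each target would receive. This is a simplified model of the idea, assuming that resources declared at the top level apply to every target and that a target's own resources are added on top; it is not how the Databricks CLI is implemented. The names `app1`, `job1`, `job2`, `dev`, and `prod` are hypothetical.

```python
# Dictionary mirroring the shape of a databricks.yml file (hypothetical names).
bundle = {
    "resources": {
        # Declared once at the top level: shared by every target.
        "apps": {"app1": {}},
        "jobs": {"job1": {}},
    },
    "targets": {
        "dev": {},
        # The second job is declared only under the production target.
        "prod": {"resources": {"jobs": {"job2": {}}}},
    },
}


def resources_for_target(bundle, target):
    """Return the sorted resource names a target would deploy, per kind."""
    merged = {kind: dict(items) for kind, items in bundle.get("resources", {}).items()}
    target_resources = bundle["targets"][target].get("resources", {})
    for kind, items in target_resources.items():
        # Target-level definitions are layered on top of the shared ones.
        merged.setdefault(kind, {}).update(items)
    return {kind: sorted(items) for kind, items in merged.items()}


print(resources_for_target(bundle, "dev"))
print(resources_for_target(bundle, "prod"))
```

Keeping everything in one file means the shared app and job are declared once, and only the difference between environments lives under the target.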


Question 5

You have an Azure Databricks workspace that is enabled for Unity Catalog.

You have an Apache Spark Structured Streaming job that writes data to a Delta table.

After the cluster restarts, the streaming job reprocesses previously ingested data.

You need to prevent the streaming job from reprocessing the data after the cluster restarts.

What should you do?


Correct : B

CORRECT ANSWER: B - Configure a checkpoint location for the streaming query.

According to Microsoft Learn on Apache Spark Structured Streaming, checkpointing is the mechanism that enables fault tolerance and exactly-once processing semantics. The checkpoint stores the streaming query's progress, including the offset of the last successfully processed batch, in a durable storage location (typically ADLS Gen2 or DBFS). When the cluster restarts, the streaming query reads the checkpoint to determine the last committed offset and resumes from that point, preventing reprocessing of already-ingested data. Option A (increase trigger interval) controls how frequently micro-batches run but does not prevent reprocessing on restart. Option C (watermark) handles late-arriving data in event-time processing but does not prevent reprocessing on restart. Option D (enable CDF) tracks changes to a Delta table but does not affect streaming source offset management.
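
In PySpark, the checkpoint is set as an option on the stream writer. The helper below is a minimal sketch that assumes `stream_df` is a streaming DataFrame created elsewhere (for example with `spark.readStream`); the function name, the checkpoint path, and the table name are hypothetical placeholders.

```python
def start_checkpointed_stream(stream_df, checkpoint_path, target_table):
    """Start a streaming write to a Delta table with a checkpoint location.

    `stream_df` is expected to be a PySpark streaming DataFrame. Progress
    (offsets and commits) is persisted under `checkpoint_path`, so a query
    restarted with the same path resumes where the last one stopped.
    """
    return (
        stream_df.writeStream
        .format("delta")
        .outputMode("append")
        # Must be a durable location that survives cluster restarts.
        .option("checkpointLocation", checkpoint_path)
        .toTable(target_table)
    )


# Usage inside a notebook or job (placeholder values, not run here):
# query = start_checkpointed_stream(
#     stream_df,
#     "abfss://<container>@<account>.dfs.core.windows.net/checkpoints/table1",
#     "catalog1.schema1.table1",
# )
```

Each streaming query should use its own checkpoint path, and the path should stay the same across restarts so the saved offsets are found again.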

