An Ode to Databricks Asset Bundles

Maciej Tarsa

If you are on Databricks and not using Asset Bundles, you are missing out. If you are on a different data platform - you are also missing out, and probably wish you were on Databricks. None of the other platforms have anything quite equivalent that makes development and deployment workflows this smooth.

Databricks Asset Bundles (DABs) allow you to deploy code, workflows, and workspace-level resources as a single unit by declaratively defining them in YAML files. They bridge the gap between higher-level infrastructure management (where tools like Terraform or OpenTofu shine) and UI-driven development in the Databricks workspace.

And let’s be honest - YAML is king these days!

Some of the killer features in DABs:

Targets

Targets enable reproducible deployments across multiple environments, very similar to dbt profiles. The same bundle can be deployed to dev, test, or prod with environment-specific configuration.
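
For illustration, a minimal pair of targets in databricks.yml might look like this (the workspace URLs below are placeholders):

targets:
  dev:
    default: true
    workspace:
      host: https://dev-workspace.cloud.databricks.com/
  prod:
    workspace:
      host: https://prod-workspace.cloud.databricks.com/

Deploying to a given environment is then just a matter of passing --target to the deploy command.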

Variables

Variables let you control deployment behaviour per environment. For example, you might deploy a pipeline in a paused state in dev, but automatically unpause it in production.
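
As a sketch of that paused-in-dev pattern - the variable, job, and cron values below are made up for illustration:

variables:
  pause_status:
    description: Whether deployed schedules start paused
    default: PAUSED

targets:
  dev:
    variables:
      pause_status: PAUSED
  prod:
    variables:
      pause_status: UNPAUSED

resources:
  jobs:
    sample_job:
      name: sample_job
      schedule:
        quartz_cron_expression: "0 0 6 * * ?"
        timezone_id: UTC
        pause_status: ${var.pause_status}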

Deployment mode (multi-user development)

In development mode, multiple users can work on the same bundle without stepping on each other’s toes. Databricks automatically prefixes deployed resources with the developer’s username, making it easy to distinguish who owns what.
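
Switching this on is just the mode setting on a target; a minimal sketch:

targets:
  dev:
    mode: development
    default: true
  prod:
    mode: production

In development mode, deployed resources are prefixed with the deploying user's name and schedules are paused by default; production mode drops the prefix and applies additional sanity checks.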

Notebooks as Python or SQL files

One of the most underrated features: you can write notebooks as .py or .sql files, as long as they include the Databricks notebook header. Example sample_task.py:

# Databricks notebook source

# COMMAND ----------
# DBTITLE 1,Set the parameters
dbutils.widgets.text("environment", "dev")
env = dbutils.widgets.get("environment")

# COMMAND ----------
# DBTITLE 1,Create Schemas
# Create control schema in bronze catalog
spark.sql(f"create schema if not exists {env}_bronze.control")
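
The same header convention works for SQL files - a minimal, hypothetical sample_task.sql:

-- Databricks notebook source

-- COMMAND ----------
-- DBTITLE 1,Create control schema
create schema if not exists dev_bronze.control;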

And the job definition that calls the Python notebook:

resources:
  jobs:
    sample_job:
      name: sample_job
      email_notifications:
        on_failure:
          - ${var.email_notifications}
      tasks:
        - task_key: sample_task
          notebook_task:
            notebook_path: "sample_task.py"

This approach gives you the best of both worlds:

  • Clean, git-friendly Python and SQL files
  • Full notebook behaviour inside Databricks
  • Easy deployment via Asset Bundles

Config-driven development

DABs encourage truly config-driven workflows. You can build reusable Python logic that reads from configuration files defining tables, sources, or pipelines. Adding a new table often means changing config only - no need to touch the core code.
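
As a rough sketch (the config file name and keys here are hypothetical), a notebook task could loop over a YAML file listing tables instead of hard-coding them:

# Databricks notebook source

# COMMAND ----------
# DBTITLE 1,Create tables from config
import yaml

dbutils.widgets.text("environment", "dev")
env = dbutils.widgets.get("environment")

# config/tables.yml (hypothetical) might look like:
# tables:
#   - name: customers
#     columns: "id bigint, name string"
with open("config/tables.yml") as f:
    config = yaml.safe_load(f)

# Adding a new table means adding a config entry - this loop never changes
for table in config["tables"]:
    spark.sql(
        f"create table if not exists {env}_bronze.control.{table['name']} ({table['columns']})"
    )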

Permission management

Asset Bundles can manage permissions alongside resource creation, both at the workspace level and per resource. Example workspace-level permissions in a target:

targets:
  dev:
    workspace:
      host: https://xxx.cloud.databricks.com/
      root_path: /Shared/.bundle/${bundle.name}/${bundle.target}
    permissions:
      - group_name: dev_admin_group
        level: CAN_MANAGE
      - group_name: dev_developer_group
        level: CAN_VIEW

And example per-resource permissions on a dashboard:

resources:
  dashboards:
    pipelines_dashboard:
      display_name: Pipelines Dashboard
      warehouse_id: ${var.warehouse_id}
      file_path: config/pipelines_dashboard.lvdash.json
      embed_credentials: true
      permissions:
        - group_name: ${var.env_code}_developer_group
          level: CAN_RUN
        - group_name: ${var.env_code}_admin_group
          level: CAN_RUN

Validation

databricks bundle validate checks your YAML for syntax and correctness before deployment. It’s ideal for CI/CD pipelines to prevent configuration errors from sneaking in. This validation complements - but does not replace - testing and validation of your SQL and Python logic.
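
In CI this is a single extra step. A minimal sketch, assuming GitHub Actions and token-based authentication via repository secrets:

# .github/workflows/validate.yml (sketch, not a drop-in workflow)
on: pull_request
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks bundle validate --target dev
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}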

UI export

You don’t need to write everything from scratch. Many supported resources can be created (or partially created) in the Databricks UI and then exported to YAML. You’ll usually need to replace hard-coded values with variables, but it’s a great way to experiment and discover available parameters.
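
The CLI can also pull an existing workspace resource into a bundle for you; for example, to generate YAML from a job built in the UI (the job ID is a placeholder):

databricks bundle generate job --existing-job-id 123456

This writes the job definition into your bundle (and pulls down any referenced notebook files), ready to be cleaned up and parameterised with variables.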

Custom bundle templates

Beyond Databricks’ starter templates, you can create your own. This is especially useful in larger organisations where you want to enforce standards and reuse common patterns from day one.
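
A custom template is essentially a bundle skeleton plus a databricks_template_schema.json describing the prompts the init wizard should ask; a minimal sketch:

{
  "properties": {
    "project_name": {
      "type": "string",
      "default": "my_project",
      "description": "Name of the new bundle"
    }
  }
}

Teams can then bootstrap new projects by pointing databricks bundle init at the template's Git URL or local path.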

What we’d like to see next

Selective deployment

Something akin to dbt’s --select, allowing only specific resources to be deployed from a bundle. This would be particularly useful in development.

Conditional logic or templating

Native Jinja support or conditional resource creation would unlock even more flexibility.

Getting started

Getting started with Asset Bundles is refreshingly simple. Follow one of the official Databricks tutorials.
All you need is a Databricks account and a terminal with the Databricks CLI installed.

databricks bundle init

This launches a friendly wizard that helps you choose a template and configure your project.
Before deploying, validate your bundle:

databricks bundle validate

Deploy to dev:

databricks bundle deploy --target dev --profile dev

Run a specific job or pipeline:

databricks bundle run --target dev job_name

Destroy resources when you’re done:

databricks bundle destroy --target dev

Learning resources and examples

Databricks provides a solid and continually growing set of resources to help you get up to speed with Asset Bundles.
The current list of supported bundle resources is available in the official documentation. New resource types are added regularly, so it’s worth checking back as Asset Bundles continue to evolve.

In addition, Databricks maintains a set of sample Asset Bundles that demonstrate common patterns and best practices across different use cases. These examples are a great reference when you’re trying to understand how a particular resource is defined in YAML or how multiple resources fit together in a real-world bundle.

Best practice tips

  • Use one workspace per environment (e.g. dev, test, prod)
  • Split workloads across multiple Asset Bundle repos as your platform grows (e.g. ingestion, control logic, data modelling)
  • Deploy to dev locally; deploy to higher environments via CI/CD
  • When introducing new resource types, prototype them in the UI first and export to YAML

Takeaways

Databricks Asset Bundles are a powerful addition to the Databricks ecosystem. They simplify development workflows, encourage best practices, and make collaboration significantly easier. Combined with source control and Terraform for underlying infrastructure, they complete the Databricks resource management story.

Here at Mechanical Rock, we love working with Databricks. If you have Databricks workflows or challenges you’d like help with, please don’t hesitate to get in touch!