If you are on Databricks and not using Asset Bundles, you are missing out. If you are on a different data platform - you are also missing out, and probably wish you were on Databricks. No other platform has an equivalent that makes development and deployment workflows this smooth.
Databricks Asset Bundles (DABs) allow you to deploy code, workflows, and workspace-level resources as a single unit by declaratively defining them in YAML files. They bridge the gap between higher-level infrastructure management (where tools like Terraform or OpenTofu shine) and UI-driven development in the Databricks workspace.
And let’s be honest - YAML is king these days!
Some of the killer features in DABs:
Targets
Targets enable reproducible deployments across multiple environments, very similar to dbt profiles. The same bundle can be deployed to dev, test, or prod with environment-specific configuration.
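As a rough sketch (the workspace hostnames and bundle name below are placeholders, not real values), targets are declared alongside the bundle in databricks.yml:

# databricks.yml - illustrative sketch; hostnames and names are placeholders
bundle:
  name: my_sample_bundle

targets:
  dev:
    default: true
    workspace:
      host: https://dev-workspace.cloud.databricks.com

  prod:
    workspace:
      host: https://prod-workspace.cloud.databricks.com

Running databricks bundle deploy --target prod picks up the prod block; the rest of the bundle definition stays exactly the same.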
Variables
Variables let you control deployment behaviour per environment. For example, you might deploy a pipeline in a paused state in dev, but automatically unpause it in production.
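A hedged sketch of that pattern - the variable name and cron expression here are our own illustration, not lifted from the docs:

# Illustrative sketch: a variable with a per-target override
variables:
  pause_status:
    description: Whether deployed schedules start paused
    default: PAUSED

targets:
  prod:
    variables:
      pause_status: UNPAUSED

resources:
  jobs:
    sample_job:
      schedule:
        quartz_cron_expression: "0 0 2 * * ?"
        timezone_id: UTC
        pause_status: ${var.pause_status}

In dev the schedule deploys paused by default; the prod override flips it to UNPAUSED.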
Deployment mode (multi-user development)
In development mode, multiple users can work on the same bundle without stepping on each other’s toes. Databricks automatically prefixes deployed resources with the developer’s username, making it easy to distinguish who owns what.
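In bundle terms this is controlled by the mode setting on a target - a minimal sketch (hostnames are placeholders):

# Sketch: development mode prefixes deployed resource names with the deploying user
targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://dev-workspace.cloud.databricks.com

  prod:
    mode: production
    workspace:
      host: https://prod-workspace.cloud.databricks.com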
Notebooks as Python or SQL files
One of the most underrated features: you can write notebooks as .py or .sql files, as long as they include the Databricks notebook header.
Example sample_task.py:
# Databricks notebook source

# COMMAND ----------

# DBTITLE 1,Set the parameters
dbutils.widgets.text("environment", "dev")
env = dbutils.widgets.get("environment")

# COMMAND ----------

# DBTITLE 1,Create Schemas
# Create control schema in bronze catalog
spark.sql(f"create schema if not exists {env}_bronze.control")
And the job definition that calls this notebook:
resources:
  jobs:
    sample_job:
      name: sample_job
      email_notifications:
        on_failure:
          - ${var.email_notifications}
      tasks:
        - task_key: sample_task
          notebook_task:
            notebook_path: "sample_task.py"
This approach gives you the best of both worlds:
- Clean, git-friendly Python and SQL files
- Full notebook behaviour inside Databricks
- Easy deployment via Asset Bundles
Config-driven development
DABs encourage truly config-driven workflows. You can build reusable Python logic that reads from configuration files defining tables, sources, or pipelines. Adding a new table often means changing config only - no need to touch the core code.
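As an illustration only - the file layout and field names below are hypothetical, not a Databricks convention - a small config file might list the tables a shared ingestion notebook loops over:

# config/tables.yml - hypothetical config consumed by reusable Python logic
tables:
  - name: customers
    source_path: /Volumes/dev_bronze/landing/customers
    format: json
  - name: orders
    source_path: /Volumes/dev_bronze/landing/orders
    format: csv

Adding another table then becomes another entry in this list rather than new code.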
Permission management
Asset Bundles can manage permissions alongside resource creation, both at the workspace level and per resource. Example workspace-level permissions in a target:
targets:
  dev:
    workspace:
      host: https://xxx.cloud.databricks.com/
      root_path: /Shared/.bundle/${bundle.name}/${bundle.target}
    permissions:
      - group_name: dev_admin_group
        level: CAN_MANAGE
      - group_name: dev_developer_group
        level: CAN_VIEW
And an example of permissions on an individual resource - in this case a dashboard:
resources:
  dashboards:
    pipelines_dashboard:
      display_name: Pipelines Dashboard
      warehouse_id: ${var.warehouse_id}
      file_path: config/pipelines_dashboard.lvdash.json
      embed_credentials: true
      permissions:
        - group_name: ${var.env_code}_developer_group
          level: CAN_RUN
        - group_name: ${var.env_code}_admin_group
          level: CAN_RUN
Validation
databricks bundle validate checks your YAML for syntax and correctness before deployment. It’s ideal for CI/CD pipelines to prevent configuration errors from sneaking in.
This validation complements - but does not replace - testing and validation of your SQL and Python logic.
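As a sketch of what that CI step could look like - assuming GitHub Actions and the databricks/setup-cli action, with authentication simplified to repository secrets:

# .github/workflows/validate.yml - illustrative sketch, not a complete pipeline
name: Validate bundle
on: pull_request

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks bundle validate --target dev
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}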
UI export
You don’t need to write everything from scratch. Many supported resources can be created (or partially created) in the Databricks UI and then exported to YAML. You’ll usually need to replace hard-coded values with variables, but it’s a great way to experiment and discover available parameters.
Custom bundle templates
Beyond Databricks’ starter templates, you can create your own. This is especially useful in larger organisations where you want to enforce standards and reuse common patterns from day one.
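Under the hood a custom template is a directory containing a databricks_template_schema.json (which defines the prompts shown by databricks bundle init) and a template folder whose files use Go-template placeholders. A tiny, illustrative fragment of a templated databricks.yml - the variable names here are our own:

# template/{{.project_name}}/databricks.yml.tmpl - illustrative fragment
bundle:
  name: {{.project_name}}

targets:
  dev:
    default: true
    workspace:
      host: {{.dev_workspace_host}}

New projects can then be started with databricks bundle init pointed at the template's local path or Git URL.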
What we’d like to see next
Selective deployment
Something akin to dbt’s --select, allowing only specific resources to be deployed from a bundle. This would be particularly useful in development.
Conditional logic or templating
Native Jinja support or conditional resource creation would unlock even more flexibility.
Getting started
Getting started with Asset Bundles is refreshingly simple.
Follow one of the official Databricks tutorials - all you need is a Databricks account, a terminal, and the Databricks CLI installed. Then run:
databricks bundle init
This launches a friendly wizard that helps you choose a template and configure your project.
Before deploying, validate your bundle:
databricks bundle validate
Deploy to dev:
databricks bundle deploy --target dev --profile dev
Run a specific job or pipeline:
databricks bundle run --target dev job_name
Destroy resources when you’re done:
databricks bundle destroy --target dev
Learning resources and examples
Databricks provides a solid and continually growing set of resources to help you get up to speed with Asset Bundles.
The current list of supported bundle resources is available in the official documentation. New resource types are added regularly, so it’s worth checking back as Asset Bundles continue to evolve.
In addition, Databricks maintains a set of sample Asset Bundles that demonstrate common patterns and best practices across different use cases. These examples are a great reference when you’re trying to understand how a particular resource is defined in YAML, or how multiple resources fit together in a real-world bundle.
Best practice tips
- Use one workspace per environment (e.g. dev, test, prod)
- Split workloads across multiple Asset Bundle repos as your platform grows (e.g. ingestion, control logic, data modelling)
- Deploy to dev locally; deploy to higher environments via CI/CD (see the sketch after this list)
- When introducing new resource types, prototype them in the UI first and export to YAML
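A hedged sketch of that CI/CD point - again assuming GitHub Actions; the trigger, target, and secret names are placeholders:

# .github/workflows/deploy-prod.yml - illustrative sketch
name: Deploy bundle to prod
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks bundle deploy --target prod
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST_PROD }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN_PROD }}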
Takeaways
Databricks Asset Bundles are a powerful addition to the Databricks ecosystem. They simplify development workflows, encourage best practices, and make collaboration significantly easier. Combined with source control and Terraform for underlying infrastructure, they complete the Databricks resource management story.
Here at Mechanical Rock, we love working with Databricks. If you have Databricks workflows or challenges you’d like help with, please don’t hesitate to get in touch!
