We recently helped a client escape a painful Redshift anti-pattern: AWS EventBridge rules fired Lambda functions that invoked stored procedures to fully refresh all silver table data. No incrementality. No intelligence. Just brute-force computation on a schedule, regardless of whether source data had changed.
When we migrated them to Databricks, we replaced this entire orchestration mess with a single feature: materialized views.
The old way: orchestration overhead
The Redshift architecture looked something like this:
```
EventBridge Rule (cron)
  → Lambda Function
    → Redshift Stored Procedure
      → Full refresh of silver tables
        → Hours of unnecessary computation
```
Every trigger meant:
- Re-reading all source data
- Re-computing all transformations
- Overwriting all target data
- Paying for compute whether data changed or not
Add multiple tables to this pattern and you've got a scheduling nightmare. Which Lambda runs first? What if one fails? How do you handle dependencies? You end up building a state machine or Step Function just to orchestrate basic data refreshes.
The new way: let Databricks do the work
With Databricks materialized views, the entire orchestration layer disappears. Here's a real dbt example from the migration:
```sql
{{
  config(
    materialized='materialized_view',
    schedule=none if target.name == 'dev' else {
      'cron': '0 0 */2 ? * *'
    },
    alias='cycle_summaries'
  )
}}

SELECT
  cycle_id,
  equipment_id,
  start_time,
  end_time,
  DATEDIFF(minute, start_time, end_time) AS duration_minutes,
  total_volume,
  avg_pressure
FROM {{ ref('bronze_cycles') }}
WHERE status = 'COMPLETED'
```
That's it. No Lambda. No EventBridge. No stored procedure. The materialized view:
- Automatically detects when source data changes (feature in progress)
- Incrementally refreshes only affected rows
- Runs on the schedule you define in the view itself
- Manages dependencies through dbt's ref() function
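Dependency management really does come for free: a downstream model just `ref()`s the materialized view and dbt resolves the build order. A hedged sketch (the gold model name and daily schedule here are hypothetical, and `ref()` targets the model's file name, assumed to be `cycle_summaries.sql`, not its alias):

```sql
-- Hypothetical downstream model: gold_equipment_daily.sql
{{
  config(
    materialized='materialized_view',
    schedule=none if target.name == 'dev' else {'cron': '0 0 3 * * ?'}  -- daily at 3am
  )
}}

SELECT
  equipment_id,
  DATE(start_time) AS cycle_date,
  COUNT(*) AS completed_cycles,
  SUM(duration_minutes) AS total_runtime_minutes
FROM {{ ref('cycle_summaries') }}
GROUP BY equipment_id, DATE(start_time)
```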
Key benefits we observed
1. Incremental by default. Databricks materialized views track which source rows have changed and only recompute affected downstream rows. In our client's case, this reduced refresh times from hours to minutes.
2. Schedule in the dbt file. The schedule lives with the dbt view definition, versioned in git:
```sql
schedule = {
  'cron': '0 0 */2 ? * *'  # Every 2 hours
}
```
Development environments can disable automatic refreshes entirely:
```sql
schedule = none if target.name == 'dev' else { ... }
```
This pattern is gold for multi-environment deployments.
3. No external orchestration. Databricks handles the scheduling, dependency resolution, and refresh logic. You write SQL. The platform does the rest.
4. Built-in observability. Every materialized view exposes metadata about refresh history, duration, and row counts through system tables. No need to instrument Lambda functions or parse CloudWatch logs.
5. Cost efficiency. You only pay for compute when data actually changes. Contrast this with EventBridge triggers that fire regardless of whether there's new data to process.
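On the observability point, Databricks exposes a materialized view's refresh history through the `event_log()` table-valued function. A minimal sketch (the catalog and schema names are hypothetical, and the exact columns available can vary by runtime):

```sql
-- Inspect recent refresh events for a materialized view
SELECT
  timestamp,
  event_type,
  message
FROM event_log(TABLE(main.silver.cycle_summaries))
ORDER BY timestamp DESC
LIMIT 20
```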
When materialized views aren't the answer
Materialized views aren't a silver bullet:
- Complex aggregations: Very heavy transformations may still be better as scheduled jobs
- Streaming requirements: If you need sub-minute latency, consider Delta Live Tables streaming
- Cross-catalog queries: Materialized views must reference tables in the same metastore
- Schema evolution: Adding columns requires view recreation
For these cases, you might still reach for Databricks Workflows or DLT pipelines. But for the 80% of teams whose use case is "refresh this table when source data changes," materialized views are unbeatable.
Migration strategy
Here's how we approached the Redshift → Databricks migration:
1. Inventory existing stored procedures: understand what each one actually does
2. Identify full vs incremental: which refreshes can be incremental?
3. Convert to dbt models: start with standard views or tables
4. Test refresh logic: validate incremental behaviour in dev
5. Switch to materialized views: enable scheduling once validated
6. Monitor and tune: use Databricks system tables to track performance
7. Decommission orchestration: remove Lambda functions and EventBridge rules
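The move from step 3 to step 5 can be as small as changing one config value. A hedged sketch (hypothetical model, reusing the bronze source from earlier): the logic ships first as a plain dbt table so results can be validated against the old stored procedure, and only the config block changes when flipping to a materialized view.

```sql
-- Step 3 sketch: ship the transformation as a plain table first.
-- Switching to materialized='materialized_view' later is a one-line change.
{{ config(materialized='table') }}

SELECT
  cycle_id,
  equipment_id,
  start_time,
  end_time,
  DATEDIFF(minute, start_time, end_time) AS duration_minutes
FROM {{ ref('bronze_cycles') }}
WHERE status = 'COMPLETED'
```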
The client went from Lambda functions and EventBridge rules to zero external orchestration. Their data freshness actually improved because Databricks refreshes are faster and more frequent.
Takeaways
Materialized views in Databricks represent a fundamental shift in how we think about data transformation scheduling. Instead of building orchestration around data pipelines, we let the data platform handle it natively.
For teams migrating from Redshift or other legacy warehouses with complex scheduling logic, this is a game-changer. Less code. Less infrastructure. Faster refreshes. Lower costs.
If you're drowning in EventBridge rules, Lambda functions, and full-table refreshes, it might be time to consider Databricks materialized views.
At Mechanical Rock, we specialise in modernising data platforms and migration strategies. If you're considering a move to Databricks or want to optimise your existing workloads, let's talk!
Getting started
If you're using dbt with Databricks, materialized views are trivial to adopt:
```sql
{{
  config(
    materialized='materialized_view',
    schedule={'cron': '0 0 * * * ?'}  # Hourly (Quartz format)
  )
}}

SELECT ... FROM {{ ref('source_table') }}
```
One thing to keep in mind is that Databricks uses Quartz cron syntax for its scheduling. It's also worth keeping an eye on future Databricks updates: the TRIGGER ON UPDATE schedule is currently in beta, and once productionised it could further simplify materialized view scheduling.
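For reference, a Quartz expression has six (optionally seven) fields rather than standard cron's five, starting with seconds, and either the day-of-month or day-of-week field must be `?`:

```
┌──────────── second (0-59)
│ ┌────────── minute (0-59)
│ │ ┌──────── hour (0-23)
│ │ │ ┌────── day of month (1-31, or ?)
│ │ │ │ ┌──── month (1-12 or JAN-DEC)
│ │ │ │ │ ┌── day of week (1-7 or SUN-SAT, or ?)
0 0 */2 ? * *    ← every 2 hours, as used earlier in the article
```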