Date
Aug 2025
The Challenge
The University of Western Australia (UWA) has embarked on a bold data platform modernisation strategy in readiness for the broad-scale adoption of Artificial Intelligence (AI) and Machine Learning (ML). This required the university to re-centre its data analytics capability on Databricks, elevating Databricks from one tool among many to its primary data platform, and enabling it to take advantage of the latest functionality the platform offers.
The Solution
Mechanical Rock, in close collaboration with UWA IT’s Business Intelligence & Analytics (BI&A) team, designed and implemented a complete vertical slice of a modern Lakehouse Architecture on Databricks, centred on a specific use case, to provide a reference implementation that can be scaled across other data domains and future use cases.
Migration of the selected use case to Lakehouse architecture
To deliver the selected use case, the team built ingestion and transformation workflows using Databricks while taking advantage of modern tooling to simplify the data engineering workflow:
- Modernisation of data ingestion flows using Delta Live Tables
- Adoption of data build tool (dbt) to standardise and simplify enterprise data warehousing, in line with leading industry practice
- Application of infrastructure-as-code (Terraform) and Databricks Asset Bundles in CI/CD pipelines to automate and speed up deployments
- Orchestration of complex workflows with Databricks Lakeflow Jobs
- Reprocessing of historic data to faithfully recreate type 2 slowly changing dimensions (SCDs)
- Introduction of comprehensive data and schema validation testing
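The schema validation testing in the last point can be illustrated with a minimal sketch. The function below checks a batch of records against an expected column-to-type mapping and reports violations; the column names and layout are hypothetical examples, and the project's actual tests ran inside the Databricks and dbt toolchain rather than in plain Python.

```python
def validate_schema(rows, expected):
    """Check each row (a dict) against an expected column->type mapping.

    Returns a list of human-readable violations; an empty list means
    the batch passes. Column names here are illustrative only.
    """
    errors = []
    for i, row in enumerate(rows):
        missing = set(expected) - set(row)
        extra = set(row) - set(expected)
        if missing:
            errors.append(f"row {i}: missing columns {sorted(missing)}")
        if extra:
            errors.append(f"row {i}: unexpected columns {sorted(extra)}")
        for col, typ in expected.items():
            if col in row and not isinstance(row[col], typ):
                errors.append(f"row {i}: {col} is not {typ.__name__}")
    return errors
```

Running checks like these in CI, before data reaches the transformation layer, catches upstream schema drift early instead of letting it surface as broken dashboards.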
Eased migration with re-usable Historic Backfill Job
The backfill job was built with native Databricks tooling (Workflows) and is deliberately simple: data engineers can test and repeat it in any of the Databricks environments. In the selected use case it was used to construct change dimensions that record how entities have changed over time, which is essential to Kimball dimensional modelling.
Power BI Dashboards and Semantic Models served directly from Lakehouse Architecture
This project pioneered the use of the Lakehouse platform to serve certified gold layer data sets directly to Power BI:
- Use of dbt to annotate data models with relationship constraints to power the semantic model
- Metadata from dbt is persisted through to the semantic model
- Enablement of serverless SQL warehouses to publish gold layer data sets on demand
- Configuration of connection and credentials to support automated scheduled refresh
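As an illustration of the first two points, a dbt schema file can annotate models with relationship constraints that downstream tooling carries into the semantic model. The model and column names below are hypothetical; the `relationships` test itself is standard dbt.

```yaml
# models/gold/schema.yml -- illustrative model and column names
version: 2
models:
  - name: dim_student
    description: "Certified gold-layer student dimension"
    columns:
      - name: student_key
        tests:
          - unique
          - not_null
  - name: fct_enrolment
    columns:
      - name: student_key
        description: "FK to dim_student; informs the semantic model relationship"
        tests:
          - relationships:
              to: ref('dim_student')
              field: student_key
```

Descriptions and constraints declared this way live with the transformation code, so the same metadata that documents the warehouse also shapes the Power BI semantic model.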
Enabled playpen environment for business users
Providing designated business users with a single source of truth, via Unity Catalog, enables an AI-ready experience for citizen data scientists. Security and administration of the playpen are self-managed by the UWA BI&A team.
The Benefits
The implementation of the modern Lakehouse Architecture for the selected use case, and the accompanying capability uplift, provides the critical launch pad for UWA to realise its data-enabled innovation ambitions:
- Future-proofing for Artificial Intelligence driven innovation: Through a single, scalable data platform, UWA has a rich, trusted data set on which it can confidently enable advanced AI and ML use cases. The playpen environments also give citizen data scientists easy access to this valuable data while maintaining the integrity of the university’s data-driven operations.
- Extension of the solution beyond the selected use case: Mechanical Rock established a reusable architecture that can be extended beyond the selected use case. During the engagement, members of the UWA BI&A team had already started to extend the architecture to additional domains. The single Lakehouse repository offers a clear layout to support this and future extensions.
- Overall process simplification: The new architecture simplifies many processes. Schema management is now handled and evolved by Databricks DLT at the raw layer, and transformation code is simpler, focusing on meaningful business rules (semantics) while dbt takes care of merges and updates. Databricks workspace consistency was also improved through infrastructure-as-code, and all platform documentation now sits with the code and is promoted to Unity Catalog. The resulting extract-transform-load (ETL) cycle is faster than the previous implementation: the CI validation step alone, for example, is 94% faster.
- Increased visibility of underlying platform costs: The pre-existing Azure cost management breakdown was limited to the workspace level, offering little insight into individual workload and interactive usage costs. Introducing cost dashboards at the account and workspace levels gave fine-grained introspection that surfaces better cost optimisation opportunities.
- Upskilling and capability build: Through the engagement the team deepened its understanding of key data engineering technologies that are becoming standard industry wide. This included upskilling in dbt to effect data transformations in Databricks, an introduction to structured data architecture, and the adoption of development best practices: git integration of all core development activities, conversion of manually managed CI/CD pipelines into YAML pipelines as code, and management of resources as code using Terraform and Databricks Asset Bundles.
Mechanical Rock's engagement with UWA led to the successful implementation of a robust, scalable and secure reference Lakehouse Architecture for its data platform. The UWA IT Business Intelligence & Analytics team now has what it needs to continue extending the Lakehouse Architecture to the other domains within its scope. This work will further strengthen and deepen the team's data engineering capabilities, and creates the opportunity to extract even more value from the data platform investment as UWA adopts Artificial Intelligence and Machine Learning capabilities.