Fortescue: Operational Data Lake

Bringing modern data engineering practices to incrementally transform data platform stability, maintainability and security.

The Challenge

As a top-tier resources company, our client uses data as a key enabler of operational and business performance. Having previously undertaken a strategic data platform review and supported the organisation’s subsequent implementation activities, Mechanical Rock has most recently partnered with the Operational Data Lake (ODL) team to continuously improve the performance, reliability, and security of its ingestion pipeline.

The Solution

Faced with significant challenges in platform stability, documentation that did not reflect actual system behaviour, and declining user trust, the team took a phased, iterative approach to methodically address platform shortfalls and drive incremental improvement.

Codebase analysis and simplification

Through structured code analysis, the team gained a much deeper understanding of the ODL system's architecture and behaviour. This in turn drove the addition of targeted unit tests to validate critical logic, the removal of obsolete code, and significant refactoring to reduce technical debt and simplify future maintenance. These improvements established a solid foundation for ongoing development.
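As a rough illustration of what a targeted unit test of this kind looks like (the names below are hypothetical, not the actual ODL code), a test pins down the behaviour of one piece of critical logic so future refactoring can proceed safely:

```python
# Hypothetical example of a targeted unit test around critical logic.
# deduplicate_events is an illustrative stand-in, not a real ODL function.

def deduplicate_events(events):
    """Return events with duplicate IDs removed, keeping the first occurrence."""
    seen = set()
    unique = []
    for event in events:
        if event["id"] not in seen:
            seen.add(event["id"])
            unique.append(event)
    return unique

def test_deduplicate_events_removes_repeats():
    events = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 1, "v": "c"}]
    result = deduplicate_events(events)
    assert [e["id"] for e in result] == [1, 2]
    assert result[0]["v"] == "a"  # first occurrence wins

test_deduplicate_events_removes_repeats()
```

Tests like this document intended behaviour directly in code, which is what allows obsolete code to be removed with confidence.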

Platform stability

By splitting Event Manager into two functions and refactoring orchestration logic, the team eliminated a class of errors caused by overly complex simultaneous processing, significantly improving the underlying stability of the platform. They also introduced safeguards to prevent concurrent executions and fixed systemic issues that were causing data duplication.

Account migration and deployment improvement

The team rebuilt and migrated ODL to new AWS accounts using a trunk-based development model. All environments now share a single branch, separated by directories to support long-lived environments as required by ODL. This improved clarity, consistency, and traceability. The legacy deployment stack was replaced with a modern deployment pipeline, using continuous integration and deployment (CI/CD) and infrastructure as code (IaC).
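A hypothetical repository layout for this trunk-based, directory-per-environment model (paths are illustrative, not the client's actual structure):

```
infrastructure/
├── modules/          # shared IaC components used by every environment
└── environments/
    ├── dev/          # long-lived environments, each configured in its
    ├── test/         # own directory but deployed from the same branch
    └── prod/
```

Because every environment is described on the one branch, a change is visible across all environments in a single commit, which is where the improved clarity and traceability come from.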

Cost optimisation and performance improvements

Significant cost savings (approx. 94%) were achieved by re-architecting the data ingestion pipeline. The team introduced incremental data processing and replaced the legacy data integration tooling with native Snowflake functionality.
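The core idea behind incremental processing can be sketched generically (names are illustrative; in the ODL this is handled with native Snowflake functionality rather than application code): each run processes only rows newer than a high-watermark recorded by the last successful load, instead of re-reading the full source.

```python
# Generic high-watermark sketch of incremental data processing.
# Illustrative only -- not the client's implementation.

def load_incremental(source_rows, watermark):
    """Return rows newer than the watermark, plus the advanced watermark."""
    new_rows = [r for r in source_rows if r["updated_at"] > watermark]
    new_watermark = max((r["updated_at"] for r in new_rows), default=watermark)
    return new_rows, new_watermark

rows = [
    {"id": 1, "updated_at": 100},
    {"id": 2, "updated_at": 200},
    {"id": 3, "updated_at": 300},
]
batch, wm = load_incremental(rows, watermark=150)  # only ids 2 and 3 load
```

Scanning and transforming only the delta, rather than the whole dataset on every run, is what drives compute costs down so sharply.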

Security audit and improved identity management

Following a thorough audit of the ODL AWS platform, the team implemented OIDC-based authentication from GitHub Actions to AWS, removing the need for long-lived credentials and enabling faster, more secure deployments. Environment-level approvals were also introduced to protect production workflows.
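In a GitHub Actions workflow, the OIDC pattern looks roughly like the following (the role ARN, region, and environment name are placeholders): the job requests a short-lived OIDC token and exchanges it for temporary AWS credentials, with no stored secrets.

```yaml
# Sketch of OIDC-based authentication from GitHub Actions to AWS.
# Role ARN, region, and environment name below are placeholders.
permissions:
  id-token: write   # allow the job to request an OIDC token
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production   # environment-level approval gate
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/example-deploy-role
          aws-region: ap-southeast-2
      # Subsequent steps deploy using the short-lived credentials.
```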

Snowflake object management

The team implemented schemachange, an open-source tool that manages Snowflake objects using version-controlled SQL scripts stored in Git. This brought Snowflake management in line with modern infrastructure-as-code and DevOps practices, boosting resilience, maintainability, and operational confidence.
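As a rough illustration of the schemachange model (file names here are hypothetical), change scripts live in Git and follow a naming convention that schemachange uses to decide what to run:

```
snowflake/
├── V1.1.0__create_raw_schema.sql      # versioned: applied once, in order
├── V1.2.0__add_ingest_tables.sql
└── R__refresh_reporting_views.sql     # repeatable: re-applied when changed
```

Applied scripts are recorded in a change-history table in Snowflake, so the CI/CD pipeline can run schemachange on every deployment and only outstanding changes are executed.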

Knowledge sharing

Upskilling was a key focus to ensure the team can confidently maintain a complex codebase. Regular pair programming proved highly effective, with team members rotating between "driver" and "observer" roles to reinforce shared understanding. Mechanical Rock team members were embedded directly in the team and actively supported business-as-usual (BAU) operations, including configuring new ingestions, adding data sources, and implementing new workflows. This gave the client-side team members exposure to real scenarios while building technical confidence in a safe, collaborative setting.

Beyond stabilisation, Mechanical Rock also supported the re-architecture of a key operational tracking model, and delivered a scalable event-driven ingestion pipeline for strategic tag data, built on internal APIs.

The Benefits

Through a phased, iterative approach involving stakeholder alignment, technical re-architecture, and cultural uplift, the team delivered improvements across four key pillars:

Pipeline stability: The team streamlined data pipelines, refactored orchestration logic, and improved error handling and monitoring. Automated tests and safeguards now keep processing smooth and reliable, boosting platform performance and user confidence.

Maintainability: Pipeline architecture was simplified and standardised. Clear documentation and processes make it easier for teams to share knowledge, and a plan is in place to keep the platform efficient and scalable as it grows.

Data lead time: Optimised workflows, stronger validation, and real-time ingestion mean customer teams get accurate insights faster, so they can now make decisions with minimal delay.

Platform security: The client has strengthened identity management using GitHub OIDC integration and continuously improves security controls. Sensitive data is protected while teams collaborate efficiently, making the platform a trusted and reliable foundation.

At the conclusion of the engagement, the Operational Data Lake had been transformed into a robust, self-sufficient data platform that the BAU operational team can easily maintain, supported by modern software engineering practices such as continuous integration and deployment (CI/CD), infrastructure as code (IaC), automated testing, and resilient error handling.