The Data Vault Guru
- Hannah Dowse
- Mar 10, 2021
- 2 min read
Australia is renowned as a surfer’s paradise, and Patrick Cuba’s recently published book, ‘The Data Vault Guru: A Pragmatic Guide on Building a Data Vault’, is certainly attracting a wave of interest.
There are not many guides to building Data Vault solutions out there, so it is always interesting when a new one comes along. As in all walks of business, and life generally, there is always room for perspectives based on different experiences. Patrick’s book brings readers up to date, covering many of the latest developments in Data Vault.
Patrick, who moved to Sydney, Australia, after studying in South Africa, has worked for a number of large financial institutions and carried out projects in the telecommunications industry.
His practical experience with Data Vault spans everything from preventing fraud to aiding debt collection. Patrick explained: “I have always been data-focused in my solutions where I work to understand the business reason for automation – and reducing the number of ‘hops’ from business event to business analytic value.”
He also cut his teeth on developing “bleeding edge solutions” before settling on the tried and tested methodology of Data Vault 2.0. In one large project he designed and developed an automated platform based on SAS.
In 2019, he led the design of an automated Data Vault on Big Data. The work required innovating with Amazon WorkSpaces and Apache Spark.
Then came the opportunity to put those experiences into a book.
Patrick revealed: “What I saw was that I had the relevant experience even though it was in architecting SAS solutions that were transferable to modern data analytic platforms – even though I didn’t have to code in it.
“Through my work I always look to simplify the solution. The motto: ‘If you can’t explain what you do in a few sentences you don’t know the subject well enough’ rings true in my work.
“So I took what I had been doing all along – building data-driven solutions – and applied that to how I would build a Data Vault, whether it is on Big Data or not.
“The book takes that approach – lays down some key, pun intended, understanding about data warehousing around business keys and the different perspectives of time, before learning how to build a data vault based on those understandings.
“After all, you shouldn’t be building an analytic platform without understanding the business first.”
The book then looks at:
- Raw and business vault – standards and modelling through business scenarios, mastering keys and reference data
- How to automate each pattern through landing, staging, loading and an automated test framework under orchestration
- A data-driven framework for timeline correction, for the dreaded batch load that arrives out of sequence
- The oft-overlooked area of Data Vault – how to query through automation patterns and query-assistance tables
Finally, the book delves into additional vault structures around your business data, as well as modelling approaches and checklists for rating data vault models and building your own automation framework.
