How conceptual Data Modelling can solve your business problems

At the Data Vault User Group’s latest meet-up, Juha Korpela – Chief Product Officer at Ellie Technologies – gave a talk titled “Capture your business needs with conceptual data modelling”. In this blog, we look at how Conceptual Data Modelling can help solve your business problems.


DATA VAULT IS FOCUSED ON THE BUSINESS

The core problem you may face in your business is the need to integrate data in a way that provides value for the enterprise. Data Vault has a solution! The Data Vault method’s business keys keep data consistent across source systems and help you identify the REAL business entities.

Your Data Vault MUST be business-centric rather than source-centric. This is because source systems have their own internal data models, which may or may not align with the actual business. Integrating on business keys ensures that you can create an enterprise view of the data. But what is “the business”?
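
To make the idea of integrating on business keys concrete, here is a minimal sketch in Python. It is our own illustration rather than anything shown in the talk: the two source extracts, the column names (such as customer_no) and the normalisation rule are all assumptions.

```python
from collections import defaultdict

def normalise_key(raw: str) -> str:
    """Trim and upper-case a business key so every source uses one form."""
    return raw.strip().upper()

# Hypothetical extracts: each system has its own internal ID (crm_id, acct_pk),
# but both carry the customer number that the business actually uses.
crm_rows = [{"crm_id": 17, "customer_no": " c-1001 ", "name": "Acme Ltd"}]
billing_rows = [{"acct_pk": 902, "customer_no": "C-1001", "balance": 250.0}]

def integrate(*sources):
    """Group rows from every source under the shared business key."""
    by_key = defaultdict(list)
    for source_name, rows in sources:
        for row in rows:
            by_key[normalise_key(row["customer_no"])].append((source_name, row))
    return dict(by_key)

enterprise_view = integrate(("crm", crm_rows), ("billing", billing_rows))
```

Both rows end up under the single key "C-1001", even though neither system's internal ID appears in the other. Joining on crm_id or acct_pk would never have matched them.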


WHAT IS “THE BUSINESS”? TAXONOMIES AND ONTOLOGIES

Business entities form hierarchical classifications, known as taxonomies. A taxonomy explains the types of things within the hierarchy, whereas an ontology explains how the things within the taxonomy are related to one another.

In relation to Data Vault, business entities are selected from both the taxonomy and ontology of the business to form a data model, not from source system database diagrams.
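
As a rough illustration of the difference, a taxonomy can be sketched as a child-to-parent mapping and an ontology as a list of subject, verb, object statements. The entity names below are our own examples, and letting a child entity inherit its parents' relationships is a simplifying assumption, not a rule from the presentation.

```python
# Taxonomy: child -> parent ("is a kind of"). All names are illustrative.
TAXONOMY = {
    "Employee": "Person",
    "Contractor": "Person",
    "Person": "Resource",
    "Vehicle": "Resource",
}

# Ontology: (subject, verb, object) relationships between entities.
ONTOLOGY = [
    ("Person", "works on", "Project"),
    ("Employee", "drives", "Vehicle"),
]

def ancestors(entity: str) -> list[str]:
    """Walk up the taxonomy from an entity to its root."""
    chain = []
    while entity in TAXONOMY:
        entity = TAXONOMY[entity]
        chain.append(entity)
    return chain

def relationships_of(entity: str) -> list[tuple[str, str, str]]:
    """Relationships stated for the entity itself or for any of its ancestors."""
    family = {entity, *ancestors(entity)}
    return [triple for triple in ONTOLOGY if triple[0] in family]
```

Here ancestors("Employee") walks up to Person and then Resource, while relationships_of("Contractor") returns only the "works on" relationship it inherits from Person.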


In Juha’s presentation, he shared a diagram to give an example of what taxonomy and ontology look like. Here is our own iteration of what the taxonomy and ontology hierarchies look like.


The concepts of taxonomy and ontology are not so straightforward that they can be fully described in a few words, however. You can find out more about taxonomies and ontologies here.


WHY YOU MIGHT BE FAILING WITH DATA INTEGRATION

It is easy for a Data Vault initiative to fail from the very start. Here are four common reasons people fail, as highlighted by Juha:

  1. When there is existing taxonomy or ontology.

  2. When an IT project might be outside your scope of practice or understanding.

  3. Automation tools automatically generate hubs, links, and satellites from input data – without putting the business first.

  4. Source data sets are usually readily available with all the tables and primary keys – so most people go ahead and generate!

Juha stressed the importance of not relying on automation tools alone, as their output might not reflect your business accurately. Automation tools for building your Data Vault data warehouse are great, but you could face challenges if you rely solely on a tool without being business-centric. Leaning too heavily on these tools will give you plenty of output, but not enough valuable outcomes.


TYPES OF FAILURES IN A SOURCE SYSTEM DATA VAULT

As we have previously mentioned, a source system Data Vault rarely works effectively. But why?

  • Source hubs are on the wrong level of the taxonomy. When the level is too high, the data is too generalised. Conversely, when the level is too low, the data is too fractured to produce an effective Data Vault. Either way, business logic has to be applied repeatedly at the point of consumption.

  • Different systems record the same data differently. The same data can exist in several systems, but in different structures. For example, system A has people as part of its “resources” table, together with cars and buildings (as shown in the image). System B has one table for “employees” and one table for “contractors.” Thus, there is no integration between the two systems.

  • The source system structure is highly technical and even esoteric. Overcomplicated system structures only give businesses headaches.
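
The system A and system B example above can be sketched in a few lines of Python. This is a hedged illustration only: the rows, the column names and the "type" discriminator in system A are all assumptions made for the example. The point is that a business-level Person entity sits below system A's "resources" and above system B's two tables.

```python
# Hypothetical rows; column names and the "type" discriminator are assumptions.
system_a_resources = [
    {"type": "person", "ref": "E-001", "label": "Ada"},
    {"type": "car", "ref": "V-778", "label": "Van"},
    {"type": "building", "ref": "B-12", "label": "HQ"},
]
system_b_employees = [{"emp_no": "E-001", "full_name": "Ada"}]
system_b_contractors = [{"contract_ref": "C-204", "full_name": "Grace"}]

def people_from_a(rows):
    """System A is too general: pick the people out of the mixed table."""
    return [{"person_key": r["ref"], "source": "A.resources"}
            for r in rows if r["type"] == "person"]

def people_from_b(employees, contractors):
    """System B is too specific: roll both tables up to Person."""
    out = [{"person_key": r["emp_no"], "source": "B.employees"} for r in employees]
    out += [{"person_key": r["contract_ref"], "source": "B.contractors"} for r in contractors]
    return out

person_rows = people_from_a(system_a_resources) + people_from_b(
    system_b_employees, system_b_contractors
)
# One hub key per real person, however many systems mention them.
person_hub_keys = sorted({r["person_key"] for r in person_rows})
```

Ada appears in both systems but produces a single Person key, and the car and the building never reach the Person hub. Generating hubs straight from the source tables would instead have produced one "resources" hub and two separate people hubs.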


HOW TO DO IT PROPERLY

Juha broke down how to do it properly with the use of Conceptual Data Modelling into seven simple steps. They are:

  1. Identify business needs – what does the business need to know to succeed?

  2. Capture the taxonomy and ontology with Conceptual Data Modelling – work with the business experts!

  3. Pick the right level of entities and relationships you need from the CDM

  4. Identify the REAL business keys

  5. Design the core structure of the Data Vault logical model

  6. Map the already-designed model to source systems

  7. …And then automate!


DIFFERENT LEVELS OF DATA MODELS

This kind of approach requires you to recognise that there is a gap between the business and IT, and that it needs to be bridged in a way both parties can understand. Moving from the business side of the scale across to IT, you need to answer these questions:

  • Business glossary – find the common language / “What do we mean by X?”

  • Conceptual Data Model – What things do we need data about, and how are they related in real life?

  • Logical Data Model – What does that look like as a Data Vault?

  • Physical Data Model – How do you implement your Data Vault in this storage technology?

  • Technical architecture – How do we set up the actual technological components?


HOW TO DO CONCEPTUAL DATA MODELLING

“A Data Model is a description of a business in terms of the things it needs to know about” – Alec Sharp

That is the quote that Juha uses in every presentation about Data Modelling he does. We agree that it sums up what Data Modelling is all about.

The most important thing to remember when building your Conceptual Data Model is that you are modelling a business, not a system. You need to create a model that deals with the real-life things in your business – people, places, resources. In addition, you will need to understand how those things interact, and how they are related.

Conceptual Data Modelling requires you to work with your business experts to establish a common language. As previously mentioned, it is very much a joint effort between the IT team and the business team. Beyond this, keeping the model technology-agnostic – so that it stays the same regardless of the systems or technology used – will ensure that everyone in your business can interpret it.


MODELLING IN THE REAL WORLD

Throughout his presentation, Juha Korpela stressed the importance of viewing your Data Vault from a real-world perspective, rather than from that of a system or database. The entities of a Conceptual Data Model are the things that exist in your slice of reality. The relationships between the entities should be the “verbs” that describe the real business activities. A reminder – none of this is dependent on the technologies or databases used!

The only way to do this right is to collaborate with your business experts. Juha suggested simply asking them to describe how the reality works within the business, and drawing the model based on what they say.


FROM THE CDM TO THE DV MODEL

The entities you picked from the Conceptual Data Model give you your hubs. You then need to create links between them based on the relationships in the Conceptual Data Model – making sure you identify the real business keys for your entities.

From here, you can create your Data Vault logical model as a separate Logical Model, or directly utilise the Conceptual Data Model in a Data Vault automation tool.
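
The derivation described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the output of any particular automation tool: the Entity and Relationship classes, the hub_ and link_ naming convention and the Customer and Order example are all hypothetical, and satellites are left out for brevity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    name: str
    business_key: str  # the real business key agreed with the business experts

@dataclass(frozen=True)
class Relationship:
    subject: str
    verb: str
    target: str

def derive_logical_model(entities, relationships):
    """Turn each CDM entity into a hub and each relationship into a link."""
    known = {e.name for e in entities}
    hubs = {f"hub_{e.name.lower()}": e.business_key for e in entities}
    links = {}
    for r in relationships:
        if r.subject not in known or r.target not in known:
            raise ValueError(f"relationship references unknown entity: {r}")
        link_name = f"link_{r.subject.lower()}_{r.target.lower()}"
        links[link_name] = (f"hub_{r.subject.lower()}", f"hub_{r.target.lower()}")
    return hubs, links

cdm_entities = [Entity("Customer", "customer_number"), Entity("Order", "order_number")]
cdm_relationships = [Relationship("Customer", "places", "Order")]
hubs, links = derive_logical_model(cdm_entities, cdm_relationships)
```

Note that nothing in this sketch refers to a source system. The hubs and links come entirely from the business model, and the mapping to source tables is a later step.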


THE ROLE OF AUTOMATION AND MANAGING CHANGES

Automation and the right tooling are vital for the Data Vault design process. However, you can’t automatically create a conceptual model. That part is all about the business! Once the business model has been created and understood, and the Data Vault model has been derived from it, automation will take care of many of the following steps. For example, automation makes it easy to manage the effect of source system changes on your Data Vault. Changes in the business, on the other hand, require modification of the conceptual model, which an automation tool cannot do for you.

We agree with Juha when he said, “You can’t automate away the conceptual model or business concepts, but you can automate away many steps after that.”


CONCLUSIONS

We are grateful for Juha Korpela’s insightful presentation at the Data Vault User Group monthly meet-up. Here are the conclusions that we drew:

  • Data Vault as a methodology has always been defined as business-centric. To achieve this, you need to model the Data Vault after the business entities.

  • Source system Data Vaults are easy to generate, but often lead to failures.

  • Conceptual Data Modelling is an excellent method to capture what the business really looks like.

  • We should be able to design the core structure of a Data Vault model without looking at the source systems.

  • Automation is vital but modelling the business cannot be automated. It is a collaborative effort that ensures you’re doing the right thing.

  • Reference models can be useful, but you still must apply thought. Every project and organisation is unique, so you still need to go through the same process.

  • Every organisation is unique, but not often as unique as they think they are.

  • Conceptual Data Modelling can work well in an agile project.
