Understanding Non-Historized Links in Data Vault

Based on episode 10 of The Business Thinking Podcast with our CEO, Neil Strange, and AutomateDV Product Manager, Alex Higgs.


Non-Historized Links, formerly known as transactional links, are a variant of the Link in the Data Vault 2.0 architecture. This article covers what they are, where they apply, and best practices for implementing them, drawing on our recent podcast discussion.


What Are Non-Historized Links?

As Neil Strange explains, "So really what a non-historized link is, is a link with additional grain." In other words, a Non-Historized Link captures relationships at a finer level of detail than a standard Link, which typically represents the relationship between the Hubs that form a unit of work.


Unlike standard Links, which connect Hubs based on their business keys, Non-Historized Links require an additional column in their primary key to ensure uniqueness. This additional column is often a transaction number or timestamp, reflecting the finer grain of the data.
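To make the uniqueness point concrete, here is a minimal sketch in Python. It assumes hashed keys built from delimited, upper-cased business keys, which is a common Data Vault 2.0 convention but is not something the podcast prescribes. The `hash_key` helper, the `||` delimiter, and the column names are all hypothetical.

```python
import hashlib


def hash_key(*parts: str) -> str:
    """Hash normalized key parts joined by a delimiter (an assumed convention)."""
    normalized = "||".join(part.strip().upper() for part in parts)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Two postings that reference the same customer and ledger account.
postings = [
    {"customer_id": "C1", "account_id": "4000", "transaction_no": "T1001"},
    {"customer_id": "C1", "account_id": "4000", "transaction_no": "T1002"},
]

# Standard Link key: Hub business keys only, so both rows collapse into one key.
standard_keys = {hash_key(p["customer_id"], p["account_id"]) for p in postings}

# Non-Historized Link key: the transaction number adds the missing grain.
nh_keys = {
    hash_key(p["customer_id"], p["account_id"], p["transaction_no"])
    for p in postings
}

print(len(standard_keys))  # 1
print(len(nh_keys))        # 2
```

With only the Hub keys, the second posting would look like a duplicate of the first. Adding the transaction number keeps both rows distinct.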


A Practical Example: Financial Transactions

Consider a financial system where each transaction results in multiple postings against ledger accounts. Every posting sits at the same grain, so multiple entries can exist for the same combination of Hubs. Alex Higgs illustrates this with a finance system: a transaction happens, and inside that transaction a number of postings are made against your ledgers.


In this scenario, a Non-Historized Link is used to capture each posting, with the transaction number or timestamp serving as the additional primary key column. This allows for the accurate representation of all postings related to a single transaction.
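The scenario can be sketched as a table using Python's built-in `sqlite3` module. The table name, column names, and sample values below are illustrative assumptions. Plain identifiers stand in for Hub keys to keep the example readable, and payload columns are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE nh_link_posting (
        customer_hk    TEXT NOT NULL,  -- key of the customer Hub
        account_hk     TEXT NOT NULL,  -- key of the ledger account Hub
        transaction_no TEXT NOT NULL,  -- additional grain column
        load_datetime  TEXT NOT NULL,
        record_source  TEXT NOT NULL,
        PRIMARY KEY (customer_hk, account_hk, transaction_no)
    )
    """
)

# Two transactions, each posting against the same two ledger accounts.
postings = [
    ("C1", "4000", "T1001", "2024-01-01T09:00:00", "finance"),
    ("C1", "1200", "T1001", "2024-01-01T09:00:00", "finance"),
    ("C1", "4000", "T1002", "2024-01-02T09:00:00", "finance"),
    ("C1", "1200", "T1002", "2024-01-02T09:00:00", "finance"),
]
conn.executemany("INSERT INTO nh_link_posting VALUES (?, ?, ?, ?, ?)", postings)

# All postings that belong to a single transaction.
rows = conn.execute(
    "SELECT account_hk FROM nh_link_posting "
    "WHERE transaction_no = ? ORDER BY account_hk",
    ("T1001",),
).fetchall()
print(rows)  # [('1200',), ('4000',)]
```

Without `transaction_no` in the primary key, the third and fourth rows would violate the key, because they repeat the Hub combinations of the first two.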


Key Characteristics and Applications

  • Grain Shift: Non-Historized Links address scenarios where the grain of the data extends beyond the Hubs involved.

  • Transaction Logs: They are commonly used for transaction logs, journals, and other similar data sources.

  • IoT Data: They can also capture time-series data from Internet of Things (IoT) devices, where measurements are recorded at specific intervals.

  • Payload Integration: When the data within a transaction is immutable, the payload (attributes) can be directly integrated into the Non-Historized Link, eliminating the need for a separate Satellite.
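As a rough sketch of the IoT and payload points above, the following Python dataclass models one row of a Non-Historized Link that carries its payload directly. The class, its fields, and the sample readings are hypothetical. `frozen=True` is used here only to mirror the assumption that a reading never changes once recorded.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone


@dataclass(frozen=True)
class SensorReadingLink:
    device_hk: str           # key of the device Hub
    location_hk: str         # key of the location Hub
    reading_time: datetime   # additional grain column
    temperature_c: float     # payload stored on the Link itself
    load_datetime: datetime
    record_source: str

    def primary_key(self):
        """Hub keys plus the reading timestamp identify one row."""
        return (self.device_hk, self.location_hk, self.reading_time)


loaded_at = datetime(2024, 1, 1, 0, 10, tzinfo=timezone.utc)
readings = [
    SensorReadingLink("D1", "SITE-A", datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc),
                      21.5, loaded_at, "iot_gateway"),
    SensorReadingLink("D1", "SITE-A", datetime(2024, 1, 1, 0, 5, tzinfo=timezone.utc),
                      21.7, loaded_at, "iot_gateway"),
]

# Same device and location, but the timestamps keep the rows distinct.
keys = {reading.primary_key() for reading in readings}
print(len(keys))  # 2

# An immutable reading cannot be changed after it is recorded.
try:
    readings[0].temperature_c = 99.0
except FrozenInstanceError:
    print("readings are immutable")
```

If readings could later be corrected at source, the immutability assumption would no longer hold and a separate Satellite would be the safer home for the payload.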


Best Practices for Using Non-Historized Links

  • Identify Additional Grain: Carefully identify the additional column required to ensure uniqueness in the primary key.

  • Handle Data Immutability: If the data is guaranteed to be immutable, consider integrating the payload into the Link for performance optimization.

  • Avoid Double Feeds: Left outer join incoming transactions against the target Link and insert only the rows that have no match, so the same transaction is never loaded twice.

  • Consider Reference Data: Be mindful that when some keys are treated as reference data rather than Hubs, fewer Hubs take part in the Link, and the Non-Historized Link can end up resembling a Satellite.
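The double-feed guard from the list above can be sketched with `sqlite3` as follows. The staging and target table names, the columns, and the `load` helper are assumptions for illustration, not the loading pattern of any particular tool. The sketch also assumes the staged batch itself contains no duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE nh_link_posting (
        account_hk TEXT NOT NULL, transaction_no TEXT NOT NULL, amount REAL NOT NULL
    );
    CREATE TABLE stage_posting (
        account_hk TEXT NOT NULL, transaction_no TEXT NOT NULL, amount REAL NOT NULL
    );
    """
)

# Insert only staged rows that have no match in the target (an anti-join).
LOAD_SQL = """
    INSERT INTO nh_link_posting (account_hk, transaction_no, amount)
    SELECT s.account_hk, s.transaction_no, s.amount
    FROM stage_posting AS s
    LEFT OUTER JOIN nh_link_posting AS t
        ON  t.account_hk = s.account_hk
        AND t.transaction_no = s.transaction_no
    WHERE t.transaction_no IS NULL
"""


def load(batch):
    """Stage a batch, run the guarded insert, and return the target row count."""
    conn.execute("DELETE FROM stage_posting")
    conn.executemany("INSERT INTO stage_posting VALUES (?, ?, ?)", batch)
    conn.execute(LOAD_SQL)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM nh_link_posting").fetchone()[0]


batch = [("4000", "T1001", 100.0), ("1200", "T1001", -100.0)]
first_count = load(batch)
second_count = load(batch)  # the same feed arrives a second time
print(first_count, second_count)  # 2 2
```

The second load adds nothing, because every staged row already has a match in the target.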


Insertion Patterns and Performance

Inserting data into Non-Historized Links is typically straightforward, involving the addition of new transactions to the end of the table. However, it's crucial to implement measures to prevent duplicate loading, especially when dealing with large transaction tables.


Final Thoughts

Non-Historized Links are a valuable tool for modeling data with a finer grain than standard Links. By understanding their characteristics and applying best practices, data engineers can effectively capture and represent complex transactional and time-series data within their Data Vault implementations.

For more insights and discussions on Data Vault, check out our Data Vault User Group website, where you can access past meetups, Q&A forums, and additional learning resources.
