Monday, 20 January, 2025
The DIY Bitemporality Challenge
By Eric Newcomer, CTO and Principal Analyst, Intellyx (guest author)
Handling the element of time in SQL has been a challenge since it was created. The challenge was formally recognized in 1992, and over the following years multiple proposals were advanced to address temporality in the SQL standard.
The initial proposals were controversial, so it wasn’t until the publication of the 2011 version of the standard that temporal elements became an official part of the language.
It’s difficult to address the challenge of temporality because SQL is designed and engineered for use with current databases, that is, databases designed for a query to return only the current value of a data item following an INSERT or an UPDATE operation.
Historical values are not preserved in current databases, making it difficult to determine any value of a data item other than its current value (see also the earlier blog post in this series - Why it’s time to retire CRUD). Richard Snodgress, who worked for years to resolve the temporality problem, describes the challenge in great detail in his book.
In short, current databases have to reason about time based on the current values of data items, rather than being able to query historical values, as temporal databases can – especially a bitemporal database. Current databases retrieve the current values of time and date elements and compare them to derive a data item’s history, which requires a lot of work to create the right schema and to analyze query results.
Some databases have added support for the SQL:2011 temporal extensions to their current databases, but if you have already been using the database as a current database, it is often difficult if not impossible to retroactively add temporal support.
If temporal support is needed, it’s easier and cheaper to start with a bitemporal database such as JUXT’s XTDB. Otherwise, you are going to have to implement the time element in SQL yourself.
Considerations of Temporality
A temporal database, and especially a bitemporal database, implements SQL syntax to record the valid time and the transaction time for each insert or update operation.
Bitemporality works by recording the date at which a data item became valid (i.e., came into existence in "the real world") and by also recording the time at which this fact was recorded (i.e., the time of the transaction that updated the database with this value).
Valid time tracks the time period during which a fact is true in the real world, such as how long someone lived at a certain address, or owned a particular car.
The transaction time records the history of updates to a data item to support a query to return the value of the item at that time.
Finding and fixing an error in a current database that has a mismatch between "the real world" and the values in the database can be difficult and time consuming. Furthermore the business may not be able to operate until the error is fixed, for fear of compounding or continuing to make the error.
While a current database assumes that the current value is the only value of interest, a bitemporal database assumes that the dates at which the current value became current is of interest, allowing you to "go back in time" and see what values were current at previous points of time, and more easily discover when an error occurred and fix it.
Coding Temporality
It can be challenging to code such temporal logic yourself, not to mention error prone. Using a current database, you would have to store a separate record for each date value you wanted to preserve. You could not update existing records, because an update operation replaces the previous values with the new values.
Next you’d have to write complex queries to retrieve, sort, and aggregate the records according to the date, or date range you wanted to see.
The first blog post in the series covers the use case of compliance, retrieving the value of data at a specific point in time at which the application (or database) was non compliant.
Other use cases include determining the value of an asset over time, such as a stock, bond, or derivative and analyzing price fluctuations for trends and predictions.
A simple example is tracing the price of a product, such as an electric bicycle, to see which price point drives the most sales.
If you are tracking the price over time using a current SQL database, you have to design the schema to accommodate the additional required data items, and the approach for the data items to be added to each record considered time sensitive. Then you have to include those items on the INSERT, UPDATE, and QUERY operations for those records.
For example:
-- retrieve last week's version of a product
SELECT * FROM Product
WHERE id = 'An Electric Bicycle'
AND VT_Begin <= DATE 2025-01-13
AND DATE 2025-01-13 < VT_End
AND TT_Start <= DATE 2025-01-18
AND DATE 2025-01-18 < TT_Stop
-- retrieve the full product history as known on 2025-01-01
SELECT * , VT_Begin, VT_End
FROM Product
WHERE id = 'An Electric Bicycle'
AND TT_Start <= DATE 2025-01-01
AND DATE 2025-01-01 < TT_Stop
It’s much simpler to record and trace price updates using a bitemporal database:
-- built-in bitemporal columns are hidden by default
INSERT INTO product (_id, name, price) VALUES
(1, 'An Electric Bicycle', 400)
UPDATE product SET price = 405 WHERE _id = 1
UPDATE product SET price = 350 WHERE _id = 1
SELECT *, _valid_from
FROM product
FOR VALID_TIME ALL -- i.e. "show me all versions"
FOR SYSTEM_TIME AS OF DATE '2025-01-31' -- "...as observed at month end"
Results:
1 '2025-01-10T00:00:00Z' 'An Electric Bicycle' 350 1 '2025-01-05T00:00:00Z' 'An Electric Bicycle' 405 1 '2025-01-01T00:00:00Z' 'An Electric Bicycle' 400
The syntax for the bitemporal database to accomplish the same thing as a current database is much simpler and easier to work with for use cases subject to temporal considerations.
The Intellyx Take
Traditional relational databases are designed and engineered to capture the current state of the business. In the context of time, or temporality, they are called current databases.
For many applications, current databases are fine – applications in which all you really care about is the latest answer. For example, you may not care which other rooms in the hotel a guest may have occupied in the past when you are trying to find out which room they are currently in.
However, for applications that do care about time, current databases require you to manually insert time values that track when a data item was added, when it became valid, and when it became invalid (i.e. when it was updated to another value). This is very error prone and labor intensive. Many organizations that have such temporal use cases tend to deprioritize this work when they see how costly it is.
It’s very common, for example, for identity validation programs to check your prior addresses or prior vehicle ownership records to confirm that you are who you say you are. Regulated industries often are required to provide historical values for private data, for example, and financial services companies typically assess the value of traded assets such as stocks, bonds, and derivatives over time.
A bitemporal database, i.e. one that naturally incorporates the element of time, eliminates such manual programming effort by automatically including date and time information when inserting and updating data.
And while bitemporalty has been added to a greater or lesser extent to some databases, a purpose built bitemporal database such as JUXT’s XTDB is going to be the easiest and best.
Copyright © Intellyx B.V. Intellyx is editorially responsible for this document. No AI bots were used to write this content. At the time of writing, Juxt is an Intellyx client.