Originally authored in Feb 2014; refreshed in Jun 2015. See change log at the end of this article for a list of updates
- a paper trail of all your changes to data in a table - useful when it comes to auditing, providing version history to users etc. This post covers how to achieve Time Travel-esque behaviour using an AWS PostgreSQL RDS instance via triggers.
Time Travel achieves version history by means of two columns (which could incidentally be called anything) that track when the row was valid from, and when it was valid to. By means of an example (you can ignore the details for now, we come to them later), let us first create a piece of fruit:
myitcv=# insert into fruits (name) values ('apple'); myitcv=# select * from fruits; id | name | valid_from | valid_to ----+-------+-------------------------------+---------- 1 | apple | 2015-02-25 19:27:18.011267+00 | infinity
Creating a new record inserts a new row into our table.
valid_from = now() corresponds to the creation time. Notice
valid_to = infinity - this tells us that the row is current.
If we then change the name of the piece of fruit (notice we use the restriction
valid_to = infinity to refer to the
myitcv=# update fruits set name = 'pear' where id = 1 and valid_to = 'infinity'; myitcv=# select * from fruits; id | name | valid_from | valid_to ----+-------+-------------------------------+------------------------------- 1 | apple | 2015-02-25 19:27:18.011267+00 | 2015-02-25 19:27:50.242802+00 1 | pear | 2015-02-25 19:27:50.242802+00 | infinity
The update has been translated into two changes:
valid_toon the now old version to
- Insert a new row to represent the new version, following the same logic for
valid_fromas for insert
And then finally delete the piece of fruit; we don’t need it any more:
myitcv=# delete from fruits where id = 1 and valid_to = 'infinity'; myitcv=# select * from fruits; id | name | valid_from | valid_to ----+-------+-------------------------------+------------------------------- 1 | apple | 2015-02-25 19:27:18.011267+00 | 2015-02-25 19:27:50.242802+00 1 | pear | 2015-02-25 19:27:50.242802+00 | 2015-02-25 19:28:21.066919+00
Notice the row itself is not deleted, rather the
valid_to date is simply updated from
infinity -> now().
This is in effect a soft delete.
With appropriate indexes and restrictions on key constraints, this can in effect give you a version history, or paper trail if you will, of changes to rows in your table. Nothing gets deleted; we append to the table any changes, including deletes. Good for auditing… Good for all manner of things.
How do I use Time Travel?
It depends on where you are using PostgreSQL.
Recent version of PostgreSQL under your control
Time Travel lives on in later versions of PostgreSQL via the spi contrib module. If you are in control of your PostgreSQL installation, you can simply install the extension:
myitcv=# create extension timetravel;
then follow an example given in the PostgreSQL source to get up and running and start travelling back through time immediately. Eat your heart out, Marty McFly.
At this point you could stop reading this article because you have a working time machine! What follows might however be
of interest: it is a trigger-based equivalent to the spi module that (importantly) brings the
timetravel behaviour to
PostgreSQL on AWS.
PostgreSQL on AWS
As the legacy
v6.3 documentation points out however,
timetravel-like behaviour can be achieved using triggers. The goal of the remainder of this post, as mentioned
earlier, is to try and achieve just that.
What follows assumes:
- The you are familiar with PostgreSQL
- That you have a PostgreSQL instance running on AWS (my tests were against a
- You can connect to that instance using the PostgreSQL CLI or some GUI tool (to execute arbitrary SQL)
Installing the main functions
To get up and running, you will need to execute the following SQL to create the two functions that simulate
timetravel. One will be called from a before trigger, the other from an after trigger (we show how to create the
The code itself is commented to give some motivation behind certain decisions. But the one point worth making regards deletes.
A delete could simply be achieved by intercepting a user request to delete and translating that to update the
= now() on the corresponding row. We would therefore silently ignore the request to delete something. However,
the effect of this is to return that 0 rows were affected by the delete. Client libraries using our table might well have problems
with this (ActiveRecrod with optimistic
locking does for example). Hence we have to go
to the effort of inserting a new row to represent the old version (much like we do for an update) and then allow the
original delete to continue.
The functions in action: fruit trigger
By themselves, the aforementioned functions are useless. So how do we use them? The following SQL gives just such an example:
DROP TABLE IF EXISTS fruits; CREATE TABLE fruits ( id SERIAL NOT NULL, name TEXT, valid_from TIMESTAMP WITH TIME ZONE NOT NULL, valid_to TIMESTAMP WITH TIME ZONE NOT NULL ); DROP TRIGGER IF EXISTS fruits_before ON fruits; CREATE TRIGGER fruits_before BEFORE INSERT OR UPDATE OR DELETE ON fruits FOR EACH ROW EXECUTE PROCEDURE process_timetravel_before(); DROP TRIGGER IF EXISTS fruits_after ON fruits; CREATE TRIGGER fruits_after AFTER UPDATE OR DELETE ON fruits FOR EACH ROW EXECUTE PROCEDURE process_timetravel_after();
Again, execute this SQL to create the tables and triggers, at which point you should be in a position to run the SQL found earlier in this post where we inserted, updated and deleted an apple (pear).
Feedback, corrections, suggestions etc. would be greatly appreciated. Indeed, please let me know if something should be added to this list:
- The main functions assumes that the table columns that provide
valid_toare called just that.
timetravelprovided support for providing alternative columns (e.g.
ending); our version is limited in that respect
- This whole process only safely works within a transaction; our version does not check that it is called within the context of a transaction
In writing this trigger-based alternative to
timetravel, I’ve relied heavily on:
- PostgreSQL’s trigger documentation
- PostgreSQL’s explicit locking documentation
- Documentation on PL/pgSQL - SQL Procedural Language
- Various posts by Pavel Stěhule including this one
- An important note about executing dynamic commands
Thanks to the many contributors to PostgreSQL and its community for what is a great DB.
Update 2015-03-22: reverted use of
now() at time zone 'utc' because it’s superfluous. Added license details</br>
Update 2015-02-25: updated to use
now() at time zone 'utc' and
TIMESTAMP WITH TIME ZONE for safe default