Triggers Considered Harmful, Considered Harmful: And what about table constraints?

Thursday, March 7, 2013

And what about table constraints?

In a previous post we've introduced a classification scheme for constraints:

attribute constraints
tuple constraints
table constraints
database constraints
dynamic constraints

And talked a bit about how we could implement the first two classes here. In today's post we will make a start talking about how we can implement table constraints using triggers. But before we do that we will offer some thoughts on how the ideal world with regards to this subject would look like.

Long ago, in a galaxy far away, an ANSI/ISO SQL standardization committee came up with the ground breaking concept of a SQL ASSERTION. SQL assertions would make our lives real easy when it comes down to implementing table constraints. The example constraint "we cannot have a manager without a clerk in the same department" could be implemented as:

create assertion managers_need_clerk as
check(not exists
        (select 'a department'
         from (select distinct deptno from emp) d
         where exists 
                 (select 'a manager in d'
                  from emp e
                  where e.deptno = d.deptno and e.job = 'MANAGER')
           and not exists
                 (select 'a clerk in d'
                  from emp e
                  where e.deptno = d.deptno and e.job = 'CLERK')
        )
     )
/

Presto. Done.
It would then be up to the DBMS to maintain this constraint. Of course we require the DBMS to do that in an efficient manner. This will all become clear during the course of the next couple of posts.

By the way you may think, why not just do this with a CHECK constraint?

alter table emp add constraint managers_need_clerk as
check(not exists
        (select 'a department'
         from (select distinct deptno from emp) d
         where exists 
                 (select 'a manager in d'
                  from emp e
                  where e.deptno = d.deptno and e.job = 'MANAGER')
           and not exists
                 (select 'a clerk in d'
                  from emp e
                  where e.deptno = d.deptno and e.job = 'CLERK')
        )
     )
/

Well that's because CHECK constraints do not allow sub-queries. They don't allow that for the same reason as why we still do not have support for assertions. Both require the DBMS vendor to produce some seriously complex piece of software that can accept an arbitrary complex boolean SQL-expression and compute from that the most efficient way to maintain that boolean expression inside all concurrently running transactions. The research and development effort for this still needs to be done.

So, we have to resort to some other means when it comes to implementing table constraints. Here's a list of possible implementation strategies.

So we've discussed the first one already: it's highly preferred, but unfortunately only a very partial solution. The only table constraints that we can deal with declaratively are:

Keys (be them primary or unique), and
Foreign keys, in case the FK refers back to another column in the same table (in which case the foreign key is a table constraint, and not a database constraint).

The trigger approach is what the rest of this blog will be all about. Every table constraint can be implemented using triggers. Contrary to popular belief this is doable. But it *is* rather complex. The fact that this is a full solution for the table constraint class, is a big pro though.

Then there is the API-approach. This is the approach where you encapsulate all DML statements inside stored procedures, and disallow any direct DML access to your tables. The only way to modify your tables is through the stored procedure API-layer. And inside this API-layer, you deal with constraint validation, just before or right after you issue the DML statements. To me this is a fundamentally flawed solution, since it will always lead to constraint enforcing code duplication. And since it is not a "once and for all" solution. Every time you maintain your application and need to introduce new transactions, you'll have to take care of constraint enforcement again. Many people also tend to completely disregard the complexities involved. To a certain extent, they are the exact same complexities as are involved in the trigger approach: you'll have to take care of serialization and efficiency (all explained in future posts) in this approach too. And finally, in practice it is very difficult to maintain the enforcement of only allowing access to you tables via the API-layer. At some point people will find ways to go around the API-layer, and thereby be able to introduce corrupt data.

And there is the function based index trick. This is where we can use unique, function-based, indexes to implement keys across a subset of rows in a table. This too is a very partial solution for implementing table constraints. I'm assuming, since you read this blog, you are familiar with this trick, otherwise let me know in a comment and I'll provide you with an example.

Another approach is to employ materialized views in an ingenious manner. Like triggers, this is in theory a full solution, but in practice only a very partial one. It all boils down to the same research and development effort mentioned above when we discussed assertions, not having been done yet. We will discuss the materialized view approach in our next post.

Stay tuned.

14 comments:

AnonymousMarch 7, 2013 at 6:44 AM
"... At some point people will find ways to go around the API-layer, and thereby be able to introduce corrupt data. ..."

don't know what you mean by that, can you give an example.
At some point people can always introduce corrupt data for example - to take it to the extremes - by editing binary data_files.

secondly, yes, please point me to / give an example of "the function based index trick"

Thanks !
ReplyDelete
Replies
Rob van WijkMarch 7, 2013 at 6:51 AM
Why do you think the API approach always leads to constraint enforcing code duplication?
ReplyDelete
Replies
John FlackMarch 8, 2013 at 10:16 AM
I've used the function-based index trick for some kinds of table constraints. See my blog post, No Duplicates within Details
ReplyDelete
Replies
AnonymousMarch 17, 2013 at 12:40 AM
Toon your killing me. Such an interesting approach to integrity constraints and you have yet to reach the table, database, and dynamic constraints. But seriously, I appreciate the time and effort you've taken to make these concepts straight forward. I was wondering if you've written any papers on this subject?
ReplyDelete
Replies
AnonymousNovember 25, 2018 at 8:19 AM
Nice blog..! I really loved reading through this article. Thanks for sharing such an amazing post with us and keep blogging...Well written article Thank You for Sharing with Us pmp training in chennai | pmp training class in chennai | pmp training near me | pmp training courses online | pmp training fee | project management training certification
ReplyDelete
Replies
Dharani MDecember 11, 2018 at 12:19 AM
Nice blog
thanks for sharing this blog
java training in Marathahalli

spring training in Marathahalli

java training institute in Marathahalli

spring and hibernate training in Marathahalli
ReplyDelete
Replies
mounikaDecember 11, 2018 at 2:57 AM
DOT NET training in btm

dot net training institute in btm

dot net course in btm

best dot net training institute in btm
ReplyDelete
Replies
QuickBooks Error Support NumberMarch 4, 2021 at 2:19 AM
Nice Blog !
Our team at QuickBooks Phone Number have been serving QuickBooks users for the past few years in light of the Corona Crisis.
ReplyDelete
Replies
QuickBooks Support Phone NumberMarch 7, 2021 at 1:23 AM
Your Blog is very nice, and it's very helping us this post is unique and interesting, thank you for sharing this awesome information.If you face any problem in QuickBooks, Contact:QuickBooks Customer Service Phone NumberFor Quick resolution.
ReplyDelete
Replies

Add comment