Sunday, March 3, 2013

Data integrity constraint classification

Before we start investigating the complexities involved in implementing data integrity constraints using database triggers, we will first introduce a classification schema for data integrity constraints. Agreeing upon a classification scheme for constraints, helps us a lot in implementing them: for we can then approach the problem area on a class-by-class basis.

A classification scheme needs to have a few properties:
  1. The constraint classes should be mutually exclusive: any given constraint should clearly fall into one, and only one, of the classes;
  2. The scheme should cover all types of constraints: i.e. it cannot be that we can think of a constraint that doesn't fall into one of the defined classes;
  3. It should be practical: and by this we mean it should help us when implementing constraints. Or, put in another way, the issues we have when implementing constraints of the same class should be similar for these constraints, and the issues we have for implementing constraints of a different class, should be different.
A commonly used classification scheme for constraints, that has all above properties, is one that takes the increasing scope that a constraint can have in a database design, as the driver for defining classes.

Class 1: Attribute constraints.
These are constraints that can be specified (and thus checked) by referring to (or inspecting) only one column value of a tuple (ie. row).

Class 2: Tuple constraints.

These are constraints that can be specified (and thus checked) by referring to (or inspecting), at least two column values of a tuple. This class can also be called 'multi attribute' constraints.

Class 3: Table constraints.
These are constraints that can be specified (and thus checked) by referring to (or inspecting), at least two tuples of a table. This class can also be called 'multi tuple' constraints.

Class 4: Database constraints.
These are constraints that can be specified (and thus checked) by referring to (or inspecting), at least two tables. This class can also be called 'multi table' constraints.

In case the above scheme is totally new for you, here are a few examples:

Attribute constraints
  • Salary must be a positive integer between 750 and 14000.
  • Job must be one of the following list: MANAGER, CLERK, SALESMAN, etc.
Tuple constraints
  • A manager cannot have a salary less than 5000.
  • An employee working in department 10, cannot be a salesman.
Table constraints
  • There cannot be two employees with the same employee number.
  • A manager must always be accompanied by a clerk in the same department (hmm, sounds familiar this one...)
Database constraints
  • Every employee must work in a known department.
  • No (non SF-based) department can have a higher salary budget then any San Francisco (SF) based department.
Did you spot the primary/unique key? And the foreign key?

Just verify for yourself that this classification scheme covers all constraints. Think of a few constraints in your database design, and see in what classes they fall. It could be that you find a constraint that's not covered yet. That would be because above four classes jointly only cover  static constraints. These are constraints that can be verified by looking at a database state (or snapshot) in rest. There is another class in itself for what are called dynamic constraints. Dynamic constraints, sometimes referred to as transaction or database-state change constraints, cannot be verified by looking at a database state at rest. That's because the specification of these constraints require the context of a transaction. Examples of dynamic constraints are:

  • Cannot change the salary of an employee without the same transaction also writing a log-record in [some table] that details the user who performed the change and the time when this was done;
  • We register new hires at most one month before they join the company. Or, put in another way: the hiredate of an employee cannot be more than one month in the future;
  • (a classic one) Salary can only increase, never decrease.

The four static constraint classes plus the dynamic constraint class are,

  • all mutually exclusive, and
  • together cover all possible constraints.

And as it so turns out this classification scheme is also a practical one. Similar issues arise when implementing constraints of the same class. We'll further investigate that in our next post, where we will start with implementation guidelines for constraints on a class-by-class basis.


1 comment:

  1. Clear, complete and very nicely explained. Good article!

    ReplyDelete