Assertions for data integrity on a less-than-ideal data model
25-30 years ago I worked a lot with the Danish developed ERP system called Concorde XAL. It ran on top of a database (choice of several, mostly I worked with installations running on an Oracle database.) If you're curious, take a look at the history behind it - it is actually the predecessor of Microsoft Dynamics 365 Enterprise ERP.
XAL attempted to do data integrity checks application-side rather than in the database - the datamodel didn't always lend itself to foreign key constraints (plus it made it easier for them to port to different underlying databases.)
I'm no longer working with XAL, but now that Oracle has implemented Assertions, I've been thinking about how assertions might have helped with implementing data integrity in the database for a datamodel like XAL.
As all ERP software, XAL has tables for debitors, creditors, stock and a lot more. Here's a screenshot of how it looked in the Windows version (usually I worked with the character-based version running in some Unix using a VT220 80x25 terminal window.)
I've created a small demo with some simplified tables in a datamodel similar to how it was in XAL.
Oracle AI Database 26ai Enterprise Edition Release 23.26.1.2.0 - Production Version 23.26.1.2.0
Before starting, I had granted my schema permission to create assertions:
grant create assertion to kibeha;
Then I took a small subset of the data from my Practical Oracle SQL book and roughly simulated that these were in an XAL datamodel:
-- "Dictionary" metadata
create table dict (
FileId integer
, ColId integer
, Name varchar2(30) not null
, constraint dict_pk primary key (FileId, ColId)
);
-- Debitors
create table DebTab (
DebNo varchar2(10) not null unique
, DebName varchar2(30) not null
, RecId integer primary key
);
-- Creditors
create table CreTab (
CreNo varchar2(10) not null unique
, CreName varchar2(30) not null
, RecId integer primary key
);
-- Stock (products/items)
create table StoTab (
StoNo varchar2(10) not null unique
, StoName varchar2(30) not null
, RecId integer primary key
);
-- Text notes on any table
create table Notes (
RefFileId integer not null
, RefRecId integer not null
, LineNo integer not null
, Text varchar2(255) not null
, RecId integer primary key
, constraint notes_uq unique (RefFileId, RefRecId, LineNo)
);
-- Data
insert into dict (FileId, ColId, Name) values
( 11, 0, 'DebTab')
, ( 11, 1, 'DebNo')
, ( 11, 2, 'DebName')
, ( 11, 255, 'RecId')
, ( 15, 0, 'CreTab')
, ( 15, 1, 'CreNo')
, ( 15, 2, 'CreName')
, ( 15, 255, 'RecId')
, ( 27, 0, 'StoTab')
, ( 27, 1, 'StoNo')
, ( 27, 2, 'StoName')
, ( 27, 255, 'RecId')
, (112, 0, 'Notes')
, (112, 1, 'RefFileId')
, (112, 2, 'RefRecId')
, (112, 3, 'LineNo')
, (112, 4, 'Text')
, (112, 255, 'RecId')
;
insert into DebTab (DebNo, DebName, RecId) values
('C-50042', 'The White Hart' , 14293)
, ('C-51069', 'Der Wichtelmann', 15238)
, ('C-51007', 'Boom Beer Bar' , 17236)
;
insert into CreTab (CreNo, CreName, RecId) values
('B-518', 'Balthazar Brauerei', 51647)
, ('B-523', 'Happy Hoppy Hippo' , 53637)
, ('B-536', 'Brewing Barbarian' , 56627)
;
insert into StoTab (StoNo, StoName, RecId) values
('P-5310', 'Monks and Nuns' , 881645)
, ('P-5430', 'Hercule Trippel' , 862644)
, ('P-6520', 'Der Helle Kumpel', 843643)
;
insert into Notes (RefFileId, RefRecId, LineNo, Text, RecId) values
(11, 14293, 1, 'Deliveries only directly to the publican' , 7853018)
, (11, 14293, 2, 'Ask at the counter for Samuel Barman' , 7563914)
, (15, 53637, 1, 'Can only deliver in Cornelius kegs' , 7453816)
, (27, 881645, 1, 'Available both in cans and bottles, both 330 ml', 7533712)
, (27, 862644, 1, 'Check pre-orders before ordering new stock' , 7153615)
, (27, 862644, 2, 'Brewery can brew small or large batch on demand', 7553511)
;
commit;
The dict table is a simplified version of the metadata that XAL uses. In XAL, all tables are created via the development environment built into the application, table and column names are stored in this dictionary, and XAL then makes CREATE TABLE statements behind the scenes.
FileId is a unique id for each table. ColId is the id of each column. The row with ColId=0 is not a column, it is the name of the table itself.
DebTab, CreTab and StoTab are tables for Debitors (customers), Creditors (vendors) and Stock (products). Note they all have a primary key column RecId that is a pseudo key.
Notes are textual notes that can be attached to any other table - RefFileId refers to the FileId of the parent table, while RefRecId then is the RecId of the parent row within that table.
Assertions case 1
Notes.RefFileId ought to have a foreign key to Dict.FileId. Let's try it:
alter table Notes add ( constraint Notes_RefFileId_Dict_FK foreign key (RefFileId) references dict (FileId) );
ORA-02270: no matching unique or primary key for this column-list
Sure it fails - FileId is not unique in Dict. Maybe we can fake a composite foreign key to match the primary key of Dict table:
alter table Notes add ( constraint Notes_RefFileId_Dict_FK foreign key (RefFileId, 0) references dict (FileId, ColId) );
ORA-03059: The expression 0 is invalid.
Nope, can't have a literal as one of the columns on which we create a foreign key.
So let's emulate the foreign key with an assertion instead, where we can filter that we're only making the reference to those rows in Dict that have ColId=0:
create assertion Notes_RefFileId_Dict_FK check (
all (
select n.RefFileId
from Notes n
) nref
satisfy (
exists (
select null
from dict d
where d.FileId = nref.RefFileId
and d.ColId = 0
)
)
);
The assertion creates successfully and we can test it.
- Attempt to insert with non-existing RefFileId:
insert into Notes (RefFileId, RefRecId, LineNo, Text, RecId) values (42, 66642, 1, 'The answer to everything is forty-two' , 9879878) ;
ORA-08601: SQL assertion (KIBEHA.Notes_RefFileId_Dict_FK) violated.
- Attempt an update to non-existing RefFileId:
update Notes set RefFileId = 42 where RecId = 7853018;
ORA-08601: SQL assertion (KIBEHA.Notes_RefFileId_Dict_FK) violated.
- Attempt an update of parent FileId:
update dict set FileId = 42 where FileId = 11 and ColId = 0;
ORA-08601: SQL assertion (KIBEHA.Notes_RefFileId_Dict_FK) violated.
- Attempt removing parent FileId:
delete dict where FileId = 11 and ColId = 0;
ORA-08601: SQL assertion (KIBEHA.Notes_RefFileId_Dict_FK) violated.
- But no problem if removing different ColId than zero:
delete dict where FileId = 11 and ColId = 1;
1 row deleted.
rollback;
Actually this case might not be the best to demonstrate assertions, as we actually do have an alternative that allows us to use a regular foreign key:
We can create a unique constraint on a virtual column:
alter table dict add ( FileIdColZero invisible generated always as (case ColId when 0 then FileId end) virtual , constraint FileIdColZero_uq unique (FileIdColZero) );
We can then make a foreign key refer to the virtual column:
alter table Notes add ( constraint Notes_RefFileId_Dict_FK_RealFK foreign key (RefFileId) references dict (FileIdColZero) );
That'll work too, and in general you should prefer regular constraints over assertions if possible.
However, you might wish to make an index on this virtual column for performance, and it'll just be a bit of a hazzle to maintain an invisible column with index just to get a regular constraint instead of an assertion. You should weigh the pros and cons and decide for yourself for your own usecase.
Assertions case 2
For the column Notes.RefFileId we saw above that we wanted a foreign key to only some rows of the parent table.
For the column Notes.RefRecId it's the other way - we want a foreign key to a parent table for only some rows of the child table, then a foreign key to another parent table for some other rows of the child table, and so on.
Let's see what we can think of to make this assertion.
Attempt 1
Let's see if we can UNION ALL lookups in the different parent tables:
create assertion Notes_RefRecId_MultiFK check (
all (
select n.RefFileId, n.RefRecId
from Notes n
) nref
satisfy (
exists (
select null
from DebTab d
where nref.RefFileId = 11
and d.RecId = nref.RefRecId
UNION ALL
select null
from CreTab c
where nref.RefFileId = 15
and c.RecId = nref.RefRecId
UNION ALL
select null
from StoTab s
where nref.RefFileId = 27
and s.RecId = nref.RefRecId
)
)
);
ORA-08689: CREATE ASSERTION failed ORA-08663: Set operators are not supported.
Nope - that's not supported in assertions.
Attempt 2
Then we can try OR'ing multiple exists subqueries:
create assertion Notes_RefRecId_MultiFK check (
all (
select n.RefFileId, n.RefRecId
from Notes n
) nref
satisfy (
exists (
select null
where exists (
select null
from DebTab d
where nref.RefFileId = 11
and d.RecId = nref.RefRecId
) OR exists (
select null
from CreTab c
where nref.RefFileId = 15
and c.RecId = nref.RefRecId
) OR exists (
select null
from StoTab s
where nref.RefFileId = 27
and s.RecId = nref.RefRecId
)
)
)
);
ORA-08689: CREATE ASSERTION failed ORA-08697: SYS owned tables are not supported.
Aha, we cannot use DUAL as the base table of an assertion.
(Yes, I know, there's no explicit DUAL in the statement - but even though I'm using the new way of just doing a select without a from clause, behind the scenes it's still doing a FROM DUAL...)
We can try instead of DUAL to select from dict (which we know will exist due to the assertion or foreign key we created above):
Well, we solved the DUAL issue, but again we hit something that's not supported.
If we can't do "exists" with OR, we can turn the logic around and instead of each row in Notes having to satisfy one of a set of exists queries with OR, we can check that there must not exist rows in Notes that do not exist in any of a set of subqueries with AND:
Success! That assertion can be created. So let's test.
- Attempt inserting a non-existing RefRecId:
- Attempt inserting a RefRecId that exists in DebTab but use RefFileId for CreTab:
- Attempt removing a parent row:
So this works, but there is an alternative to the convoluted not exists / not exists construction.
Recall that in the first assertion above (the FileId "foreign key"), we used a predicate in the satisfy subquery to filter the rows of the parent table.
We can create assertions for each of our parent tables by putting a predicate in the all subquery to filter the rows of the child table for which the assertion should work:
Now we have 3 individual assertions, each asserting a "foreign key" relation between a subset of the Notes child table and a given parent table.
Let's test again.
- Attempt inserting a non-existing RefRecId:
- Attempt inserting a RefRecId that exists in DebTab but use RefFileId for CreTab:
- Attempt removing a parent row:
All of these work, just like the Notes_RefRecId_MultiFK did, but we get an advantage that the error message tells us in which parent table we're missing a parent RecId that our child RefRecId is referring to.
I have seen presentations by Toon Koppelars on assertions, diving into details of how Oracle is using many clever tricks to avoid having to full scan tables all the time. Somehow I think (but it is entirely guesswork for now) that the solution with individual assertions for each parent table ought to be the best suited for allowing Oracle to use these tricks to evaluate the assertions as few times as possible and on as few rows as possible. I think (guess) that the combined assertion with AND'ed not exist subqueries maybe can't be implemented as efficiently.
But so far that's pure guesswork - maybe I'll get back to this and dive deeper into performance of the two methods.
For now, I'll just do some cleanup:
And you can get the script if you want to try this out yourself.
Attempt 3
We can try instead of DUAL to select from dict (which we know will exist due to the assertion or foreign key we created above):
create assertion Notes_RefRecId_MultiFK check (
all (
select n.RefFileId, n.RefRecId
from Notes n
) nref
satisfy (
exists (
select null
from dict d
where d.FileId = nref.RefFileId
and d.ColId = 0
and (
(
nref.RefFileId = 11
and exists (
select null
from DebTab d
where d.RecId = nref.RefRecId
)
) OR (
nref.RefFileId = 15
and exists (
select null
from CreTab c
where c.RecId = nref.RefRecId
)
) OR (
nref.RefFileId = 27
and exists (
select null
from StoTab s
where s.RecId = nref.RefRecId
)
)
)
)
)
);
ORA-08689: CREATE ASSERTION failed ORA-08695: A subquery within an OR condition is not supported.
Well, we solved the DUAL issue, but again we hit something that's not supported.
Attempt 4
If we can't do "exists" with OR, we can turn the logic around and instead of each row in Notes having to satisfy one of a set of exists queries with OR, we can check that there must not exist rows in Notes that do not exist in any of a set of subqueries with AND:
create assertion Notes_RefRecId_MultiFK check (
not exists (
select null
from Notes n
where not exists (
select null
from DebTab d
where n.RefFileId = 11
and d.RecId = n.RefRecId
) AND not exists (
select null
from CreTab c
where n.RefFileId = 15
and c.RecId = n.RefRecId
) AND not exists (
select null
from StoTab s
where n.RefFileId = 27
and s.RecId = n.RefRecId
)
)
);
Success! That assertion can be created. So let's test.
- Attempt inserting a non-existing RefRecId:
insert into Notes (RefFileId, RefRecId, LineNo, Text, RecId) values (11, 88888, 1, 'This debitor does not exist in the table' , 8888881) ;
ORA-08601: SQL assertion (KIBEHA.NOTES_REFRECID_MULTIFK) violated.
- Attempt inserting a RefRecId that exists in DebTab but use RefFileId for CreTab:
insert into Notes (RefFileId, RefRecId, LineNo, Text, RecId) values (15, 14293, 1, 'Deliveries only directly to the publican' , 8888882) ;
ORA-08601: SQL assertion (KIBEHA.NOTES_REFRECID_MULTIFK) violated.
- Attempt removing a parent row:
delete StoTab where RecId = 881645;
ORA-08601: SQL assertion (KIBEHA.NOTES_REFRECID_MULTIFK) violated.
So this works, but there is an alternative to the convoluted not exists / not exists construction.
Attempt 5
Recall that in the first assertion above (the FileId "foreign key"), we used a predicate in the satisfy subquery to filter the rows of the parent table.
We can create assertions for each of our parent tables by putting a predicate in the all subquery to filter the rows of the child table for which the assertion should work:
drop assertion Notes_RefRecId_MultiFK;
create assertion Notes_RefRecId_Deb_FK check (
all (
select n.RefFileId, n.RefRecId
from Notes n
where n.RefFileId = 11
) nref
satisfy (
exists (
select null
from DebTab d
where d.RecId = nref.RefRecId
)
)
);
create assertion Notes_RefRecId_Cre_FK check (
all (
select n.RefFileId, n.RefRecId
from Notes n
where n.RefFileId = 15
) nref
satisfy (
exists (
select null
from CreTab c
where c.RecId = nref.RefRecId
)
)
);
create assertion Notes_RefRecId_Sto_FK check (
all (
select n.RefFileId, n.RefRecId
from Notes n
where n.RefFileId = 27
) nref
satisfy (
exists (
select null
from StoTab s
where s.RecId = nref.RefRecId
)
)
);
Now we have 3 individual assertions, each asserting a "foreign key" relation between a subset of the Notes child table and a given parent table.
Let's test again.
- Attempt inserting a non-existing RefRecId:
insert into Notes (RefFileId, RefRecId, LineNo, Text, RecId) values (11, 88888, 1, 'This debitor does not exist in the table' , 8888881) ;
ORA-08601: SQL assertion (KIBEHA.NOTES_REFRECID_DEB_FK) violated.
- Attempt inserting a RefRecId that exists in DebTab but use RefFileId for CreTab:
insert into Notes (RefFileId, RefRecId, LineNo, Text, RecId) values (15, 14293, 1, 'Deliveries only directly to the publican' , 8888882) ;
ORA-08601: SQL assertion (KIBEHA.NOTES_REFRECID_CRE_FK) violated.
- Attempt removing a parent row:
delete StoTab where RecId = 881645;
ORA-08601: SQL assertion (KIBEHA.NOTES_REFRECID_STO_FK) violated.
All of these work, just like the Notes_RefRecId_MultiFK did, but we get an advantage that the error message tells us in which parent table we're missing a parent RecId that our child RefRecId is referring to.
I have seen presentations by Toon Koppelars on assertions, diving into details of how Oracle is using many clever tricks to avoid having to full scan tables all the time. Somehow I think (but it is entirely guesswork for now) that the solution with individual assertions for each parent table ought to be the best suited for allowing Oracle to use these tricks to evaluate the assertions as few times as possible and on as few rows as possible. I think (guess) that the combined assertion with AND'ed not exist subqueries maybe can't be implemented as efficiently.
But so far that's pure guesswork - maybe I'll get back to this and dive deeper into performance of the two methods.
For now, I'll just do some cleanup:
drop assertion Notes_RefFileId_Dict_FK; drop assertion Notes_RefRecId_Deb_FK; drop assertion Notes_RefRecId_Cre_FK; drop assertion Notes_RefRecId_Sto_FK; drop table Notes purge; drop table StoTab purge; drop table CreTab purge; drop table DebTab purge; drop table dict purge;
And you can get the script if you want to try this out yourself.
Now you know ways of doing "foreign keys" that work for subsets of rows either in the parent or the child. Maybe that'll be useful for you one day.


Comments
Post a Comment