Validating Couchbase Referential Integrity with N1QL

TL;DR: Join the bucket to itself with an outer left join on the foreign (reference) ID, where the right side of the join is missing.

Most NoSQL databases do not have built-in support for referential integrity and Couchbase Server is no exception. The best you can do is check which documents have broken references so you can fix them manually or programmatically. In Couchbase it's possible to write a MapReduce view that will find all the documents that have a broken reference to another document, but the view definition is pretty complex and coupled to a specific document field. If you want to check the integrity of another field, write another view.

With the advent of N1QL, we now have a much better way to test referential integrity with a relatively straightforward query. You still can't enforce integrity, of course, but at least you can easily check if it's broken and do something to fix it retroactively.

Let's assume we have two document types: <source_type> and another type that the source document references. We'll also assume that the referenced document ID is stored in the ref_id field, formatted as "<prefix><id><suffix>" - you can remove the prefix/suffix if it's not relevant, of course. The query is as follows:

SELECT META(b1).id AS source_doc, 
       b1.ref_id AS missing_target_doc
FROM <bucket> b1 
LEFT OUTER JOIN <bucket> b2 
ON KEYS '<prefix>' || b1.ref_id || '<suffix>'
WHERE b1.type = '<source_type>' AND b2 IS MISSING;

As you can see above, we're joining <bucket> to itself with an outer left join, where the left side (b1) is the source document itself and the right side (b2) is the document it references. We're interested in any results from the left side of the join that don't have a value on the right side (MISSING) - those are the documents that have a broken reference. In this case, as the names imply, source_doc is the ID of the source document, and missing_target_doc is the ID of the missing target document.

Note 1: The || operator denotes string concatenation in N1QL.

Note 2: This works in N1QL DP4, I didn't test it with DP3 because I'm lazy and you really should switch to DP4 anyway. :)

David Ostrovsky

I'm a level 38 Computer Geek, with experience in everything from databases to mobile apps. I'm also a Senior Solutions Architect at Couchbase, where I often keep clients happy by coding hard.

comments powered by Disqus