Showing posts for tag "dgqf"

DGQF and DQL as I Understand Them

Jul 26, 2018, 12:27 PM

Tags: domino dgqf

At CollabSphere this year, the big information coming from HCL was detail about the Domino General Query Facility (DGQF) and its associated language, Domino Query Language (DQL). They originally announced this a few weeks ago, but it was good to have had some time to let the dust settle and to see the specifics.

Because it was discussed alongside the domino-db Node.js package and because it's one of the first real new ways we'll interact with data in a Domino DB in a while, it's a bit difficult to identify just what it is and what it is not. Here's how I understood it:

What DGQF Is

DGQF is, at least conceptually, a "meta" layer on top of the existing NIF indexing facility. It doesn't provide a core change to the actual storage of documents, but instead treats existing view indexes as (roughly) analagous to both SQL table indexes and SQL views. It trawls through the design elements of a database to analyze their selection formulae and columns to use applicable ones as implicit indexes and also to allow access to arbtirary collections within queries.

Implicit Indexes

Other than the design collection and the "optimize document table" option in a DB, an NSF doesn't really have much in the way of indexing note contents by default. So, if you have a query asking for all documents where FirstName is Bob, a program has no choice but to look through every document for that key/value match. If, however, you create a view that has a column showing the FirstName field, you now have a much-faster index you can use. It's this sort of view that the DGQF picks up on implicitly, using them to accelerate queries: views showing all documents with either a default sort or "click to sort" column showing explicitly a field (and not a formula).

Access to Arbitrary Collection Data

For those qualifying views plus others, you can reference a view by name or alias to compare to a column value by programmatic name (often either the field name for simple columns or something like $4 by default for formulas).

"In" clauses

Additionally, you can use view (and folder, I think) names to refine queries for documents that are in one or more of these collections, equivalent to an "in" subquery or view reference in SQL

What DQL Is

In short, DQL is the human-readable query language used to access DGQF. It's reasonably SQL-like (though it is not SQL) and tends to look like FirstName='Bob' and in all ('Managers', 'Active Users'). This is the language you will use, and so "DGQF" and "DQL" will generally refer to the same thing in practice.

In practice, this is implemented as a new method on the Database class in each high-level language supported by Domino, plus a Node-styled variant in domino-db.

What DGQF and DQL Are Not

Since DGQF sits on top of NIF (and probably the FT index eventually), it's not a core change to data storage. Eventually, the same abilities and limits of Domino remain as they are with respect to this.

Additionally, DQL is, I believe, a query language only: it does not provide a mechanism for creating, modifying, or deleting existing documents. Instead, it is essentially a super-powered and much-smarter version of…): you can use it to find documents and the processing of them is up to your program.

That last point was a bit muddied by its pairing with the domino-db Node.js package: the Node.js package provides bulk operations that are paired with DQL queries, but that is a function of that library specifically, not DQL or DGQF.

Why It's Cool

Though it's not a reworking of the core NSF, what DGQF does do is abstract away a lot of the manual looping and lookups that we've always had to do, and it allows the system to optimize and do things more efficiently than when written out procedurally. So, while there's theoretically nothing that DGQF does that we couldn't do before, it allows us to do those things with far, far less code and with automatic optimization.

This brings Domino something that SQL servers have enjoyed for a long time. With a SQL statement, you can analyze the trouble spots of a slow-running query and add indexes to improve the speed, with the tooling helping to explain what's going on. DGQF+DQL brings this along for the ride: when you execute a DQL query, you have the option to dump out this "explain" output to see what specifically the facility did, which views it used, and how long each step took. So, if you have a long-running query, you can look to see if you can add an "index" view to automatically speed it up without having to change your code. And, since the language is an abstraction over the task of querying and not the sort of "burned in" process of a normal getNextDocument loop, it can be optimized and short-circuited by the underlying system without the developer having to know the decades of built-up knowledge of how to efficiently search a DB.

All in all, this is a very welcome addition to the server, and it certainly should improve a lot of common tasks.