How the ODP Compiler Works, Part 7

  • Jul 8, 2019

In this probably-final entry in the series, I'd like to muse a bit about possible future improvements and additions to the compiler and the NSF ODP Tooling generally. For the most part, the big-ticket future additions seek to answer one question:

Could this be used to replace Designer?

The quick answer is "yes, it could", but that would take a lot of work. There are a couple things inherent in the task and specific to my implementation that both help and hinder this kind of thing.

Notes Runtime

The biggest stumbling block is the hard requirement on a Notes or Domino runtime initialized for the current process. Being able to use C API calls is required by both my code and some of the underlying XPages bits, and that means initializing the runtime. The good news here is that it doesn't require any specific Notes-based program - it can be run with either the libraries that come with Notes or Domino, and it doesn't require Designer at all. That loosens things up a bit, but still means that one of the supported platforms is obligatory at some step of the process.

Even on a supported platform, though, it's not just as easy as calling an init function - the process's environment needs to be set up specifically to know about the Notes program and data directories, and this varies platform-by-platform. This means that it wouldn't be straightforward to have, for example, an Eclipse plugin that initializes the process, since it would depend on initialized environment variables and loading paths implicitly referenced by lower layers, and over which the programmer doesn't have much control after the fact.

The good news here is that the tooling is already designed to support remote work for compilation and export, both truly remote and with the local Equinox runners. For a true IDE experience, the communication between the IDE and compiler would have to be more complex than the "tell the compiler what to do and hear messages back" simple mechanism it has now, but it'd still be a natural evolution.

OSGi Runtime

The requirements posed by compiling complicated XPages applications presents a similar dependency as above, but on an Equinox environment. Though it's possible to fake the basics of OSGi for known plugins, that wouldn't work for arbitrary third-party libraries.

For integration in Eclipse, this wouldn't necessarily mean any new work - Eclipse is already the premier Equinox product, and so it supports what XPages compilation needs innately. However, Eclipse wouldn't be the only target; any work done here should work with other IDEs like IntelliJ, but also continue to work IDE-free via a Maven or Gradle environment.

So this ends up being another strong argument for retaining the "separate process" model that already exists.

Incremental Compilation

Beyond retaining the runtime requirements, the big thing would be a switch to supporting incremental compilation. Currently, the compiler is designed to do everything in one pass: you point it at an ODP and it spits out a freshly-created NSF. This allows it to build up and tear down its environment cleanly, initializing the OSGi plugins for any XPages libraries at the start and doing similarly for any custom classpath jars to be included in the Java runtime.

What supporting incremental compilation would require to be at all speedy and efficient is having a persistent compilation environment. Instead of everything happening sequentially, the IDE would init the compiler process and then send it requests as files need compilation. This has implications for both local and server-based compilation.

Local complication would need to change less: mostly, it would require picking an IPC mechanism and having the launched Equinox process remain alive until it's no longer needed.

Server-based compilation would be similar in implementation, probably using something like HTTP long polling to be able to run in the Domino HTTP container. The trouble would be that a straightforward implementation of this would mean that the Domino server would pretty much have to be dedicated to a single IDE. There's already a potential conflict scenario with two developers doing compilation at the same time: since the XPages compiler needs to install and uninstall OSGi bundles, they could step on each others' toes if any of them overlap. Keeping the compiler environment resident on the server would mean it would have to be effectively locked out to one connection for long periods of time. Assuming HCL continues the Community Edition licensing model, this will be legal to do, but it's still cumbersome.

This could lead into something I've been mulling over: running a Domino compiler server in Docker. This would loosen a lot of the runtime requirements and mean that the encapsulated Domino server would be both dedicated to the purpose and consistent from the perspective of the compiler. Domino's setup requirements initially made it an awkward fit for Docker, but it looks like things have progressed along nicely.

This would all tie perfectly into the Language Server Protocol, which is an IDE-agnostic way to do basically this: have a little running process that knows about the nitty-gritty of the language, and then tell the IDE only what it needs to auto-complete and other features.

Live NSFs

Currently, the compiler starts with an ODP and emits a clean NSF with each build, and this is absolutely, 100% the correct way that it should work. However, Notes being what it is, it'd be expected that a Designer replacement would be able to work with a live NSF, so you could just crack one open, change a view, and be done. The second part of this process is in there, since the compiler uses the normal Notes APIs to store in an NSF as it is. It's the first part that would have to be new, allowing the tooling to selectively look into an NSF.

The exporter already does this, but, like the compiler, goes in one pass. What would potentially make sense would be to do essentially what Designer does: implement a VFS layer to represent an NSF in an equivalent way to the on-disk project. It's more easily said than done, but would be particularly straightforward for Eclipse, for the same reasons that it was straightforward for IBM to do it for Designer.

The secondary question here would be if it would be better to do continue to use DXL as the sole transport mechanism (so editing a view in a live NSF would export it to DXL and then re-import on save) or to instead try to represent things differently. Though DXL is less efficient, particularly for large notes, I think it'd make sense to stick with it - there would be tremendous work involved in trying to make it smarter, and that would be a breeding ground for bugs that just wouldn't exist with DXL.

IDE Features

Getting an NSF to compile dynamically is one thing, but the other part of this kind of project would be making the experience of working with design elements pleasant. In Designer, we have the benefit of having purpose-built editors for each design element type, but these aren't portable even if licensing allowed: the legacy ones all are just wrappers around C++-based "native" UIs, while the newer-era ones are based on Designer's bizarre internal RPC system.

I've done some work along these lines, initially to add autocomplete for custom controls and known core+ExtLib controls to .xsp files. Since that earlier work, I also added a contributor that tells Eclipse to use the DXL schema file for DXL files. While this doesn't give a proper GUI editor, it does provide enough information for Eclipse to pick up on the allowed elements and properties:

Eclipse DXL schema support

While I don't think it'd be worth trying to fully reproduce the various WYSIWYG editors Designer gives you (particularly the view editor, which is laughably bad for data-centric use), I think it'd be worth adding some editors along the lines of my old Forms 'n' Views project. Having some basic editors with a strong focus on the resulting data structure would be perfect for XPages support use and even mostly useful for legacy use.

Time

The core trouble with getting to all these goals, though, is time. For the main compilation and export work, I could justify spending a good amount of time because it eventually more than paid off in less time fighting with Designer to create consistent builds. For this other stuff, though, it's more dependent on whether my hatred for using Designer is enough to tilt the scales. Sometimes, it almost gets there, but I do also need to be able to pay my mortgage, so that puts a bit of a limit on things. It sure would be nice to leave Designer in the dust for good, though.

How the ODP Compiler Works, Part 6

  • Jul 7, 2019

In this post, I'd like to go over another main component of the NSF ODP Tooling project, the ODP Exporter. The exporter is significantly simpler than the compiler, but still had a surprising number of gotchas of its own.

My goal in writing the exporter was to replace the need to use Designer to create an on-disk project out of an NSF - in one of my projects, in addition to the primary NSF we use, there are also a dozen or so secondary NSFs inheriting from templates and being modified by people not using Git (I know.), and keeping them all in sync is a giant PITA. Previously, I had a dedicated VM just to open the DBs periodically to sync them, but even that took a long time and got error-prone when Designer would miss a change or generally trip over itself.

So I set out to create a compatible replacement, so that I could run a script and update the ODPs en masse.

The Basics

At its core, the exporter does what you might expect: it reads through each design note and sends them through a DXL exporter. For its work, it makes use of the aforementioned design collection and IBM's NAPI. I went with IBM's variant in this case for one class: com.ibm.designer.domino.napi.design.FileAccess. Though this class let me down when it came to importing, it has just enough encapsulated composite-data reader methods to save me a ton of work here, though I had to cheat to access one of them.

For each design note, it determines its type, which contains behavioral information for each type, including whether it should be included in the normal export process at all, where it's placed in the ODP, and the type of export treatment it should get. The main categories of exported note types line up with what the compiler had to know about, with some special knowledge of whether a note is one of many in a folder (e.g. an XPage) or one-per-database (like the icon note).

The Gotchas

Unsurprisingly, things aren't quite as simple as a basic loop. For elements that are just DXL files, it is that easy, but the ones that exist either as just file data (e.g. "plugin.xml") or file data plus metadata require special handling.

Skipped Items

The first thing to note with split data/metadata items isn't complicated, but bears mentioning: the metadata file is generated by exporting the design note as DXL but ignoring data items. Some of these are common among all types, but others (like indexed "$ClassData0", etc. fields) are best matched and excluded with a regex.

LotusScript, Again

LotusScript libraries threw me for an unexpected loop this time. Their storage format is actually not even composite data: the script itself is stored in plain non-summary text items, multiple ones with the same name. However, the NAPI's DXL exporter doesn't actually export the full script test properly, instead only outputting the content of the first item. Additionally, the legacy Java API shows the presence of multiple items with the same name, but also only gives you the value for the first.

So I ended up having to use the raw note format in memory, which DOES include all of the items, and then stitch the script content together onto the filesystem.

Other File Data

The other data types aren't too complicated, but need special cases for each composite data structure, which is where the FileAccess class comes in. Without the convenience methods there, I would have had to write CD iterators to read the data based on the appropriate structures - not terribly difficult, but it's all the better to have the work already done for me. Especially so since FileAccess pleasantly writes directly to a java.io.OutputStream, just like I'd want if I wrote it myself.

Special-Case Files

There are three final special cases that the exporter handles:

  1. The icon note is specially-exported not once, but twice. It's exported using an NAPI-specific special method to create the "database.properties" file, which includes the ACL and and formatted settings alongside the icon note, and then also exported specially after the main loop as "Resources/IconNote". I've always appreciated how much Lotus wildly overloaded the icon note.

    • There's also a distinct "$DBIcon" note that houses the 32-bit icon introduced in R8 (if I recall correctly), but that's just a normal old image resource with a special name and not related to the icon note.
  2. The "META-INF/MANIFEST.MF" file resource became important in 9.0.1 FP10, but Designer's handling of it is a little schizophrenic. FP10+ will usually fill it in with plugin information when it rebuilds an NSF, but nonetheless exports it as a zero-byte file. It's important for it to exist, so I create a blank file if it doesn't exist.

  3. The Eclipse ".project" file is also not present in older NSFs, but is critical for ODPs. If it wasn't exported, I create a generic stub version.

Swiper

When dealing with on-disk projects, Swiper is a mandatory tool, cleaning up the generated XML and (critically) removing extraneous items that change too frequently to be source-control-friendly.

The core of Swiper is an XSLT stylesheet to do the transformation, and I incorporated this wholesale, with a minor modification to retain the ACL that's stripped out by stock Swiper. I then created an OutputStream implementation that passes DXL output through Swiper if configured. As a small note, I think there was a specific reason why I have the Swiper path buffer the DXL into an in-memory ByteArrayOutputStream first instead of just wrapping the file output stream, but I don't remember what that was.

Final Steps

With this post, I think I've covered the big topics I set out to with the two main components of the Tooling. I plan on having at least one final post in the series to cover some potential future additions and enhancements, since I have a lot of ideas in mind for it. Unfortunately, a lot of the most-useful ideas would also be tremendous amounts of work, but the payoff may eventually be worth it.

How the ODP Compiler Works, Part 5

  • Jul 5, 2019

One of the things that came up frequently when writing both the compiler and exporter portions of the NSF ODP Tooling was rationalizing the multiple ways an NSF is viewed, and determining which aspects are reified in the design notes themselves and which are entirely runtime conjurations.

The Traditional View

To describe what I mean, I'll start with the "traditional" way that design notes work, which is also the mechanism the other views are built upon. The starting point there is the distinction between data and design notes, represented in the API as the note class. For our purposes, there's "data note" and "everything else". The design notes are kept track of internally by what can be considered a magic view, the design collection, which is used implicitly whenever something looks up a design element, and can be accessed automatically by API calls like NIFFindDesignNoteExt.

The design collection itself acts like a normal view, containing columns with pertinent design element information for fast lookups. Beyond the note class value, design notes are distinguished by character-based flags, which you can see in the "Fields" part of the property pane in Designer in the $Flags item. These will look something like "gC~4K" - this value comes from an XPage, and can be interpreted by taking each character and looking for it in "stdnames.h" from the C API:

  • g is DESIGN_FLAG_FILE, referring to a "file resource"-type design element (more on this later)
  • C apparently matches to DESIGN_FLAG_NO_COMPOSE, used to refer to forms that don't show up in the "Create" menu. I'm not sure why it's included here; it may have some second meaning
  • ~ maps to DESIGN_FLAG_HIDEFROMDESIGNLIST, presumably to keep XPages out of standard File Resource pickers
  • 4 maps to DESIGN_FLAG_HIDE_FROM_V4, which is reasonable advice, but the fact that this is 4 and not 7 makes me suspect there's a second meaning here too
  • K maps to DESIGN_FLAG_XSPPAGE, cheerily documented in the API as "an xpage, much like a file resource, but special!"

The importance of these flags and the reuse of some note classes (file resources in particular) bares a bit of the evolution of the platform. The older the note type is, the more likely it is to have a dedicated note class value. Forms, views, ACLs, and other primordial elements have eponymous classes, but, starting around the web era, new elements started piggybacking on existing classes. This has so far culminated with the XPages-era additions, where almost everything is considered a "file resource", which are themselves already specialized "forms". This mirrors the evolution of file data stored as Composite Data structures, where the first file types added in got their own dedicated structures, later types (like JavaScript libraries) were either crammed awkwardly into similar types or just plopped in as CDFILESEGMENTs (which Domino adorably refers to universally as "CSS").

With the heavy use of flags came something of a mini query language to distinguish collections of design notes. In the API, you can see these in the DFLAGPAT_ C constants. For example, DFLAGPAT_FORM maps to "-FQMUGXWy#i:|@0nK;g~%z^" - the - at the start means that this is a "none of these" matcher, so it's resolved by looking up all notes with NOTE_CLASS_FORM and then filtering out any of the ones with those flags. From our XPage example, you can see it's excluded thrice over, via g, K, and ~. There are other permutations in the language for "match all of", "match any of", and combinations of all three types, and the patterns allow you to select each type of design element you see in Notes and Designer, and a few more categories besides.

Designer's View

Designer really has two views of the NSF. The first view is essentially a codification of what's above, and has been how Notes and Designer have worked forever. When you go to the "Forms" list in Designer, it does a query in the design collection similar to the above form example, and each category of design elements has its equivalent query.

Its second view came along with the Eclipse transition, and it's what you see in Package Explorer. This version takes the core querying capabilities of the design collection and maps it on to an Eclipse File System plugin. From Eclipse-Designer's point of view, the NSF becomes a file-based project as if it was a set of folders and files on the filesystem, but is in reality composed of some dynamic lookups from the design collection paired with a truly local temporary directory for the Local XPages compilation scratch area.

This view of the design became the basis of on-disk project support, with the ODP mirroring what you see in the virtual Eclipse project.

It's also where we start to see a secondary hierarchy within the database design. Traditionally, design notes are largely "flat": while the UI and some APIs have special support for the use of \ within design element names, there's no concept of containers beyond the main categories. For XPages, though, they started to add items like $ClassIndexItem, which contain values like "WEB-INF/classes/frostillicus/controller/ControllingViewHandler.class". These files show up within the "WebContent" folder in the virtual project - "WEB-INF/classes" is by default hidden in Package Explorer, but you can see it in the Navigator view. The use of "WebContent" as a folder for this is itself a holdover from Eclipse-based web app development.

Domino's View

Like Designer, Domino has two views of an NSF and the first is a pretty direct use of the design collection. It has simpler needs, usually just looking up elements by type + flags + name, using reverse view order for web elements.

The second view can be thought of as a stripped-down version of Designer's VFS, but it isn't implemented in the same way and doesn't include all the top-level folders you see in Designer. Instead, Domino uses the aforementioned Java class index items and some other existing values like file-resource names to compose something that resembles a WAR file - you can see this reflected in its use of "WEB-INF/classes". It's this view of the NSF that the XPages runtime container and its many abstraction classes use, allowing them to treat it as an app container in the same way as a normal JEE web app, as if it was just another WAR file. It's not treated fully the same as a WAR file - you can't plop a web.xml file in there and use some other web toolkit - but that's the concept that the XSP stack is going for in classes like com.ibm.domino.xsp.module.nsf.NSFComponentModule.

The On-Disk Project Version

As I mentioned, the ODP is based closely on the Designer view, which in turn is partially based on the "web app" view used for XPages. For the compiler, it's not too big of a deal - it just needs to gather files and import them based on their existing DXL for the most part - but the exporter has to do some fiddly work to shuttle notes to their right spots. By total coincidence, that will be a nice lead-in to my next post, which I expect to cover some details of the ODP exporter portion of the NSF ODP Tooling.

How the ODP Compiler Works, Part 4

  • Jul 3, 2019

In today's post, I'd like to go over a bit of how the NSF ODP Tooling project is organized, and specifically how I structured it to support both server-based and local compilation.

Setting aside the feature, update site, and distribution modules, the tooling consists of seventeen code-bearing components:

For our purposes today, we care about the first six in the "plugins" directory and then the "nsfodp-maven-plugin" at the bottom - the rest have to do with the different capabilities of the suite.

Commons

The three "commons" plugins contain a set of utilities and data-description classes, and they're broken up into those three modules due to differing dependencies. The core "commons" plugin relies only on org.eclipse.core.runtime, while the "dxl" plugin adds an IBM Commons dependency, and finally the "odp" plugin relies outright on a Notes/Domino runtime. By keeping these things distinct, it lets me keep track of which things are safe to include in the Eclipse UI plugins or the Maven plugin, where I can't count on the present of a Notes runtime.

"Servlet" and "Equinox"

The compiler, like the other "action" components of the Tooling, is split up into the core "compiler" plugin that does the actual heavy lifting, and then two "interface" plugins for running the code from different directions.

The "servlet" plugin came first and is the mechanism by which a local Maven-run plugin communicates with a remote Domino server with the Tooling installed. It contains a primary entrypoint servlet that accepts a packaged zip file from the client containing the ODP and any extra update sites to use while building, as well as a set of HTTP headers describing the various parameters that can be set for compilation. Strictly speaking, this plugin doesn't depend on Domino as such, but rather on having a servlet container and a Notes or Domino runtime - it could hypothetically run in e.g. Tomcat with the right dependencies, but in effect it's the "Domino side" of it.

The "equinox" plugin supports local compilation and it's a bit of an interesting beast. Since the compiler is intended to work with any given NSF and XPages application, it has a hard requirement on the presence of an Equinox ("Eclipse-style") OSGi runtime. A local Maven build doesn't use Equinox, so I wrote this plugin to provide what Equinox refers to as an "application" - essentially a named executable class that can be run once you initialize an Equinox runtime. Eclipse itself uses this mechanism for running, and you can see these in action in the "Eclipse Application" run configuration type in Eclipse-the-IDE:

The code itself behaves similarly to the servlet, but can skip the "zip container" step of the process, instead referencing the local files based on system properties set by the Maven bootstrapper.

Having these two entrypoints lets me keep the actual business of the compiler independent. Depending on need, I could add any number of other entrypoints without having to modify the core code at all.

Maven-side

The actual action of the process is kicked off by a Maven plugin, which consists of what Maven calls a "mojo". It's effectively the same idea as the Equinox application: a specially-tailored executable class. In this case, it gains the ability to specify parameters that are passed in by the pom.xml configuration or via the command line, which are then available by the time the Maven runtime calls the execute() method.

When run, the Maven mojo branches based on what type of compilation it can do. The servlet-based compilation branch is a little wordy in the class, but conceptually simpler than the local compilation. The mojo creates a temporary zip file, pours the ODP into it, and then adds in any update sites to include. Then, it creates an HTTP connection to the remote server, adds headers to configure the compiler, sets the POST body to the zip file, and then lets the server do its thing.

The Equinox runner, though... that's something that took a surprising amount of fiddly magic to get working. On a conceptual level, running the compiler in a local Equinox container is essentially the same thing as the Equinox container launched by the Domino HTTP process - same Eclipse runtime, same infrastructure, and so forth. However, the trouble came in both in some of the fiddly ways that the Domino OSGi runtime is configured and in the assumptions it makes about the active JVM, and the battle resulted in a complicated bootstrapping process. The Notes JVM comes packaged with a handful of critical jar files, not the least of which being "Notes.jar", and those need to be added to the active classpath, which in turn needs a specialized provider plugin to get Equinox to see them. There's also whatever the heck "JEmpower" is, which has its own special needs to be wrapped up into a "shim" OSGi plugin because of the way other plugins depend on it. The runner is also riven with special behavior for running on recent macOS Notes builds, which switched to an embedded non-J9 JVM (I wouldn't be surprised if this changes subtly again in the future). This is all in service of creating a compatible Equinox configuration so that, finally, the compiler can be run in a child process. It's not pretty, but it works.

Progress Messages

Since the process can take a while, I created an extremely-bare-bones messaging system, where the server sends out a series of JSON objects delimited by newlines and the client watches for these and emits human-friendly messages. The compiler process itself just uses an Eclipse-style IProgressMonitor - the servlet uses an implementation called LineDelimitedJsonProgressMonitor, while the local Equinox runner uses one that just prints to the console directly. This is another area where things are kept generic enough at the internal level so that a different mechanism entirely could be hooked in - a GUI progress monitor, for example.

Overall Structure

I'm pretty pleased with how the structure of the Tooling has taken shape. Being able to separate out the entrypoints like this definitely made it much easier to have the local compilation, and breaking it all out into multiple modules kept me from baking in any incorrect assumptions about the runtime environment in the core code. I've been toying with ideas for how to get this stuff to run in an Eclipse/IntelliJ/etc. environment, and I think the Maven Equinox runner will provide a pretty good template for that. There'd be a lot of work to make that good, but it'd definitely be possible.

How the ODP Compiler Works, Part 3

  • Jul 2, 2019

In the first two posts in this series, I focused on the XPages compilation and runtime environment, independent of anything to do with an NSF specifically. I'll return to the world of OSGi and servlets in later entries, but I'd like to take a bit of time to talk about some specifics of grafting the compiled XPage results and the rest of the on-disk project's contents into an actual NSF.

The Basics

The primary tool that makes an on-disk project work is DXL, the XML representation of a note. DXL defines representations for several kinds of Notes elements, but the three main kinds that you run into with an on-disk project are:

  • Database metadata, found in the annotingly-suffixed AppProperties/database.properties file. This contains information from a couple places, in particular the ACL and icon notes

  • "Raw" representations of design notes. These show up quite a bit if you select "Use binary DXL" in Designer's preferences, and they show up in a couple parts regardless of that selection. These are distinguished by their use of <note/> as the root element, and contain close to raw data from the NSF. Strings, numbers, and dates are represented in human-readable form, but things like composite data/rich text are stored as Base64-encoded byte arrays matching their in-memory C structures. These "blobs" are opaque to work with but are the safest to round-trip.

    • A subtype of this is the "*.metadata" files, which I'll cover shortly.
  • "Encapsulated" design notes, with root elements like <form/> and <view/>. These are friendly to look at and work with programmatically, but the forms in particular run the risk of some edge-case compatibility issues.

The Process

The ODP Compiler uses DXL for almost all of its NSF manipulation, and imports the ODP in a couple of passes based on the different needs of different design elements.

"Direct DXL" Elements

The easiest elements are the ones that are just single DXL files in the ODP and can be imported directly. The compiler iterates over these files as determined by the OnDiskProject class and just passes them in to the DXL importer. Easy peasy.

"Split" Elements

The second main type are resource files that are stored in the ODP as their "normal" file data and paired with a ".metadata" file. The prime example of this are file resources: if you have a file named "foo.txt" stored as a file resource in your NSF, it will exist in the NSF as a normal text file named "foo.txt" and next to it will be a trimmed-down DXL file named "foo.txt.metadata". These metadata files are an export of the "raw" format of the DXL, but then the actual file data items are removed, leaving them contain just the additional items that go along with that (flags, in-NSF file name, etc.).

The conceptual task here is straightforward: encode the file data back into the appropriate composite-data format as Base64 inside the DXL, and then import that. The actual task of doing that, though, gets pretty arcane. There were two ways I could go about it: import the metadata only and then use the C API (via one route or another) to create the structures in memory and append them to the note, or create a C-struct-compatible representation in-memory in Java and add it to the DXL to import. I originally planned on doing the former, as the com.ibm.designer.domino.napi.design.FileAccess class in IBM's NAPI has promisingly-named classes to do this, but I ran into some trouble with some file types that it doesn't support - though file resources, images, script libraries, and others are all conceptually the same thing, the actual C-level storage mechanism for each is slightly different. So I ended up going the latter route, which entailed writing some gnarly code to do it in memory.

XPages Elements

For the most part, XPages and related elements (Custom Controls, themes, Java class files, and Jars) are supersets of file resources: they use the same composite-data structures and store the programmer-visible data in the same $FileData items in the destination notes. Each has an extra layer, though, in order to store the Java bytecode and other info.

Both XPages and Custom Controls share a code path that stores their compiled data into the $ClassData0, $ClassData1, $ClassSize0, and $ClassSize1 items, since they consistently have one class to represent the main page and then a second inner "Page" class to act as an internal component constructor. In addition, Custom Controls store their ".xsp-config" data in $ConfigData and $ConfigSize items in the same note in the NSF.

Java design elements are conceptually similar, but have less predictable class names, and so the code is a little more complex. There's also some special behavior here, in that there are a handful of compiled classes that show up in the compilation result that aren't directly stored in those files. I forget what those are specifically - they might be for secondary, non-public classes that appear at the top level of a Java source file but aren't inner classes.

All of these, in addition to storing their source and class names, also sprout a $ClassIndexItem item that lists the "file paths" for the classes to be used as part of the virtual filesystems that Domino and Designer use when initializing the XPages app.

LotusScript

LotusScript libraries areā€¦ special. Though LotusScript embedded in other design notes (forms, views, agents, etc.) doesn't require any special handling beyond importing the DXL, libraries stored as ".lss" files in the on-disk project aren't automatically compiled.

These libraries are brought in with their source stored as normal text items named $ScriptLib, but then need to be compiled from there. There's no mechanism for compiling LotusScript in the normal Java API, and IBM's NAPI doesn't have a binding to the NSFNoteLSCompileExt function involved, so I had to dip into Java C API bindings, initially via Karsten Lehmann's excellent Domino JNA and then switched over to Darwino's NAPI implementation.

If you look at the algorithm I'm using to compile the libraries, you may notice how brute-force it is. Any given library may depend on any given other library, but I don't have a way to know that ahead of time without parsing the code (which I don't want to do). So, in lieu of the kind of dependency graph that Designer creates when you do "Recompile all LotusScript", the ODP Compiler tries each library in turn and, if one fails, it adds it to a "try again" queue. It does this until it's had a chance to effectively try each combination, at which point it will either have a clean queue and can proceed or it'll have one or more libraries that failed to compile for a different reason. It's not pretty, but it gets the job done.

Standalone Elements

There are a handful of components of an ODP that are stored as plain files without associated DXL, generally to do with XPages support files like "xsp.properties". These have some special support in the FileResource class to auto-vivify an associated DXL file on the fly. Fortunately, these files are pretty basic to create, and the only catch was figuring out the appropriate $Flags and $FlagsExt values to fill in. For this, the OnDiskProject class has a set of matchers to match known paths to the specialized file-resource behavior needed for each.

Miscellany

Beyond just importing the ODP files into their right places, the compiler does a few other notable things.

It has the option to populate the $TemplateBuild shared field with template name and build time information, which I've found to be extremely handy. I used to have an agent in a separate DB that would update this in my template DB, and it's much nicer to have the compiler do this automatically. It's also a pleasant fit-and-finish thing.

Similarly, I used to have to remember to take a moment to make sure that the Xsp Properties file in the NTF was set to use compressed and aggregated resources, which was easy to forget. Now, I can have that happen automatically, via filtering during resource import.

Designer exports the "database.properties" file with the current full-text-index status intact, which can actually cause trouble when importing back into a new database. I had to strip that part out if present.

LotusScript compilation relies on the presence of web-service classes in the current Java runtime, which caused trouble when I added local compilation without a Domino server. I guess that this is a knock-on effect, with loading the compiler having the secondary effect of warming the JRE in case you're compiling web services.

Because I forgot that DbDirectory#createDatabase exists (or maybe it was limited, I don't remember), I ended up adding DB creation to Darwino's NAPI, which is a handy capability to have anyway.

Remaining Topics

The more I write about this, the more I find is still left worth covering. In particular, I'd like to go over the architecture of the compiler, how it's architected to run both on a remote server and via a local Equinox OSGi environment. There's also the whole matter of the ODP exporter, which is technically separate but related by workflow and a source of its own bits of arcane knowledge. So much to cover!

 

How the ODP Compiler Works, Part 2

  • Jul 1, 2019

In yesterday's post, I briefly touched on how the XPages runtime sees its environment by way of a FacesProject and related components. Today, I'd like to expand on that a bit, since it's useful to understand the various layers of what makes up an "XPages app" at compilation and runtimes.

Designer and Domino largely take two paths to try to arrive at the same location in how they view an NSF. The way Designer works is more complicated and opaque than Domino, with extra layers of VFS and an internal RPC mechanism(!) for editors, but there is at least some shared code from the XSP runtime. Beyond that, it does almost the same thing to determine the project's dependency classpath, while the internal NSF classpath is entirely distinct, using Eclipse's project structure to builds towards the different structure Domino will use.

Libraries

The notion of an XSP Library is one of the main parts of directly-shared code between the server and Designer. The way an XSP Library works is that you create a class that implements com.ibm.xsp.library.XspLibrary and then declare that as an IBM Commons extension contribution (more on that later) for the com.ibm.xsp.Library service type.

The fact that this is live code sitting in a plugin has a significant implication. Namely, anything that interprets it has to actually load the class and its dependencies. This is as opposed to just a static configuration file, which could be read without executing any custom code. For the server, the distinction doesn't matter too much, since you'll want to load all your class files anyway. For Designer, this is where we get the requirement to install libraries into Designer itself, rather than just adding plugins to the Target Platform. This is also an area that's a breeding ground for IDE bugs, since Designer needs the plugin available both internally and in the Target Platform, but they're not inherently tied together.

Though the XspLibrary implementation class is executable code, its main purpose is to point the runtime to various bits of static configuration information: the unique identifier for the library (e.g. com.ibm.xsp.extlibx.bazaar.library), lists of *.xsp-config and *-faces-config.xml files to define XSP and JSF contributions, and a list of other library IDs that this one depends on.

I believe that Designer and Domino use these bits of information slightly differently - I'm not sure that Domino cares too much about the *.xsp-config files, for example - but there's a lot of overlap here.

Configuration Files

The two main types of static configuration files used by libraries serve distinct purposes.

The *-faces-config.xml files (not required to be so named, but it's a good convention) are layered under the faces-config.xml file contained in your NSF. They define managed beans, converters, PhaseListeners, and other JSF-isms. These files come directly from the underlying JSF implementation and share the same syntax, at least until the JSF-1.2-era forking of XPages.

The *.xsp-config files look similar - they also use the <faces-config/> root element - but I believe that these are largely an XSP-specific detail. It looks like JSF 1.2 also uses the same <faces-config-extension/> tag, but to a different end - perhaps this evolution started the same way but then diverged there. In any event, these files are where Designer (and the XSP compilation process in general) looks for custom-defined components and their accessible properties. There's an interesting point to note there: though defined components are effectively beans with properties, Designer doesn't introspect the object to get its property names and types, but instead relies entirely on the definitions found in these files. It will still eventually use the component class when it goes to compile the translated XSP Java files, so they still need to be correct, but it's certainly a spot where it's easy to make a typo or mismatched property type.

I think that the latter files aren't used by the server, since their purpose is to provide the XSP source Java translator with mappings for components' XML elements to the Java classes. However, the core XPages runtime classes on the server still retain knowledge of this configuration, which is how the Bazaar and ODP Compiler do their thing. The com.ibm.xsp.registry package and sub-packages are filled with a mix of parser classes and in-memory representations, like com.ibm.xsp.registry.parse.ConfigParserImpl and com.ibm.xsp.registry.LibraryFragmentImpl.

Non-Library Contributions

Though not related to libraries, it's useful to know about a handful of XPages-specific class contributions that can come into play at runtime. These use the IBM Commons extension mechanism, like libraries themselves, but contribute to a good many different parts of the runtime and application flow. Some of these can be defined inside an NSF, while some are only recognized when defined in plugins - there's a good rundown of these on the ODA wiki. It's pretty rare to see these in the wild, but you may see an application here or there that uses these contributions, via in-NSF files like META-INF/services/com.ibm.xsp.core.events.ApplicationListener.

OSGi and Dependencies

In the early days, XPages was not OSGi-based. That came in in the 8.5.2 era (I believe - I wasn't aware enough in the 8.5.0/8.5.1 era to know the specifics) with the "extensibility API". For the most part, this lineage remains, and the XPages runtime itself isn't too dependent on OSGi, even when it comes to library contributions. Little bits have crept in here and there - the getPluginId() method in XspLibrary and the getOSGiBundle() method in ExtLibLoaderExtension, for example - but it's still largely incidental.

IBM Commons Extensions

If you've done both XPages plugin and Eclipse-the-IDE plugin development, you may have noticed that, while Eclipse plugins usually contribute to customized extension points with complicated schemas, XPages contributions all look like this:

	<extension point="com.ibm.commons.Extension">
		<service type="com.ibm.xsp.Library" class="com.example.SomeXPagesLibrary" />
	</extension>

There are still some places in Domino where you use different extension points, such as when you register a servlet with the Equinox OSGi runtime directly, but for the most part it's just this one point. This is because this extension point is designed to paper over the differences between OSGi extensions and the vanilla-Java-style ClassLoader#getResources mechanism. The type of the service you provide lines up with the META-INF/services/some.extension.type files you can use in your NSF and which still remain inside the embedded jars in the core XPages plugins.

The reason why this OSGi extension point exists is that OSGi intentionally creates separations between the individual plugins that make up your app runtime. In a "normal" web app, all of your dependency jars end up in the WEB-INF/lib sub-directory and are effectively all poured together to make a single class-loading environment. The ClassLoader#getResources route will look through all of the jars in the classpath for these META-INF/services files, but OSGi puts walls between them, and instead provides its own extension mechanism (among others, but this is the one Domino uses).

Dependency Resolution

Both Domino and Designer view the NSF like an OSGi plugin, but go about resolving the dependencies slightly differently. Fortunately, this is a case where the differences seldom crop up in practice - I've only seen some minor differences in how they honor the Export-Package directive in the bundle manifest and how fragment bundles are included.

When Designer is building an XPages app, it references the xsp.properties file to determine which XSP Libraries to include, and then uses their getPluginId() method to determine which OSGi bundle that matches up to (I think). It adds that plugin to the list of dependencies in plugin.xml and (since 9.0.1 FP10) META-INF/MANIFEST.MF. The Eclipse side of Designer then uses that to compose the Plug-in Dependencies list from those bundle IDs and any of their dependencies that are marked as re-exported. I think that Domino only cares about the generated plugin.xml/MANIFEST.MF files - I don't think that it does the resolution based on the library class, though I might be wrong about that.

ODPCompiler's Version

Currently, the ODP Compiler hews closer to the "Domino-style" route. For resolving the active class path, it trusts that the plugin.xml that exists in the ODP is correct and resolves dependencies from there. In the future, it may make sense to have the compiler generate the plugin.xml file itself, in which case it will also have to resolve the plugins based on the library classes. That wouldn't be too difficult, but for now it relies on the exported ODP.

Layer Cake

Looking at the whole XPages architecture, something that strikes me is how much it's simultaneously a giant stack of parts - config parsers, resolvers, runtime bootstrappers, and so forth - but also a pretty straightforward server-side web stack from a Java EE perspective. I've been diving deep into XPages in various ways for a long time now - building complex apps, writing library plugins, and even yanking the runtime out of Domino - yet writing the compiler led to this whole distinct set of capabilities. But a lot of this is essentially "just" ahead-of-time work, with Designer and the ODP Compiler's jobs being a lot of world-resolution followed by placing compiled pieces into the right places in the NSF.

By the time it gets to the NSF, it actually ends up as a pretty normal-style web app - XPages are just Java classes floating around, non-OSGi dependency jars are in WEB-INF/lib, and the faces-config.xml controls rendering in the same way as in JSF. A lot of that, though, will come up in later posts, where I go into the gotchas involved in taking these compilation results and other ODP resources and actually getting them into an NSF.

How the ODP Compiler Works, Part 1

  • Jun 30, 2019

A year ago, I started a project to compile NSFs - and particularly large XPages projects - independently of Designer. Since the initial releases, the project has grown, gaining an ODP exporter, extra Eclipse UI integration, and the ability to run without installing components on a remote server. It's become an integral part of my workflow with several projects, where I include the NSFs as part of a large Maven build alongside OSGi plugins and a final distribution.

Building this tooling required learning a lot about the internals of XPages, the specifics of how various design elements are stored and handled in an NSF, and miscellaneous bits about Equinox and Maven. Since there's a good amount of arcane knowledge embedded in the project, I think it'll be helpful to take some time to dive deep into what's going on, starting with XPages.

XSP to Java to Bytecode

The first challenge for me to overcome was how to go from XPages XML source to the Java class files (like those seen in the "Local" source folder in Designer) and finally to compiled Java bytecode. Much like in Designer's process, the middle part is just incidental: only the XSP source and the bytecode are actually stored in the NSF.

The official XSP -> Java compiler exists only in Designer, and so one route would be to try to get those plugins working on Domino. I think that'd work, but it'd be a huge hassle and, fortunately, an unnecessary one. The XPages Bazaar project contains essentially a clean re-implementation of the glue code required to coax the runtime into emitting what I needed. I used the Bazaar as an incubator for the early versions of the compiler, and tweaked the core code with additions and fixes to work with various needs I ran into.

OSGi Bundles

On its own, the Bazaar did the job well of taking an XPage and compiling it with whatever the surrounding environment had. However, to compile a full XPages app as part of a Maven build, I'd need the ability to dynamically load XPages libraries and dependencies on the fly.

To do this, I added the option to include an update site directory, and then I have the compiler stream through all the plugins and initialize them. Fortunately, the OSGi environment makes this pretty easy. The BundleContext object that's available from each OSGi bundle has an installBundle method you can use by pointing at a bundle file URL, and then you can call start() on that result to actually initialize it. I had to do a little extra work to account for bundles that shouldn't be started (source-only bundles and "fragments", which are additions onto normal plugins) and the like, but it's not too complicated.

XSP Libraries

Just installing the OSGi plugins isn't enough to get the XPages runtime to know about any libraries that may be included, however. This is done by finding all of the library extension contributors, sorting them for compatibility, wrapping them in a com.ibm.xsp.library.LibraryWrapper for some reason, wrapping that in a com.ibm.xsp.registry.config.SimpleRegistryProvider, and then adding the results of that to an in-memory com.ibm.xsp.registry.FacesSharableRegistry instance.

This is one of those cases where the actual code involved in the end isn't terribly long, but the amount of delving into the framework to figure out the needed parts was immense. In essence, each XPages application is represented as a FacesProject implementation object, which retains a registry of the libraries it knows about. In a normal running server, this happens automatically during initialization: the runtime opens the NSF, figures out which libraries it needs, finds them from the ones it knows about, and constructs the app's contained world. For the compiler, I ended up having to do something of an ad-hoc version of this as it goes, finding all the parts that need to be kicked to notice the available libraries and have them around to map something like <abc:someCustomComponent/> to an instance of com.abc.xsp.CustomComponent.

Custom Controls

Custom controls are a similar story, but they use a specialized variant of the "real" library system called LibraryFragment. The way these work, before the actual XSP -> Java compilation step, the compiler reads their definitions and adds them to the in-memory FacesProject. It's important to define them all in this way before actually trying to interpret their source (or any XPages), because this allows for the interpreter to understand a reference to one CC from another. Fortunately, the process is separated enough that the definitions can all happen before any of the Java classes actually exist. Otherwise, I would have had to either try to make a dependency graph of which CC references which other, or just keep trying to compile them repeatedly until it got the right order by brute force. The latter process will actually show up in a future blog post.

Java Source Files

Compiling individual Java source files - both those that show up in the Code/Java part of an NSF (or a custom source folder) as well as the translated XSP source - is handled by the Bazaar, in a class called JavaSourceClassLoader. This class wraps around the in-memory Java compilation capabilities that were added to the JDK in Java 1.6. In particular, this class, paired with SourceFileManager, provides knowledge to resolve classes from OSGi bundles in the active environment, including specialized knowledge of dependency management based on re-exported dependencies, embedded jars, and other OSGi-isms that play a big part in XPages libraries.

In essence, these classes provide a similar compilation environment to what Designer does when it creates the Eclipse project for compiling an NSF. They build up an environment with knowledge of all the dependencies that, in Designer, show up in "Plug-in Dependencies", and then the compiler feeds it all of the Java source files in the project. With all of this environment provided, the underlying Java compiler is able to do its job of converting them to bytecode en masse, which is then passed back to the compiler for insertion into the NSF.

Other Big Components

In the next blogs posts in this series, I'll go over some of the other big hurdles I had to overcome to get everything working properly: namely, figuring out all of the specialized behavior necessary to flag imported/created design notes properly and then the arcane incantations necessary to get this OSGi environment working outside of the Domino server.

Seeing the Familiar in SwiftUI and Combine

  • Jun 11, 2019

Apple announced quite a bit at WWDC last week, and some of the stealth favorites for programmers have turned out to be SwiftUI and Combine. My macOS and iOS development is incidental at most, but I like to pay attention to this stuff, and I think that these frameworks in particular are notable for how they reflect attributes of other languages and frameworks that I do use.

SwiftUI

Before beginning, I should point out that the best source of information about SwiftUI at the moment is Apple's WWDC video archive, in particular "Introducing SwiftUI" and "SwiftUI Essentials".

SwiftUI is generally summed up as a "declarative UI framework", with "declarative" here primarily setting it apart from imperatively-defined UI. Truth be told, a mix of both has been common in both Cocoa development and elsewhere for a very long time - the definition of UI layout in .nib files comes from the old NeXT days, for example. The things that set this apart are some technical improvements and a switch of focus from your program creating and managing a UI to instead your program defining what it wants the UI to be and letting the framework handle it.

Declaring the UI

The best way to think of the way this works is that, instead of actively calling methods and setting properties on a window, some buttons, etc. on the screen, what your code does in SwiftUI is instead create a work order for what it wants the current state of the UI to be and hands that to the framework to figure out what needs to happen to get there. That "what needs to happen" can range from initially creating the layout to figuring the right way to find the difference between the new states, removing old elements, adding new ones, and animating transitions between them.

The most immediate analogue for this in common use today is React, which has essentially the same model. In (presumably) both SwiftUI and React, your job as a programmer is to have a method on a component object that is called frequently to emit what it thinks its contents should be right now. So if you have, for example, an array of objects that's displayed as an unordered list, you'll have a render() method that looks like:

render() {
    return (
      <ul>
        {this.props.items.map(item => (
          <li key={item.id}>{item.text}</li>
        ))}
      </ul>
    );
  }

In SwiftUI, a cut-down version would look like:

var body: some View {
  List(model.items) {
    Text(item.text)
  }
}

And, as I mentioned, the core concepts aren't new. In core XPages, we'd write something very similar:

<xp:repeat value="#{items}" var="item">
	<xp:this.facets>
    	<xp:text xp:key="header" contentType="HTML" value="&lt;ul&gt;"/>
    	<xp:text xp:key="footer" contentType="HTML" value="&lt;/ul&gt;"/>
  </xp:this.facets>
  
  <li><xp:text value="#{item.text}"/></li>
</xp:repeat>

However, the last one differs in a number of ways, not the least of which is that the UI-generation part doesn't retain a concept of shifting state. You, the programmer, may know that the items list may grow or shrink by a value, but, in the XPages model, the browser just starts with one block of HTML and replaces it with another on a refresh. Internally on both server and browser, I'm sure there's some retained memory for efficiency's sake, but a removed list item won't, for example, know to gracefully fade out and let the remaining ones shift into its place.

SwiftUI drives this distinction home by preferring UI component definitions to be "structs". Java doesn't currently have an analogue to structs, but they come from C and they're effectively a "pure data" type. The distinction between classes and structs gets a little murky in Swift, but the core idea is that structs are meant to show pure data in a given state and are copied when passed around. That relates here because your SwiftUI component's job is not to be the UI, but rather to emit in-memory blueprints for what it wants given the current state of the data. When data changes, the framework asks the component for a new representation, compares the old one and new one in memory, and then makes changes to the UI representation as appropriate.

It's kind of an odd distinction to write out, but I found that there was a point when I was learning React when it "clicked" in my head.

Separation of UI Definition and Output

Beyond the simplicity of declaring the UI, one of the big things that the SwiftUI presentations drive home is how one UI definition can be used across all of Apple's platforms. A list is a list is a list, and the fact that it looks one way on a watch and another way on a TV isn't inherently important.

Seeing this made me feel pretty good, since it reminded me of a post a wrote years ago that dealt with the component/renderer split in XPages at a conceptual level. Though renderers in XPages don't go so far as spitting out an AppKit application at runtime, the concept of an abstractly-defined UI that is then rendered differently based on the target is very important.

What remains to be seen with SwiftUI is how clean this distinction can remain. In my XPage example, I used some of the ExtLib's semantic form components, but real XPage applications are almost always mucked up with inline CSS classes and styles, client and server JavaScript, meaningless HTML tags like <div>, and so forth. It's one thing to say "oh, this just means a checkbox on macOS and a toggle on iOS", but another to handle a complex, crafted user interface that may be radically different in different contexts.

DSLs

SwiftUI is written in the full Swift language, but it's best described as a domain-specific language written in Swift. The term "DSL" originally meant a whole language dedicated to a small task, but nowadays usually refers instead to a style of writing an API that, when paired with a general-purpose language, acts like it's a language dedicated to the task. This is usually a feature of "scripting"-type languages, where the syntax is clean and flexible enough to not interfere.

In the Java world, the go-to language for this has long been Groovy. Darwino uses Groovy for the database adapter DSL, and SmartNSF does as well for defining services. The most common use nowadays is probably Gradle, the Maven-competing build system. It uses Groovy's closures and parentheses-less method calls to make a build script that looks more like a configuration file than a script:

plugins {
    id 'java'
    id 'application'
}

repositories {
    jcenter() 
}

dependencies {
    implementation 'com.google.guava:guava:26.0-jre' 

    testImplementation 'junit:junit:4.12' 
}

mainClassName = 'demo.App' 

The key reason why these DSLs are useful (other than not having to write a new parser) is that you have the full abilities of the underlying language at your disposal. In SwiftUI, you can do your "hide-when" logic by using the normal old if structure, rather than having a specialized "when should this show up?" property. That also means that you can bring in whatever other logic you have, third-party libraries, and so forth, without having to have specialized support in the API.

Combine

Combine, in addition to having an ominous name, is a less-flashy addition than SwiftUI, but is nonetheless important and also shows the integration of growing themes in programming elsewhere, specifically reactive programming. It's one of those topics where you can very easily fall off a conceptual cliff, and I found even Apple's introductory session to make it sound more daunting than it is by focusing on the specific Swift protocols in practice.

I've found that the best way to learn something like this is to set aside the "asynchronous" aspect at first to focus on the "data flow" part. For this, we're in luck, since this is what Java 8 streams are. Streams are all about starting with some source of data - often just a List implementation, but it's intentionally arbitrary - and changing, filtering, sorting, and otherwise manipulating it to get a result. So a prototypical example from Java can be:

Stream.of(10, 1, -3, 5)
	.filter(i -> i > 0)                  // [10, 1, 5]
	.map(i -> i * 2)                     // [20, 2, 10]
	.sorted()                            // [2, 10, 20]
	.map(String::valueOf)                // ["2", "10", "20"]
	.collect(Collectors.joining(", "));  // "2, 10, 20"

Most of the time in practice, the actual implementation will pretty much match the sequential reading of this path and so will be roughly similar to if you wrote it out with for loops, if statements, and so forth. However, because your code is describing what you want done rather than how to do it, there's room here for short-circuits, multithreading, and optimizations for different collection types.

If you look at a snippet of an example from Combine, you can see near-identical syntax in Swift, with the same idiomatic indentation:

let p4 = p1
	.merge(with: p2)
	.append(5) // add 5 to the end of the sequence
	.allSatisfy { $0 >= 1 } // check if all values are bigger than 0
	.count() // how many values: 1

Importantly, the same style of syntax applies when the incoming data is asynchronous and when the final output is similarly out-of-band. This is what all of Apple's Combine examples start with, which is why I think it can seem a bit daunting. But I think the only real switch to make in your head is to slot in the term "Publisher" for the starting provider of your data (originally an array here, but it could be a keyboard or network resource) and "Subscriber" for the code that deals with it.

Admittedly, things can get more complicated there when you need to add in error handling, value binding, and other practical considerations, but the simplicity of the core concept remains.

Overall

For Domino needs specifically and web development generally, SwiftUI and Combine don't themselves matter much. Still, I think it's useful to take times like this to see larger trends and, when you can, bask in the satisfaction of having seen the concepts elsewhere.

My Slides From Engage 2019 - De04. Java With Domino After XPages

  • May 20, 2019

Engage 2019 has come and gone, and I had an excellent time. I also quite enjoyed presenting my "group therapy" session on some options that XPages developers have for the future. In a lot of ways, it was similar to my presentation at CollabSphere last year, mixed with the various new developments I've talked about on here since then:

Engage 2019

  • May 6, 2019

Engage is just around the corner in Brussels, and I'll be joining everyone there, presentation in hand. Specifically:

Dev04. Java With Domino After XPages
Tuesday, May 14 at 16:00 in room E. Mahy

XPages guided Domino web development out of a world of archaic proprietary hacks and into the realm of Java server development and something approximating Java EE. Now that the XPages framework is moribund, the question is: what's next? If you've built up Java skills over the years, you have a direct path to use them in the new modern world, whether via OSGi plugins and new XPages extensions on Domino or standalone Java web projects. This session will discuss the lessons we learned from XPages, how they correspond to newer technologies, and how to bring your existing apps and plugins forward.

If you attended my session CollabSphere last year (by the way, you should go to CollabSphere this year too), you may note that the title is pretty similar to that one, albeit changing "In Domino" to "With Domino". This massive change in preposition reflects how things have only gotten more awkward for us in the intervening year.

So, if you're attending Engage, join me as we work through our sorrows together!

 

Java Grab Bag 2

  • May 3, 2019

Following in the vein of "Java Hiccups", I've had a couple things floating around my head lately that I think collectively make for a good post for Java developers, particularly those working in the Domino arena.

Without further ado:

Map#computeIfAbsent

This is a method that was added in Java 8 and, while it's not as big a deal as the addition of streams, it's one of my favorite additions and something I use very frequently. To give a point of reference, consider this common idiom from an imagined XPages app:

Map<String, Object> applicationScope = ExtLibUtil.getApplicationScope();
if(!applicationScope.containsKey("someVal")) {
  applicationScope.put("someVal", someExpensiveOperation());
}
String someVal = (String)applicationScope.get("someVal");

Essentially, using a Map as a cache for a complex computed value. Java 8 added the #computeIfAbsent method (alongside several similar ones) to do this in one go:

String someVal = (String)ExtLibUtil.getApplicationScope().computeIfAbsent("someVal", key -> someExpensiveOperation());

The second parameter is (usually) a lambda, like the ones used in streams, that takes the provided key as an argument and is only executed if the value does not already exist. Due to the way this was added, most implementations do pretty much the same thing as the first block of code, but you don't have to care about that. Your code gets a bit smaller, the intent is much clearer, and it's less prone to small bugs like changing the key and forgetting to change it in all three places.

Arrays Are Weird

Java's built-in array type is loosely based on C's, and that's reflected in the syntax:

int[] foo = new int[4];
foo[0] = 1;
foo[1] = 2;
foo[2] = 3;
foo[3] = 4;

Like C, they are zero-based, declared with the capacity and not the max index, and cannot be resized. Unlike C, arrays aren't just syntactical sugar on top of pointers, and this manifests immediately in bounds checking. Take this line:

foo[4] = 10;

In C, this will (famously) just write an integer 10 value into whatever memory happens to be just beyond the bounds of your array. In Java, you'll get an ArrayIndexOutOfBoundsException, saving you from the insidious bug. But, since Java arrays are (probably) implemented internally very similar to in C - they're likely contiguous blocks of memory sized to the type - they're still extremely efficient, and so they show up in a lot of speed-critical code.

As speedy and safe as they are, though, they're still pretty unfriendly. For starters, they can't be resized. When you do new int[4] (or use literal syntax like new int[] { 1, 2, 3, 4 }), you carve out that much memory and can't shrink or expand it in-place. You can change the values inside an array, just not its size. To "resize" efficiently, you have to make a new array and then use System.arraycopy to populate the new array with the contents of the old.

This is all why the List interface (with its predecessor class Vector) exists: they serve the same function of "ordered collection of stuff", but allow for dynamic resizing. Because these objects are usually "efficient enough" (ArrayList uses "true" arrays under the covers) while having numerous additional benefits, you should use them as your first go-to and only use arrays if you have a reason.

That's in part because the weirdness of arrays doesn't end with their inconvenience. Array types are actually implicitly-created classes, even when they contain primitive types. So:

int foo = 3; // primitive value
foo = null; // syntax error!
int[] bar = new int[] { 3 }; // Object
bar = null; // legal!

List<int> fooList = new ArrayList<>(); // syntax error - primitives can't be in generics
List<int[]> barList = new ArrayList<>(); // legal!

Object.class == Object[].class; // false
int[].class.isArray(); // not only legal, but true

Java provides two main utility classes for working with arrays: java.util.Arrays (extremely useful for Arrays.asList) and java.lang.reflect.Array (usually only useful in edge cases).

Java Has No Library Versioning System

If you've worked with Java in Designer or Eclipse, you've likely run across this preferences pane or its per-project version:

Eclipse Java compiler settings

These settings affect two things:

  • The syntax allowed in your source (e.g. new ArrayList<>() requires 1.7 or higher)
  • The class file format version (you can think of this like an NSF ODS). You've likely seen the latter in play by receiving an UnsupportedClassVersionError trying to run Java 7 or 8 code on a pre-9.0.1FP8 Domino server.

Conspicuously absent from this short list is anything to do with classes or methods added to the runtime in newer versions. For example, the String class gained a static method String.join to conveniently concatenate strings with a given delimiter. If you're targeting an older Java version but using a newer Java library (as Designer 9.0.1FP10+ does with a default target of 1.5 and JVM of 1.8), you can write a line of code using that method without issue - the syntax doesn't require anything above 1.5, so all is clear as far as the compiler is concerned. But if you then try to run that code on an older JVM (such as an older Domino server), you'll get an exception at runtime, since the method doesn't exist.

Unfortunately, the only true answer to this is to tell your IDE about a JRE for each specific Java version you're targeting, something that doesn't happen by default, and which will gradually get more difficult as Java 6 becomes harder and harder to come across.

This is one of the things that OSGi aims to fix - you could, for example, have many versions of Guava installed, and you could declare that your plugin works specifically with version 18. Then, when loading, the runtime will either bind to a matching version or give you an error that no version could be resolved. No mystery involved. Unfortunately, OSGi is a niche thing losing ground, and the module system introduced in Java 9 consciously does not address this.

In a pinch, you can use the file tool on most Unix systems to check the version of an individual class file:

$ file Foo.class
Foo.class: compiled Java class data, version 52.0 (Java 1.8)

Licensing

A little while ago, Oracle raised a bit of a stink by declaring that, as of this year, commercial use of their Java runtimes would require paid licensing. Historically, you could get support for Java for money, and certain additional components had their own licensing requirements, but it was pretty normal otherwise to install Java from java.sun.com and not give it a second thought.

This naturally caused a few questions when it comes to Domino and other Java-incorporating IBM products, and IBM released a statement that basically amounted to "you don't have to worry about it". IBM has maintained their own variant of the JVM (called J9 for Smalltalk-related reasons, not to be confused with Java 9) and anyway has always had arrangements with Sun/Oracle such that IBM's customers don't have to worry about dealing with Oracle directly.

But what about using Java outside of a licensed product, such as if you just want to run Tomcat on some server? The short answer there is that you're still fine, but you just have to know a little about the difference between "a JDK" and "Oracle's JDK". Java and the surrounding JDK have been progressively open-sourced in fits and spurts over the years, and are now at a point where the project called OpenJDK is basically the real Java environment, and then Oracle's JDK is just one implementation of it. It's similar to Linux: the core parts are open-source, and many distributions are entirely free, but there also exist commercial variants for pay.

So, if you want to run a Java stack, you can do so without putting forth a single cent by using an OpenJDK build. Oracle, IBM, and others will still be happy to take your money if you want a commercially-supported Java environment, of course.

For a longer explanation, this blog post from @javachampions is pretty much the definitive word.

 

Bitwise Operators

  • Apr 26, 2019

In the "Java Hiccups" post, I briefly mentioned this little tidbit:

(short)Integer.MAX_VALUE == -1   // true

This fact betrays a lot about how numbers are stored internally in most languages. Some of those details, like the fact that the specific method is called two's complement, are fully in the range of "computer science" stuff - but it can be handy to know about a couple ways you can put this sort of thing to use.

For our purposes, let's work with the short data type in Java, because it's painfully important to Notes. "Short" is called such because it's smaller than a standard integer type in a given language. In some languages, like C, the length of core data types like this is sometimes tied to the bitness of the processor, but in Java they're consistent across everything. Specifically, short in Java is 16 bits long. Let's take the number 21,038, which is represented in binary as:

0101 0010 0010 1110

That's two bytes, and each byte is broken up into two groups of four bits each because it turns out to be helpful visually and because each "nibble" there can be matched to a single hexadecimal character (522E in this case). This is the most common way you'll see binary written when you start getting into this sort of thing.

So I mentioned casting in the previous post, and what casting does with primitive integer types like this is to chop off the bits that don't fit. If you cast the above value into a byte, it'll chop off all but the ending 8 bits, giving you:

0010 1110

Which is 46. If you cast it to a larger type, such as int (32 bits in Java), it just adds some zeros to the front:

0000 0000 0000 0000 0101 0010 0010 1110

Which is still 21,038, but now it takes up twice as much memory.

For our uses, this stuff matters immediately in two ways: it represents the limits of some data storage and it allows us to manipulate numbers with bitwise operators.

Storage Limits

If you max out the bits in a 16-bit integer, you get 1111 1111 1111 1111, and this value is either -1 or 65,535 (64k) depending on whether or not the value is considered "signed" by the code using it. When it's "unsigned", a 16-bit integer can represent between 0 and 65,535; when it's "signed", its range is -32,768 to 32,767 (32k). Java only has "signed" types, which is why Short.MAX_VALUE is 32k and not 64k. C, on the other hand, lets you choose in your code which type you want to use.

The Notes C API uses a couple type names to represent consistent sizes across different platforms, and the one that's important here is WORD: a 16-bit unsigned integer. This is used in, for example, the function used to set the value of a text item in a note:

NSFItemSetText(
  NOTEHANDLE  hNote,
  const char far *ItemName,
  const char far *ItemText,
  WORD  TextLength)

That WORD at the end there is where you tell the API how much text you're providing, and is thus capped at 64k - and is why you can store 60k of text in a non-summary text field but not 70k. As to why summary data is limited to 32k and not 64k, I'm guessing that either there's a signed value in there somewhere or that last bit was needed for something else.

If you look over the list of Notes limits, you can see this reflected all over the place, along with a few 8-bit (0-255) limits for good measure.

Bitwise Operators (for real this time)

The term "bitwise" refers to a handful of operators that deal directly on the bits of a number. Java takes a cue from C here and provides &, |, ^, ~, <<, >>, and >>>. These all have their uses, primarily for really-low-level stuff and efficiency, but we mainly care about & and |. These two may have bitten you in the past because of their similarity to && and || - and in particular because Formula Language uses the single-character versions to mean what Java means by the double-character ones. They also kind of work the same way, but aren't really the same.

To see how they work, let's take our original starting number, 21,038, and pair it up with another value, 5,000:

0101 0010 0010 1110
0001 0011 1000 1000

Visualizing the numbers in binary and stacked like this comes in extremely handy when it comes to dealing with bitwise operators. We'll start with &, or "bitwise AND". What this operator does is take two numbers and return a number where the matching slots in each one both contain a 1. So, in this case, it results in this internal math:

0101 0010 0010 1110
0001 0011 1000 1000
===================
0001 0010 0000 1000

Which corresponds to 4,616.

The | operator, or "bitwise OR", will provide a result where either of the original numbers has a 1 in the slot. In this case:

0101 0010 0010 1110
0001 0011 1000 1000
===================
0101 0011 1010 1110

Or 21,422.

It's pretty rare that you want these operators for the actual numerical values, though. What they're pretty much always used for are "bit fields", an extremely-efficient way to store a set of related flags. Going back to the Notes API, the NSFItemAppend function (a generic way to add an item value) has a parameter WORD Flags, that matches up to a number of properties that you can assign to an item. In the C source, they're written as hexadecimal values, but I've added in the binary versions here:

#define	ITEM_SIGN          0x0001   // 0000 0000 0000 0001
#define	ITEM_SEAL          0x0002   // 0000 0000 0000 0010
#define	ITEM_SUMMARY       0x0004   // 0000 0000 0000 0100
#define	ITEM_READWRITERS   0x0020   // 0000 0000 0010 0000
#define	ITEM_NAMES         0x0040   // 0000 0000 0100 0000
#define	ITEM_PLACEHOLDER   0x0100   // 0000 0001 0000 0000
#define	ITEM_PROTECTED     0x0200   // 0000 0010 0000 0000
#define	ITEM_READERS       0x0400   // 0000 0100 0000 0000
#define ITEM_UNCHANGED     0x1000   // 0001 0000 0000 0000

So you can see that each flag there has a distinct slot filled in that each other doesn't (you can also see, like missing teeth, the bits that are probably obsolete or undocumented features). If you want to create an item that is a signed, sealed, summary, readers item, you'd do this:

WORD flags = ITEM_SIGN | ITEM_SEAL | ITEM_SUMMARY | ITEM_READERS

…which will result in this bit of internal math:

0000 0000 0000 0001
0000 0000 0000 0010
0000 0000 0000 0100
0000 0100 0000 0000
===================
0000 0100 0000 0111

This is extremely efficient, both because you can store a bunch of information in just 16 bits, but also because working with these bit fields is baked into processors at the lowest level. There's not much you can do on a computer that's faster, in fact.

When you have a value like this, you can query it for whether or not a given value is set by using the AND operator:

flags & ITEM_SUMMARY

This will result in:

0000 0000 0000 0100

…which is the same as the value of ITEM_SUMMARY itself, since that one slot is the only bit in common. Conversely, if you check:

flags & ITEM_PROTECTED

…you'll end up with zero, since the flag doesn't match.

Use in Java

Most of the time, you don't have to think too much about the bits that make up numbers in Java, but they creep up from time to time. Generally, the main situations where they arise are when you're dealing with a low-level API or with code written by someone who spent a lot of time with C and deeply internalized its efficiencies.

For an example of the former, take a look at the com.ibm.designer.domino.napi.NotesNoteItem class in IBM's NAPI. It contained a method called int getFlags(), which returns exactly the value we were talking about above. If you get your hands on a summary item this way, you can do this to tell if it's a summary item:

(item.getFlags() & 0x0004) != 0

Note the extra != 0. While the operation involved is the same as in C, C lets you treat any non-zero integer value as a boolean true, while Java forces you to perform an actual boolean operation.

For an example of the latter, turn your eyes to the DefaultColumnDef class used in xe:dynamicViewPanel generation and customization, a snippet of which is:

public static class DefaultColumnDef implements ColumnDef {
      public static final int FLAG_HIDDEN         = 0x000001;
      public static final int FLAG_LINK           = 0x000002;
      public static final int FLAG_ONCLICK        = 0x000004;

      public int flags;
  
      public boolean isHidden() {
          return (flags&FLAG_HIDDEN)!=0;
      }
      public boolean isLink() {
          return (flags&FLAG_LINK)!=0;
      }
      public boolean isOnClick() {
          return (flags&FLAG_ONCLICK)!=0;
      }
}

This could also be represented as a bunch of boolean properties on the object or, most idiomatically, an EnumSet, but using a bitmask is slightly more space- and CPU-cycle-efficient. Slightly. The difference isn't enough to even think about normally, but can become worth it in cases where the code is called so often that tiny efficiencies can make a difference.

This whole topic is the sort of thing that is normally so many layers of abstraction below where we work that it's basically invisible, but it's definitely good to at least know the basics for the times when it does come up.

XPages on Android

  • Apr 15, 2019

Around the start of the year, I had a bit of a dalliance with the idea of running XPages outside Domino. The upshot of that project is that it is indeed possible to do so, but there'd be some work to do to make it practical.

This month, we revisited the idea with a healthy dose of Darwino to provide some undergirding technology, with the goal of being able to run a plain old XPages app on mobile devices backed by Darwino DBs and replicating back to Domino. There was a lot of fiddling involved, but it works:

Android is the natural first target, but iOS is about 70% there, with the scaffolding loading up to the point where it loads a page but currently with a bit of trouble when it comes to resolving data classes and executing renderers.

What Specifically Is Going On?

We set out a few required parameters to call it a successful proof-of-concept:

  • It has to use the actual XPages framework - that is to say, the jar files shipped with Notes/Domino
  • The XPages themselves have to be shared among Domino and the Darwino app, in the form of their Java "intermediate" source (compiling from .xsp source is possible but is a whole other thing)
  • It has to use xp:dominoView and xp:dominoDocument data sources
  • It has to load the Extension Library
  • It has to use ancillary elements from the app: managed beans, themes, CSS, images, SSJS libraries

This checks all of those boxes, and it's pretty satisfying to see in action.

The Stumbling Blocks

As I discussed in my post about the original project, there are aspects of XPages that are meant to make this sort of thing possible, with the core parts abstracting out different platforms and runtime environments. Over the years, though, assumptions about Domino and OSGi crept in, with newer additions and the Extension Library taking a bit less care to be environment-neutral.

Moreover, XPages is not open source and I don't have any particular special access to it, making this whole thing essentially, in the video-game sense, hard mode. There are a handful of classes that needed to be outright swapped out, like the annoyingly-final NotesContext class, but there was much less of that than I'd thought.

After that, getting the data to point to Darwino was pleasantly straightforward. Other than a handful of areas where the NAPI comes in, the Domino data sources largely adhere to the rules of the lotus.domino interfaces and don't make assumptions about the implementing classes (this is essentially how ODA shims in as well).

What This Could Be Useful For

With the right fleshing out, this could be a real way to run existing XPages apps on mobile devices. That could be pretty useful on its own, but I don't think that carrying forward XPages apps as-is is the right idea. The side effect of this, though, is that you have a functioning XPages app in a normal old .war file project, structured with Maven or Gradle, and ready to be molded into newer frameworks using whatever tooling you'd like. No Designer, no OSGi, no Servlet 2.4/2.5, just a clean basis running your existing logic and ready to be improved.

If the mountain of existing XPages code is going to have a future, I think it should be something like this.

Implementing Cluster Replication From Domino to Darwino

  • Mar 19, 2019

Since its inception, Darwino has had two-way replication between it and Domino, and it's evolved over the years in fidelity and configurability. Recently, I was able to check an item off the to-do list that I've wanted for a while: "cluster-style" replication from Domino, where a document change immediately kicks off replication to Darwino.

Component #1: Extension Manager

Fortunately, the implementation is straightforward in concept: the Extension Manager has been in there since 4.0 and provides exactly the hooks one would need to implement this. The trouble with the Extension Manager, though, is that you can only subscribe to events from within an ExtMgr addin, and that means native code. You can't just hook into it from, say, an OSGi plugin.

My first thought was to use DOTS, which created this exact sort of bridge years ago. However, it's critically limited: the EM ferrying was never really fleshed out that much, and nor will it ever be, since it's not a supported project. It was kind-of-sort-of supported in the 9.x era for "social" purposes, but those days are behind us. Moreover, its separate OSGi environment wouldn't suit Darwino's needs particularly well.

The DOTS dynamic library, though, could still be potentially useful. Nathan Freeman came across this a couple of years ago: due to the way the DOTS dylib ferries the events from the ExtMgr world to DOTS, the channel is actually consumable by anything, not just DOTS specifically. My initial implementation did exactly this: it fed from the fountain of messages produced by the DOTS dylib for its own ends.

However, the core trouble still remains that DOTS isn't supported as such, and it has too many moving parts for us to want to take on as a dependency. Moreover, the "siphoning" only works if there's only one subscriber listening - on a server that also runs ODA (which Darwino does not use), you have the two consumers contending for messages, which is a recipe for missed events.

So I decided to write a custom-made ExtMgr addin, which would have the advantages of being much smaller and easier to maintain, feeding a different queue, and being a fun learning opportunity for me. The last part isn't as important to Darwino-the-product per se, but I always like when it lines up like that.

Component #2: Message Queues

The way DOTS and this new addin do their things is to use Message Queues, another technology presumably-not-coincidentally added to Domino in R4. The way these work is that you create a named queue (DOTS's, for example, is named MQ$DOTS) and then feed it strings, which are then consumed by anything running in the same Notes/Domino environment by requesting the queue of the same name and waiting for messages. It's pretty simple both in theory and in execution, with the minor problem that the API isn't officially available from Java.

Fortunately for ODA's (and presumably DOTS's) use, though it's not part of the official API, there is a lotus.notes.internal.MessageQueue class that is a (shockingly-thin) wrapper around the MQ* functions in the C API. It's functional, and the thinness of the wrapper means that the method parameters, though unnamed in the bytecode as usual, are clear matches for the equivalent C parameters.

I initially started using this, but ended up writing a nicer wrapper in Darwino's NAPI that implements BlockingQueue, making consumption in Java much clearer.

Component #3: The Listener

The final piece was the most comfortable, since it's entirely back in the warm embrace of Java: I wrote a class to listen for events coming through this queue, extract the database name, check for replicators configured for that NSF, and immediately kick any applicable ones off.

The End Result

Since the overhead of the work in a small replication is so minor, the end result is effectively the same as cluster replication between Domino servers: the change/creation/deletion is propagated over within a couple milliseconds (for small documents, due to, you know, physics). It's pretty satisfying to see in action.

Pub/Sub

If you've been following HCL's announcements lately and feel like this sounds very similar to the pub/sub support they've slated for V11, you're right. My guess is that they wanted to get Elastic Search working with low latency and (kindly and wisely) decided to turn the work required into a nicer interface for the same EM events we're using. It's a good feature and, assuming it's consumable from local Java, would have made my work here easier, but I didn't want to wait, and we target a couple Domino releases back anyway.

Anatomy of a Clean Open-Source Project

  • Mar 18, 2019

Over the years, initially thanks to Peter Tanner's diligent work as the OpenNTF IP Manager and now my own occupation of that seat, I've learned to really appreciate the virtues of dotting your "i"s and crossing your "t"s when it comes to making an open-source project legally clean and clear.

It's definitely something I underrated early on, though - caring about the specific differences between licenses and, in particular, maintaining things like per-file license/copyright headers felt like annoying busywork. For a project that only you will ever use, it technically is, but the hope of open source is that you'll get other people using your work and, ideally, contributing back in turn, and that's when it's important to make sure you have everything sorted out.

Why Bother?

Well, for one, you or your users could theoretically be sued or otherwise legally entangled if you don't keep track of this stuff. Admittedly, it's fairly unlikely, but the consequence of, for example, unknowingly including GPL software in your proprietary product is potentially significant.

It's for that sort of reason that it's important to make sure everything is clean before some large corporations will risk even looking at your project. IBM is particularly good about this because they were significantly burned in the past, and came out of it with extremely-strict view, and that rubbed off on OpenNTF both culturally and with their gracious technical assistance along the way.

And, since large consumers require this sort of vetting, it's also important to know how to do it if you want to contribute code to an open-source organization like OpenNTF, Eclipse, or Apache.

It's also surprisingly satisfying once you get into the swing of it, I've found.

The Example Project

Since I've been spending a lot of time recently with the NSF ODP Tooling project, we'll look at its GitHub repository.

Common Files

There are a couple common features that tend to show up, and which both people and tooling (like GitHub's license identifier) look for:

  • The LICENSE file, which is the most critical. This contains the text of the license you're using, as well as one of the declarations of the copyright year (though, admittedly, it's easy to forget to include that part). This is what declares the effective license for the code in the repository that you own the copyright to, and should be included right from the start if possible.

  • The NOTICE file, which is vital if you're including any code from any sources not covered under the main copyright. This file should list all of the third-party code you have included in the repository, its license type, and, if possible, where to acquire it. If your project's distributable form includes additional third-party code not included in the repository (such as Maven or npm dependencies), these should be enumerated here as well

    • Writing this file has an important side effect in that it forces you to account for the licenses of your dependencies. More than once, I've run into a situation where I found that a common dependency had an incompatible license (such as the pure GPL). In some cases, this has meant abandoning the dependency outright, while in others it has meant finding a better-licensed alternative. Eclipse Orbit exists in large part for this purpose.
  • A legal directory containing any additional license/redistribution information not covered by the NOTICE. This can also sometimes take the form of files like NOTICE-Weld in the root of a project, and is useful for mass-including copyright/notice information from third parties in their original form.

In addition to including these files in the project repository, you should also make sure to include them in any binary distributions you make. In my projects, this takes the shape of inclusions in a Maven Assembly Plugin packaging file.

File Headers

I originally chafed against the idea of per-file copyright/license headers. They're not strictly necessary when the files are included in the original repository, they're redundant, you end up with massive commits touching hundreds of files just to change a year, and they can dwarf the size of the actual code they're copyrighting.

However, I've really come around to the practice of including them, and the main reason is that it makes the files easier for others to copy and use legally. It's one thing when someone finds their way to the root of your repository or distribution package, but it's another when they find an individual class by doing a web search or hitting F3 in Eclipse. In those cases, they can find their way up to the license (assuming your source package includes it), but it's much easier if it's just declared right up front.

It's also easier to clearly distinguish the third-party code you're including. When each file has its copyright information clearly noted, you can easily tell the difference between a sui-generis project file and an included third-party file without having to parse through the NOTICE every time.

And, fortunately, it doesn't have to be a huge hassle to maintain. In each of my Maven projects, I include a license plugin configuration to declare copyright information, any special data types, and which files to not include. Then, whenever I add new files or make a change after the turn of a year, I can run mvn license:format and it'll keep everything tidy for me.

pom.xml Configuration

Maven (and it's not alone in this) provides a lot of pom.xml-level elements to declare all sorts of metadata about your project, like its SCM repository, issue tracker, and, critically, license and developers. I like to declare the inception year, the license, and the <developers> block:

	<inceptionYear>2018</inceptionYear>

	<licenses>
		<license>
			<name>The Apache Software License, Version 2.0</name>
			<url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
		</license>
	</licenses>

	<developers>
		<developer>
			<name>Jesse Gallagher</name>
			<email>jesse@frostillic.us</email>
		</developer>
	</developers>

I use that <inceptionYear/> value as part of the license-header file to keep track of the copyright range, at least when there's contiguous multi-year development.

OSGi Stuff

Since most of my projects are still OSGi, I've been aiming to improve my licensing setup there too. The main place where this comes into play is in feature.xml files, which have required elements to specify the copyright and license. It's not terribly unusual for these to end up as their "[Enter Copyright Description here.]" defaults, but it's important to fill these in. They're included in the "accept the licenses" dialog when installing into Eclipse/Designer, and are available in the "Installed software" descriptions in the UI.

But What License to Use to Begin With?

I'll finish off this post with what is actually the most important part of the process, but which can usually be answered simply. There are a lot of open-source licenses out there, and you could theoretically make up your own, but for our purposes the choice tends to follow some basic rules:

  • If you're contributing to an established OS organization, use theirs - for example, if you're contributing to Eclipse, use the EPL.

  • If you want your code to be mixed other OS projects and (potentially) proprietary ones, pick Apache or something like it.

    • At OpenNTF, we have a preference for Apache over other similar licenses, because it's well-established and makes copyright handling clearer than the equivalents, something that is critical for large companies. Let past lawyers do your legwork on this one.
  • If you want to require that users of your code keep the code open source, consider the GPL.

    • Be extremely wary of this, however: the GPL is intentionally "infectious" and limits how the code can be used. Various projects carve out little exceptions to the GPL to allow use in otherwise-non-GPL products, but it's still something of a minefield.
    • The GPL is one of the approved licenses for OpenNTF, but we kind of discourage it except in cases where a project is GPL because it's derived from previously-GPL'd code.
  • If you don't want to be bothered too much by copyright and just want the code out there, consider Public Domain. In practice, it's usually best for you to retain copyright, but explicitly declaring Public Domain is certainly an effective way of allowing any use.

For projects in our community, the quick answer is "use Apache". It's permissive, covers copyright, and is known and trusted by pretty much everyone.

More Work Than It's Worth?

Both the earlier parts of this post and Betteridge's Law contribute to making it clear that my answer is "no, it's not more work than it's worth", but I can certainly see why it'd feel that way. The first couple times I submitted projects to OpenNTF and got a "here's some stuff to fix" email from Peter Tanner, part of me definitely chafed at the whole thing. That can be particularly the case for Notes-based contributions - sometimes, you just want to plunk an NTF on the project page and be done with it, and Notes certainly doesn't have a "wrap this NSF copy in a ZIP with LICENSE and NOTICE files" checkbox.

However, as I learned more about the legal importance of having licenses correct and got more practice at doing this stuff from the start, I started to appreciate the whole process. It also turned out to be really helpful to sort this stuff out on smaller projects before working on larger ones, especially ones with established teams and procedures.

In all, it's worth it both to allow you to contribute to larger projects and, regardless of project size, it's worth it for anyone consuming your code.