How the ODP Compiler Works, Part 5

Jul 5, 2019 12:06 PM

Tags: nsfodp
  1. Next Project: ODP Compiler
  2. NSF ODP Tooling 1.0
  3. NSF ODP Tooling Example Project
  4. NSF ODP Tooling 1.2
  5. How the ODP Compiler Works, Part 1
  6. How the ODP Compiler Works, Part 2
  7. How the ODP Compiler Works, Part 3
  8. How the ODP Compiler Works, Part 4
  9. How the ODP Compiler Works, Part 5
  10. How the ODP Compiler Works, Part 6
  11. How the ODP Compiler Works, Part 7

One of the things that came up frequently when writing both the compiler and exporter portions of the NSF ODP Tooling was rationalizing the multiple ways an NSF is viewed, and determining which aspects are reified in the design notes themselves and which are entirely runtime conjurations.

The Traditional View

To describe what I mean, I'll start with the "traditional" way that design notes work, which is also the mechanism the other views are built upon. The starting point there is the distinction between data and design notes, represented in the API as the note class. For our purposes, there's "data note" and "everything else". The design notes are kept track of internally by what can be considered a magic view, the design collection, which is used implicitly whenever something looks up a design element, and can be accessed directly by API calls like NIFFindDesignNoteExt.

The design collection itself acts like a normal view, containing columns with pertinent design element information for fast lookups. Beyond the note class value, design notes are distinguished by character-based flags, which you can see in the "Fields" part of the property pane in Designer in the $Flags item. These will look something like "gC~4K" - this value comes from an XPage, and can be interpreted by taking each character and looking for it in "stdnames.h" from the C API:

  • g is DESIGN_FLAG_FILE, referring to a "file resource"-type design element (more on this later)
  • C apparently maps to DESIGN_FLAG_NO_COMPOSE, used to refer to forms that don't show up in the "Create" menu. I'm not sure why it's included here; it may have some second meaning
  • ~ maps to DESIGN_FLAG_HIDEFROMDESIGNLIST, presumably to keep XPages out of standard File Resource pickers
  • 4 maps to DESIGN_FLAG_HIDE_FROM_V4, which is reasonable advice, but the fact that this is 4 and not 7 makes me suspect there's a second meaning here too
  • K maps to DESIGN_FLAG_XSPPAGE, cheerily documented in the API as "an xpage, much like a file resource, but special!"

The importance of these flags and the reuse of some note classes (file resources in particular) lays bare a bit of the evolution of the platform. The older the note type is, the more likely it is to have a dedicated note class value. Forms, views, ACLs, and other primordial elements have eponymous classes, but, starting around the web era, new elements started piggybacking on existing classes. This has so far culminated with the XPages-era additions, where almost everything is considered a "file resource", and file resources are themselves already specialized "forms". This mirrors the evolution of file data stored as Composite Data structures: the first file types added got their own dedicated structures, while later types (like JavaScript libraries) were either crammed awkwardly into similar types or just plopped in as CDFILESEGMENTs (which Domino adorably refers to universally as "CSS").

With the heavy use of flags came something of a mini query language to distinguish collections of design notes. In the API, you can see these in the DFLAGPAT_ C constants. For example, DFLAGPAT_FORM maps to "-FQMUGXWy#i:|@0nK;g~%z^" - the - at the start means that this is a "none of these" matcher, so it's resolved by looking up all notes with NOTE_CLASS_FORM and then filtering out any of the ones with those flags. From our XPage example, you can see it's excluded thrice over, via g, K, and ~. There are other permutations in the language for "match all of", "match any of", and combinations of all three types, and the patterns allow you to select each type of design element you see in Notes and Designer, and a few more categories besides.
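
To make that concrete, here's a minimal sketch of how a "none of these" pattern could be evaluated, after the note-class lookup has already happened. This is my own illustration of the semantics, not the C API's actual implementation:

public class FlagPattern {
	/** e.g. matchesNone("gC~4K", "-FQMUGXWy#i:|@0nK;g~%z^") returns false */
	public static boolean matchesNone(String noteFlags, String pattern) {
		if(!pattern.startsWith("-")) {
			throw new IllegalArgumentException("Not a \"none of these\" pattern: " + pattern);
		}
		// Any single disqualifying character in $Flags excludes the note
		for(char flag : pattern.substring(1).toCharArray()) {
			if(noteFlags.indexOf(flag) > -1) {
				return false;
			}
		}
		return true;
	}
}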

Designer's View

Designer really has two views of the NSF. The first view is essentially a codification of what's above, and has been how Notes and Designer have worked forever. When you go to the "Forms" list in Designer, it does a query in the design collection similar to the above form example, and each category of design elements has its equivalent query.

Its second view came along with the Eclipse transition, and it's what you see in Package Explorer. This version takes the core querying capabilities of the design collection and maps them onto an Eclipse File System plugin. From Eclipse-Designer's point of view, the NSF becomes a file-based project, as if it were a set of folders and files on the filesystem, but it is in reality composed of dynamic lookups from the design collection paired with a truly local temporary directory for the Local XPages compilation scratch area.

This view of the design became the basis of on-disk project support, with the ODP mirroring what you see in the virtual Eclipse project.

It's also where we start to see a secondary hierarchy within the database design. Traditionally, design notes are largely "flat": while the UI and some APIs have special support for the use of \ within design element names, there's no concept of containers beyond the main categories. For XPages, though, they started to add items like $ClassIndexItem, which contain values like "WEB-INF/classes/frostillicus/controller/ControllingViewHandler.class". These files show up within the "WebContent" folder in the virtual project - "WEB-INF/classes" is by default hidden in Package Explorer, but you can see it in the Navigator view. The use of "WebContent" as a folder for this is itself a holdover from Eclipse-based web app development.

Domino's View

Like Designer, Domino has two views of an NSF and the first is a pretty direct use of the design collection. It has simpler needs, usually just looking up elements by type + flags + name, using reverse view order for web elements.

The second view can be thought of as a stripped-down version of Designer's VFS, but it isn't implemented in the same way and doesn't include all the top-level folders you see in Designer. Instead, Domino uses the aforementioned Java class index items and some other existing values like file-resource names to compose something that resembles a WAR file - you can see this reflected in its use of "WEB-INF/classes". It's this view of the NSF that the XPages runtime container and its many abstraction classes use, allowing them to treat it as an app container in the same way as a normal JEE web app, as if it was just another WAR file. It's not treated fully the same as a WAR file - you can't plop a web.xml file in there and use some other web toolkit - but that's the concept that the XSP stack is going for in classes like com.ibm.domino.xsp.module.nsf.NSFComponentModule.

The On-Disk Project Version

As I mentioned, the ODP is based closely on the Designer view, which in turn is partially based on the "web app" view used for XPages. For the compiler, it's not too big of a deal - it just needs to gather files and import them based on their existing DXL for the most part - but the exporter has to do some fiddly work to shuttle notes to their right spots. By total coincidence, that will be a nice lead-in to my next post, which I expect to cover some details of the ODP exporter portion of the NSF ODP Tooling.

How the ODP Compiler Works, Part 4

Jul 3, 2019 11:33 AM

Tags: nsfodp
  1. Next Project: ODP Compiler
  2. NSF ODP Tooling 1.0
  3. NSF ODP Tooling Example Project
  4. NSF ODP Tooling 1.2
  5. How the ODP Compiler Works, Part 1
  6. How the ODP Compiler Works, Part 2
  7. How the ODP Compiler Works, Part 3
  8. How the ODP Compiler Works, Part 4
  9. How the ODP Compiler Works, Part 5
  10. How the ODP Compiler Works, Part 6
  11. How the ODP Compiler Works, Part 7

In today's post, I'd like to go over a bit of how the NSF ODP Tooling project is organized, and specifically how I structured it to support both server-based and local compilation.

Setting aside the feature, update site, and distribution modules, the tooling consists of seventeen code-bearing components.

For our purposes today, we care about the first six in the "plugins" directory and the "nsfodp-maven-plugin" - the rest have to do with the different capabilities of the suite.

Commons

The three "commons" plugins contain a set of utilities and data-description classes, and they're broken up into those three modules due to differing dependencies. The core "commons" plugin relies only on org.eclipse.core.runtime, while the "dxl" plugin adds an IBM Commons dependency, and finally the "odp" plugin relies outright on a Notes/Domino runtime. By keeping these things distinct, it lets me keep track of which things are safe to include in the Eclipse UI plugins or the Maven plugin, where I can't count on the present of a Notes runtime.

"Servlet" and "Equinox"

The compiler, like the other "action" components of the Tooling, is split up into the core "compiler" plugin that does the actual heavy lifting, and then two "interface" plugins for running the code from different directions.

The "servlet" plugin came first and is the mechanism by which a local Maven-run plugin communicates with a remote Domino server with the Tooling installed. It contains a primary entrypoint servlet that accepts a packaged zip file from the client containing the ODP and any extra update sites to use while building, as well as a set of HTTP headers describing the various parameters that can be set for compilation. Strictly speaking, this plugin doesn't depend on Domino as such, but rather on having a servlet container and a Notes or Domino runtime - it could hypothetically run in e.g. Tomcat with the right dependencies, but in effect it's the "Domino side" of it.

The "equinox" plugin supports local compilation and it's a bit of an interesting beast. Since the compiler is intended to work with any given NSF and XPages application, it has a hard requirement on the presence of an Equinox ("Eclipse-style") OSGi runtime. A local Maven build doesn't use Equinox, so I wrote this plugin to provide what Equinox refers to as an "application" - essentially a named executable class that can be run once you initialize an Equinox runtime. Eclipse itself uses this mechanism for running, and you can see these in action in the "Eclipse Application" run configuration type in Eclipse-the-IDE:

The code itself behaves similarly to the servlet, but can skip the "zip container" step of the process, instead referencing the local files based on system properties set by the Maven bootstrapper.
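
To give a feel for the mechanism, a minimal Equinox application looks something like the sketch below. The class and system-property names here are hypothetical stand-ins, not the Tooling's actual ones:

import org.eclipse.equinox.app.IApplication;
import org.eclipse.equinox.app.IApplicationContext;

public class CompilerApplication implements IApplication {
	@Override
	public Object start(IApplicationContext context) throws Exception {
		// Parameters arrive via system properties set by the Maven bootstrapper
		String odpPath = System.getProperty("example.odpDirectory"); // hypothetical property
		// ... locate the ODP and run the compiler here ...
		return EXIT_OK;
	}

	@Override
	public void stop() {
		// Nothing to tear down in this sketch
	}
}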

Having these two entrypoints lets me keep the actual business of the compiler independent. Depending on need, I could add any number of other entrypoints without having to modify the core code at all.

Maven-side

The actual action of the process is kicked off by a Maven plugin, which consists of what Maven calls a "mojo". It's effectively the same idea as the Equinox application: a specially-tailored executable class. In this case, it gains the ability to specify parameters that are passed in by the pom.xml configuration or via the command line, which are then available by the time the Maven runtime calls the execute() method.
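
As a rough sketch - with invented names, not the actual mojo - the shape of such a class is:

import java.io.File;
import org.apache.maven.plugin.AbstractMojo;
import org.apache.maven.plugin.MojoExecutionException;
import org.apache.maven.plugins.annotations.Mojo;
import org.apache.maven.plugins.annotations.Parameter;

@Mojo(name="compile")
public class ExampleCompileMojo extends AbstractMojo {
	// Populated by Maven from pom.xml configuration or -D command-line flags
	@Parameter(property="example.odpDirectory", defaultValue="odp")
	private File odpDirectory;

	@Override
	public void execute() throws MojoExecutionException {
		getLog().info("Compiling ODP in " + odpDirectory);
		// ... branch between servlet-based and local Equinox compilation ...
	}
}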

When run, the Maven mojo branches based on what type of compilation it can do. The servlet-based compilation branch is a little wordy in the class, but conceptually simpler than the local compilation. The mojo creates a temporary zip file, pours the ODP into it, and then adds in any update sites to include. Then, it creates an HTTP connection to the remote server, adds headers to configure the compiler, sets the POST body to the zip file, and then lets the server do its thing.
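
Conceptually, the client side of that exchange amounts to the sketch below; the parameter header here is a made-up stand-in rather than the plugin's real protocol:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;

public class RemoteCompileSketch {
	public static void compile(URL compilerServletUrl, Path odpZip) throws Exception {
		HttpURLConnection conn = (HttpURLConnection)compilerServletUrl.openConnection();
		conn.setRequestMethod("POST");
		conn.setDoOutput(true);
		conn.setRequestProperty("Content-Type", "application/zip");
		// Compilation options travel as HTTP headers alongside the zipped ODP
		conn.setRequestProperty("X-Example-TemplateName", "example"); // hypothetical header
		try(OutputStream os = conn.getOutputStream()) {
			Files.copy(odpZip, os);
		}
		System.out.println("Server responded " + conn.getResponseCode());
	}
}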

The Equinox runner, though... that's something that took a surprising amount of fiddly magic to get working. On a conceptual level, running the compiler in a local Equinox container is essentially the same thing as running it in the Equinox container launched by the Domino HTTP process - same Eclipse runtime, same infrastructure, and so forth. However, the trouble came both from some of the fiddly ways that the Domino OSGi runtime is configured and from the assumptions it makes about the active JVM, and the battle resulted in a complicated bootstrapping process. The Notes JVM comes packaged with a handful of critical jar files, not the least of which is "Notes.jar", and those need to be added to the active classpath, which in turn needs a specialized provider plugin to get Equinox to see them. There's also whatever the heck "JEmpower" is, which needs to be wrapped up into a "shim" OSGi plugin because of the way other plugins depend on it. The runner is also riven with special behavior for running on recent macOS Notes builds, which switched to an embedded non-J9 JVM (I wouldn't be surprised if this changes subtly again in the future). This is all in service of creating a compatible Equinox configuration so that, finally, the compiler can be run in a child process. It's not pretty, but it works.

Progress Messages

Since the process can take a while, I created an extremely-bare-bones messaging system, where the server sends out a series of JSON objects delimited by newlines and the client watches for these and emits human-friendly messages. The compiler process itself just uses an Eclipse-style IProgressMonitor - the servlet uses an implementation called LineDelimitedJsonProgressMonitor, while the local Equinox runner uses one that just prints to the console directly. This is another area where things are kept generic enough at the internal level so that a different mechanism entirely could be hooked in - a GUI progress monitor, for example.
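
As a sketch of the concept - not the actual LineDelimitedJsonProgressMonitor source - such a monitor could look like:

import org.eclipse.core.runtime.NullProgressMonitor;

public class JsonProgressMonitorSketch extends NullProgressMonitor {
	@Override
	public void beginTask(String name, int totalWork) {
		emit("begin", name);
	}

	@Override
	public void subTask(String name) {
		emit("subtask", name);
	}

	@Override
	public void done() {
		emit("done", "");
	}

	private void emit(String type, String message) {
		// Real code would escape the message and write to the servlet response
		System.out.println("{\"type\":\"" + type + "\",\"message\":\"" + message + "\"}");
	}
}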

Overall Structure

I'm pretty pleased with how the structure of the Tooling has taken shape. Being able to separate out the entrypoints like this definitely made it much easier to have the local compilation, and breaking it all out into multiple modules kept me from baking in any incorrect assumptions about the runtime environment in the core code. I've been toying with ideas for how to get this stuff to run in an Eclipse/IntelliJ/etc. environment, and I think the Maven Equinox runner will provide a pretty good template for that. There'd be a lot of work to make that good, but it'd definitely be possible.

How the ODP Compiler Works, Part 3

Jul 2, 2019 11:26 AM

Tags: nsfodp
  1. Next Project: ODP Compiler
  2. NSF ODP Tooling 1.0
  3. NSF ODP Tooling Example Project
  4. NSF ODP Tooling 1.2
  5. How the ODP Compiler Works, Part 1
  6. How the ODP Compiler Works, Part 2
  7. How the ODP Compiler Works, Part 3
  8. How the ODP Compiler Works, Part 4
  9. How the ODP Compiler Works, Part 5
  10. How the ODP Compiler Works, Part 6
  11. How the ODP Compiler Works, Part 7

In the first two posts in this series, I focused on the XPages compilation and runtime environment, independent of anything to do with an NSF specifically. I'll return to the world of OSGi and servlets in later entries, but I'd like to take a bit of time to talk about some specifics of grafting the compiled XPage results and the rest of the on-disk project's contents into an actual NSF.

The Basics

The primary tool that makes an on-disk project work is DXL, the XML representation of a note. DXL defines representations for several kinds of Notes elements, but the three main kinds that you run into with an on-disk project are:

  • Database metadata, found in the annoyingly-suffixed AppProperties/database.properties file. This contains information from a couple places, in particular the ACL and icon notes

  • "Raw" representations of design notes. These show up quite a bit if you select "Use binary DXL" in Designer's preferences, and they show up in a couple parts regardless of that selection. These are distinguished by their use of <note/> as the root element, and contain close to raw data from the NSF. Strings, numbers, and dates are represented in human-readable form, but things like composite data/rich text are stored as Base64-encoded byte arrays matching their in-memory C structures. These "blobs" are opaque to work with but are the safest to round-trip.

    • A subtype of this is the "*.metadata" files, which I'll cover shortly.
  • "Encapsulated" design notes, with root elements like <form/> and <view/>. These are friendly to look at and work with programmatically, but the forms in particular run the risk of some edge-case compatibility issues.

The Process

The ODP Compiler uses DXL for almost all of its NSF manipulation, and imports the ODP in a couple of passes based on the different needs of different design elements.

"Direct DXL" Elements

The easiest elements are the ones that are just single DXL files in the ODP and can be imported directly. The compiler iterates over these files as determined by the OnDiskProject class and just passes them in to the DXL importer. Easy peasy.

"Split" Elements

The second main type covers resource files that are stored in the ODP as their "normal" file data and paired with a ".metadata" file. The prime example of this is file resources: if you have a file named "foo.txt" stored as a file resource in your NSF, it will exist in the ODP as a normal text file named "foo.txt" and next to it will be a trimmed-down DXL file named "foo.txt.metadata". These metadata files are an export of the "raw" format of the DXL, but with the actual file data items removed, leaving just the additional items that go along with the file (flags, in-NSF file name, etc.).

The conceptual task here is straightforward: encode the file data back into the appropriate composite-data format as Base64 inside the DXL, and then import that. The actual task of doing that, though, gets pretty arcane. There were two ways I could go about it: import the metadata only and then use the C API (via one route or another) to create the structures in memory and append them to the note, or create a C-struct-compatible representation in-memory in Java and add it to the DXL to import. I originally planned on doing the former, as the com.ibm.designer.domino.napi.design.FileAccess class in IBM's NAPI has promisingly-named methods to do this, but I ran into trouble with some file types that it doesn't support - though file resources, images, script libraries, and others are all conceptually the same thing, the actual C-level storage mechanism for each is slightly different. So I ended up going the latter route, which entailed writing some gnarly code to do it in memory.

XPages Elements

For the most part, XPages and related elements (Custom Controls, themes, Java class files, and Jars) are supersets of file resources: they use the same composite-data structures and store the programmer-visible data in the same $FileData items in the destination notes. Each has an extra layer, though, in order to store the Java bytecode and other info.

Both XPages and Custom Controls share a code path that stores their compiled data into the $ClassData0, $ClassData1, $ClassSize0, and $ClassSize1 items, since they consistently have one class to represent the main page and then a second inner "Page" class to act as an internal component constructor. In addition, Custom Controls store their ".xsp-config" data in $ConfigData and $ConfigSize items in the same note in the NSF.

Java design elements are conceptually similar, but have less predictable class names, and so the code is a little more complex. There's also some special behavior here, in that there are a handful of compiled classes that show up in the compilation result that aren't directly stored in those files. I forget what those are specifically - they might be for secondary, non-public classes that appear at the top level of a Java source file but aren't inner classes.

All of these, in addition to storing their source and class names, also sprout a $ClassIndexItem item that lists the "file paths" for the classes to be used as part of the virtual filesystems that Domino and Designer use when initializing the XPages app.

LotusScript

LotusScript libraries are… special. Though LotusScript embedded in other design notes (forms, views, agents, etc.) doesn't require any special handling beyond importing the DXL, libraries stored as ".lss" files in the on-disk project aren't automatically compiled.

These libraries are brought in with their source stored as normal text items named $ScriptLib, but then need to be compiled from there. There's no mechanism for compiling LotusScript in the normal Java API, and IBM's NAPI doesn't have a binding to the NSFNoteLSCompileExt function involved, so I had to dip into Java C API bindings - initially via Karsten Lehmann's excellent Domino JNA, before switching over to Darwino's NAPI implementation.

If you look at the algorithm I'm using to compile the libraries, you may notice how brute-force it is. Any given library may depend on any given other library, but I don't have a way to know that ahead of time without parsing the code (which I don't want to do). So, in lieu of the kind of dependency graph that Designer creates when you do "Recompile all LotusScript", the ODP Compiler tries each library in turn and, if one fails, it adds it to a "try again" queue. It does this until it's had a chance to effectively try each combination, at which point it will either have a clean queue and can proceed or it'll have one or more libraries that failed to compile for a different reason. It's not pretty, but it gets the job done.
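
Expressed as a simplified Java sketch - with a hypothetical compile method standing in for the real NSFNoteLSCompileExt-based call - the loop looks like this:

import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

public class LSCompilationSketch {
	public void compileLibraries(List<String> libraryNames) {
		Queue<String> queue = new ArrayDeque<>(libraryNames);
		RuntimeException lastFailure = null;
		while(!queue.isEmpty()) {
			int startSize = queue.size();
			for(int i = 0; i < startSize; i++) {
				String lib = queue.remove();
				try {
					compile(lib);
				} catch(RuntimeException e) {
					// Likely a missing dependency - re-queue it for the next pass
					queue.add(lib);
					lastFailure = e;
				}
			}
			if(queue.size() == startSize) {
				// Nothing compiled this pass, so it's a real error, not just ordering
				throw lastFailure;
			}
		}
	}

	private void compile(String libraryName) {
		// Hypothetical stand-in for the NSFNoteLSCompileExt-based compilation
	}
}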

Standalone Elements

There are a handful of components of an ODP that are stored as plain files without associated DXL, generally to do with XPages support files like "xsp.properties". These have some special support in the FileResource class to auto-vivify an associated DXL file on the fly. Fortunately, these files are pretty basic to create, and the only catch was figuring out the appropriate $Flags and $FlagsExt values to fill in. For this, the OnDiskProject class has a set of matchers to match known paths to the specialized file-resource behavior needed for each.

Miscellany

Beyond just importing the ODP files into their right places, the compiler does a few other notable things.

It has the option to populate the $TemplateBuild shared field with template name and build time information, which I've found to be extremely handy. I used to have an agent in a separate DB that would update this in my template DB, and it's much nicer to have the compiler do this automatically. It's also a pleasant fit-and-finish thing.

Similarly, I used to have to remember to take a moment to make sure that the Xsp Properties file in the NTF was set to use compressed and aggregated resources, which was easy to forget. Now, I can have that happen automatically, via filtering during resource import.

Designer exports the "database.properties" file with the current full-text-index status intact, which can actually cause trouble when importing back into a new database. I had to strip that part out if present.

LotusScript compilation relies on the presence of web-service classes in the current Java runtime, which caused trouble when I added local compilation without a Domino server. I guess that this is a knock-on effect, with loading the compiler having the secondary effect of warming the JRE in case you're compiling web services.

Because I forgot that DbDirectory#createDatabase exists (or maybe it was limited, I don't remember), I ended up adding DB creation to Darwino's NAPI, which is a handy capability to have anyway.

Remaining Topics

The more I write about this, the more I find is still left worth covering. In particular, I'd like to go over the structure of the compiler - how it's architected to run both on a remote server and via a local Equinox OSGi environment. There's also the whole matter of the ODP exporter, which is technically separate but related by workflow and a source of its own bits of arcane knowledge. So much to cover!


How the ODP Compiler Works, Part 2

Jul 1, 2019 11:36 AM

Tags: nsfodp xpages
  1. Next Project: ODP Compiler
  2. NSF ODP Tooling 1.0
  3. NSF ODP Tooling Example Project
  4. NSF ODP Tooling 1.2
  5. How the ODP Compiler Works, Part 1
  6. How the ODP Compiler Works, Part 2
  7. How the ODP Compiler Works, Part 3
  8. How the ODP Compiler Works, Part 4
  9. How the ODP Compiler Works, Part 5
  10. How the ODP Compiler Works, Part 6
  11. How the ODP Compiler Works, Part 7

In yesterday's post, I briefly touched on how the XPages runtime sees its environment by way of a FacesProject and related components. Today, I'd like to expand on that a bit, since it's useful to understand the various layers of what makes up an "XPages app" at compilation time and at runtime.

Designer and Domino largely take two paths to try to arrive at the same location in how they view an NSF. The way Designer works is more complicated and opaque than Domino's, with extra layers of VFS and an internal RPC mechanism(!) for editors, but there is at least some shared code from the XSP runtime. Beyond that, it does almost the same thing to determine the project's dependency classpath, while the internal NSF classpath is entirely distinct, using Eclipse's project structure to build towards the different structure Domino will use.

Libraries

The notion of an XSP Library is one of the main parts of directly-shared code between the server and Designer. The way an XSP Library works is that you create a class that implements com.ibm.xsp.library.XspLibrary and then declare that as an IBM Commons extension contribution (more on that later) for the com.ibm.xsp.Library service type.

The fact that this is live code sitting in a plugin has a significant implication. Namely, anything that interprets it has to actually load the class and its dependencies. This is as opposed to just a static configuration file, which could be read without executing any custom code. For the server, the distinction doesn't matter too much, since you'll want to load all your class files anyway. For Designer, this is where we get the requirement to install libraries into Designer itself, rather than just adding plugins to the Target Platform. This is also an area that's a breeding ground for IDE bugs, since Designer needs the plugin available both internally and in the Target Platform, but they're not inherently tied together.

Though the XspLibrary implementation class is executable code, its main purpose is to point the runtime to various bits of static configuration information: the unique identifier for the library (e.g. com.ibm.xsp.extlibx.bazaar.library), lists of *.xsp-config and *-faces-config.xml files to define XSP and JSF contributions, and a list of other library IDs that this one depends on.
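
For reference, a minimal library class built on the common AbstractXspLibrary convenience base looks something like the following; all of the IDs and paths here are invented example values:

import com.ibm.xsp.library.AbstractXspLibrary;

public class ExampleLibrary extends AbstractXspLibrary {
	@Override
	public String getLibraryId() {
		return "com.example.ExampleLibrary"; // the unique identifier referenced by xsp.properties
	}

	@Override
	public String getPluginId() {
		return "com.example.plugin"; // the OSGi bundle housing this class
	}

	@Override
	public String[] getXspConfigFiles() {
		return new String[] { "com/example/config/example.xsp-config" };
	}

	@Override
	public String[] getFacesConfigFiles() {
		return new String[] { "com/example/config/example-faces-config.xml" };
	}

	@Override
	public String[] getDependencies() {
		return new String[] { "com.ibm.xsp.core.library" };
	}
}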

I believe that Designer and Domino use these bits of information slightly differently - I'm not sure that Domino cares too much about the *.xsp-config files, for example - but there's a lot of overlap here.

Configuration Files

The two main types of static configuration files used by libraries serve distinct purposes.

The *-faces-config.xml files (not required to be so named, but it's a good convention) are layered under the faces-config.xml file contained in your NSF. They define managed beans, converters, PhaseListeners, and other JSF-isms. These files come directly from the underlying JSF implementation and share the same syntax, at least until the JSF-1.2-era forking of XPages.

The *.xsp-config files look similar - they also use the <faces-config/> root element - but I believe that these are largely an XSP-specific detail. It looks like JSF 1.2 also uses the same <faces-config-extension/> tag, but to a different end - perhaps this evolution started the same way but then diverged there. In any event, these files are where Designer (and the XSP compilation process in general) looks for custom-defined components and their accessible properties. There's an interesting point to note there: though defined components are effectively beans with properties, Designer doesn't introspect the object to get its property names and types, but instead relies entirely on the definitions found in these files. It will still eventually use the component class when it goes to compile the translated XSP Java files, so they still need to be correct, but it's certainly a spot where it's easy to make a typo or mismatched property type.

I think that the latter files aren't used by the server, since their purpose is to provide the XSP source Java translator with mappings for components' XML elements to the Java classes. However, the core XPages runtime classes on the server still retain knowledge of this configuration, which is how the Bazaar and ODP Compiler do their thing. The com.ibm.xsp.registry package and sub-packages are filled with a mix of parser classes and in-memory representations, like com.ibm.xsp.registry.parse.ConfigParserImpl and com.ibm.xsp.registry.LibraryFragmentImpl.

Non-Library Contributions

Though not related to libraries, it's useful to know about a handful of XPages-specific class contributions that can come into play at runtime. These use the IBM Commons extension mechanism, like libraries themselves, but contribute to a good many different parts of the runtime and application flow. Some of these can be defined inside an NSF, while some are only recognized when defined in plugins - there's a good rundown of these on the ODA wiki. It's pretty rare to see these in the wild, but you may see an application here or there that uses these contributions, via in-NSF files like META-INF/services/com.ibm.xsp.core.events.ApplicationListener.

OSGi and Dependencies

In the early days, XPages was not OSGi-based. That changed in the 8.5.2 era (I believe - I wasn't aware enough in the 8.5.0/8.5.1 days to know the specifics) with the "extensibility API". For the most part, this lineage remains, and the XPages runtime itself isn't too dependent on OSGi, even when it comes to library contributions. Little bits have crept in here and there - the getPluginId() method in XspLibrary and the getOSGiBundle() method in ExtLibLoaderExtension, for example - but it's still largely incidental.

IBM Commons Extensions

If you've done both XPages plugin and Eclipse-the-IDE plugin development, you may have noticed that, while Eclipse plugins usually contribute to customized extension points with complicated schemas, XPages contributions all look like this:

	<extension point="com.ibm.commons.Extension">
		<service type="com.ibm.xsp.Library" class="com.example.SomeXPagesLibrary" />
	</extension>

There are still some places in Domino where you use different extension points, such as when you register a servlet with the Equinox OSGi runtime directly, but for the most part it's just this one point. This is because this extension point is designed to paper over the differences between OSGi extensions and the vanilla-Java-style ClassLoader#getResources mechanism. The type of the service you provide lines up with the META-INF/services/some.extension.type files you can use in your NSF and which still remain inside the embedded jars in the core XPages plugins.

The reason why this OSGi extension point exists is that OSGi intentionally creates separations between the individual plugins that make up your app runtime. In a "normal" web app, all of your dependency jars end up in the WEB-INF/lib sub-directory and are effectively all poured together to make a single class-loading environment. The ClassLoader#getResources route will look through all of the jars in the classpath for these META-INF/services files, but OSGi puts walls between them, and instead provides its own extension mechanism (among others, but this is the one Domino uses).
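
To illustrate the vanilla-Java side of that, the ClassLoader#getResources mechanism boils down to something like this generic sketch:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;

public class ServiceLookupSketch {
	public static void listImplementations(String serviceType) throws Exception {
		// Finds every META-INF/services file of this name across all jars in the classpath
		Enumeration<URL> resources = Thread.currentThread().getContextClassLoader()
			.getResources("META-INF/services/" + serviceType);
		while(resources.hasMoreElements()) {
			URL url = resources.nextElement();
			try(BufferedReader r = new BufferedReader(new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
				String line;
				while((line = r.readLine()) != null) {
					System.out.println(serviceType + " implementation: " + line);
				}
			}
		}
	}
}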

Dependency Resolution

Both Domino and Designer view the NSF like an OSGi plugin, but go about resolving the dependencies slightly differently. Fortunately, this is a case where the differences seldom crop up in practice - I've only seen some minor differences in how they honor the Export-Package directive in the bundle manifest and how fragment bundles are included.

When Designer is building an XPages app, it references the xsp.properties file to determine which XSP Libraries to include, and then uses their getPluginId() method to determine which OSGi bundle that matches up to (I think). It adds that plugin to the list of dependencies in plugin.xml and (since 9.0.1 FP10) META-INF/MANIFEST.MF. The Eclipse side of Designer then uses that to compose the Plug-in Dependencies list from those bundle IDs and any of their dependencies that are marked as re-exported. I think that Domino only cares about the generated plugin.xml/MANIFEST.MF files - I don't think that it does the resolution based on the library class, though I might be wrong about that.

ODPCompiler's Version

Currently, the ODP Compiler hews closer to the "Domino-style" route. For resolving the active class path, it trusts that the plugin.xml that exists in the ODP is correct and resolves dependencies from there. In the future, it may make sense to have the compiler generate the plugin.xml file itself, in which case it will also have to resolve the plugins based on the library classes. That wouldn't be too difficult, but for now it relies on the exported ODP.

Layer Cake

Looking at the whole XPages architecture, something that strikes me is how much it's simultaneously a giant stack of parts - config parsers, resolvers, runtime bootstrappers, and so forth - but also a pretty straightforward server-side web stack from a Java EE perspective. I've been diving deep into XPages in various ways for a long time now - building complex apps, writing library plugins, and even yanking the runtime out of Domino - yet writing the compiler led to this whole distinct set of capabilities. But a lot of this is essentially "just" ahead-of-time work, with Designer and the ODP Compiler's jobs being a lot of world-resolution followed by placing compiled pieces into the right places in the NSF.

By the time it gets to the NSF, it actually ends up as a pretty normal-style web app - XPages are just Java classes floating around, non-OSGi dependency jars are in WEB-INF/lib, and the faces-config.xml controls rendering in the same way as in JSF. A lot of that, though, will come up in later posts, where I go into the gotchas involved in taking these compilation results and other ODP resources and actually getting them into an NSF.

How the ODP Compiler Works, Part 1

Jun 30, 2019 1:54 PM

Tags: nsfodp xpages
  1. Next Project: ODP Compiler
  2. NSF ODP Tooling 1.0
  3. NSF ODP Tooling Example Project
  4. NSF ODP Tooling 1.2
  5. How the ODP Compiler Works, Part 1
  6. How the ODP Compiler Works, Part 2
  7. How the ODP Compiler Works, Part 3
  8. How the ODP Compiler Works, Part 4
  9. How the ODP Compiler Works, Part 5
  10. How the ODP Compiler Works, Part 6
  11. How the ODP Compiler Works, Part 7

A year ago, I started a project to compile NSFs - and particularly large XPages projects - independently of Designer. Since the initial releases, the project has grown, gaining an ODP exporter, extra Eclipse UI integration, and the ability to run without installing components on a remote server. It's become an integral part of my workflow with several projects, where I include the NSFs as part of a large Maven build alongside OSGi plugins and a final distribution.

Building this tooling required learning a lot about the internals of XPages, the specifics of how various design elements are stored and handled in an NSF, and miscellaneous bits about Equinox and Maven. Since there's a good amount of arcane knowledge embedded in the project, I think it'll be helpful to take some time to dive deep into what's going on, starting with XPages.

XSP to Java to Bytecode

The first challenge for me to overcome was how to go from XPages XML source to the Java class files (like those seen in the "Local" source folder in Designer) and finally to compiled Java bytecode. Much like in Designer's process, the middle part is just incidental: only the XSP source and the bytecode are actually stored in the NSF.

The official XSP -> Java compiler exists only in Designer, and so one route would be to try to get those plugins working on Domino. I think that'd work, but it'd be a huge hassle and, fortunately, an unnecessary one. The XPages Bazaar project contains essentially a clean re-implementation of the glue code required to coax the runtime into emitting what I needed. I used the Bazaar as an incubator for the early versions of the compiler, and tweaked the core code with additions and fixes to work with various needs I ran into.

OSGi Bundles

On its own, the Bazaar did the job well of taking an XPage and compiling it with whatever the surrounding environment had. However, to compile a full XPages app as part of a Maven build, I'd need the ability to dynamically load XPages libraries and dependencies on the fly.

To do this, I added the option to include an update site directory, and then I have the compiler stream through all the plugins and initialize them. Fortunately, the OSGi environment makes this pretty easy. The BundleContext object that's available from each OSGi bundle has an installBundle method you can use by pointing at a bundle file URL, and then you can call start() on that result to actually initialize it. I had to do a little extra work to account for bundles that shouldn't be started (source-only bundles and "fragments", which are additions onto normal plugins) and the like, but it's not too complicated.
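
The heart of that process is pleasantly short. This is a simplified sketch of the idea rather than the Tooling's exact code:

import java.nio.file.Path;
import org.osgi.framework.Bundle;
import org.osgi.framework.BundleContext;
import org.osgi.framework.BundleException;

public class BundleInstallerSketch {
	public static void installAndStart(BundleContext context, Path pluginFile) throws BundleException {
		Bundle bundle = context.installBundle(pluginFile.toUri().toString());
		// Fragments can't be started - they attach to their host automatically
		boolean isFragment = bundle.getHeaders().get("Fragment-Host") != null;
		if(!isFragment) {
			bundle.start();
		}
	}
}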

XSP Libraries

Just installing the OSGi plugins isn't enough to get the XPages runtime to know about any libraries that may be included, however. This is done by finding all of the library extension contributors, sorting them for compatibility, wrapping them in a com.ibm.xsp.library.LibraryWrapper for some reason, wrapping that in a com.ibm.xsp.registry.config.SimpleRegistryProvider, and then adding the results of that to an in-memory com.ibm.xsp.registry.FacesSharableRegistry instance.

This is one of those cases where the actual code involved in the end isn't terribly long, but the amount of delving into the framework to figure out the needed parts was immense. In essence, each XPages application is represented as a FacesProject implementation object, which retains a registry of the libraries it knows about. In a normal running server, this happens automatically during initialization: the runtime opens the NSF, figures out which libraries it needs, finds them from the ones it knows about, and constructs the app's contained world. For the compiler, I ended up having to do something of an ad-hoc version of this as it goes, finding all the parts that need to be kicked to notice the available libraries and have them around to map something like <abc:someCustomComponent/> to an instance of com.abc.xsp.CustomComponent.

Custom Controls

Custom controls are a similar story, but they use a specialized variant of the "real" library system called LibraryFragment. The way these work is that, before the actual XSP -> Java compilation step, the compiler reads their definitions and adds them to the in-memory FacesProject. It's important to define them all in this way before actually trying to interpret their source (or any XPages), because this allows for the interpreter to understand a reference to one CC from another. Fortunately, the process is separated enough that the definitions can all happen before any of the Java classes actually exist. Otherwise, I would have had to either try to make a dependency graph of which CC references which other, or just keep trying to compile them repeatedly until it got the right order by brute force. The latter process will actually show up in a future blog post.

Java Source Files

Compiling individual Java source files - both those that show up in the Code/Java part of an NSF (or a custom source folder) as well as the translated XSP source - is handled by the Bazaar, in a class called JavaSourceClassLoader. This class wraps around the in-memory Java compilation capabilities that were added to the JDK in Java 1.6. In particular, this class, paired with SourceFileManager, provides knowledge to resolve classes from OSGi bundles in the active environment, including specialized knowledge of dependency management based on re-exported dependencies, embedded jars, and other OSGi-isms that play a big part in XPages libraries.

In essence, these classes provide a similar compilation environment to what Designer does when it creates the Eclipse project for compiling an NSF. They build up an environment with knowledge of all the dependencies that, in Designer, show up in "Plug-in Dependencies", and then the compiler feeds it all of the Java source files in the project. With all of this environment provided, the underlying Java compiler is able to do its job of converting them to bytecode en masse, which is then passed back to the compiler for insertion into the NSF.
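
The underlying JDK capability can be seen in miniature with javax.tools directly; this generic example skips the Bazaar's OSGi-aware resolution entirely:

import java.net.URI;
import java.util.Collections;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class InMemoryCompileSketch {
	public static void main(String[] args) {
		String source = "public class Hello { public String greet() { return \"hi\"; } }";
		// Wrap a source string as a "file" the compiler can consume
		JavaFileObject file = new SimpleJavaFileObject(URI.create("string:///Hello.java"), JavaFileObject.Kind.SOURCE) {
			@Override
			public CharSequence getCharContent(boolean ignoreEncodingErrors) {
				return source;
			}
		};
		// Requires a full JDK - this returns null on a plain JRE
		JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
		boolean success = compiler.getTask(null, null, null, null, null, Collections.singletonList(file)).call();
		System.out.println("Compiled: " + success);
	}
}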

Other Big Components

In the next blog posts in this series, I'll go over some of the other big hurdles I had to overcome to get everything working properly: namely, figuring out all of the specialized behavior necessary to flag imported/created design notes properly and then the arcane incantations necessary to get this OSGi environment working outside of the Domino server.

Seeing the Familiar in SwiftUI and Combine

Jun 11, 2019 2:17 PM

Tags: ios

Apple announced quite a bit at WWDC last week, and some of the stealth favorites for programmers have turned out to be SwiftUI and Combine. My macOS and iOS development is incidental at most, but I like to pay attention to this stuff, and I think that these frameworks in particular are notable for how they reflect attributes of other languages and frameworks that I do use.

SwiftUI

Before beginning, I should point out that the best source of information about SwiftUI at the moment is Apple's WWDC video archive, in particular "Introducing SwiftUI" and "SwiftUI Essentials".

SwiftUI is generally summed up as a "declarative UI framework", with "declarative" here primarily setting it apart from imperatively-defined UI. Truth be told, a mix of both has been common in Cocoa development and elsewhere for a very long time - the definition of UI layout in .nib files comes from the old NeXT days, for example. The things that set this apart are some technical improvements and a switch of focus from your program creating and managing a UI to instead your program defining what it wants the UI to be and letting the framework handle it.

Declaring the UI

The best way to think of the way this works is that, instead of actively calling methods and setting properties on a window, some buttons, etc. on the screen, your code in SwiftUI instead creates a work order for what it wants the current state of the UI to be and hands that to the framework to figure out what needs to happen to get there. That "what needs to happen" can range from initially creating the layout to figuring out the differences between the old and new states, removing old elements, adding new ones, and animating transitions between them.

The most immediate analogue for this in common use today is React, which has essentially the same model. In (presumably) both SwiftUI and React, your job as a programmer is to have a method on a component object that is called frequently to emit what it thinks its contents should be right now. So if you have, for example, an array of objects that's displayed as an unordered list, you'll have a render() method that looks like:

render() {
  return (
    <ul>
      {this.props.items.map(item => (
        <li key={item.id}>{item.text}</li>
      ))}
    </ul>
  );
}

In SwiftUI, a cut-down version would look like:

var body: some View {
  List(model.items) { item in
    Text(item.text)
  }
}

And, as I mentioned, the core concepts aren't new. In core XPages, we'd write something very similar:

<xp:repeat value="#{items}" var="item">
  <xp:this.facets>
    <xp:text xp:key="header" contentType="HTML" value="&lt;ul&gt;"/>
    <xp:text xp:key="footer" contentType="HTML" value="&lt;/ul&gt;"/>
  </xp:this.facets>

  <li><xp:text value="#{item.text}"/></li>
</xp:repeat>

However, the last one differs in a number of ways, not the least of which is that the UI-generation part doesn't retain a concept of shifting state. You, the programmer, may know that the items list may grow or shrink by a value, but, in the XPages model, the browser just starts with one block of HTML and replaces it with another on a refresh. Internally on both server and browser, I'm sure there's some retained memory for efficiency's sake, but a removed list item won't, for example, know to gracefully fade out and let the remaining ones shift into its place.

SwiftUI drives this distinction home by preferring UI component definitions to be "structs". Java doesn't currently have an analogue to structs, but they come from C and they're effectively a "pure data" type. The distinction between classes and structs gets a little murky in Swift, but the core idea is that structs are meant to show pure data in a given state and are copied when passed around. That relates here because your SwiftUI component's job is not to be the UI, but rather to emit in-memory blueprints for what it wants given the current state of the data. When data changes, the framework asks the component for a new representation, compares the old one and new one in memory, and then makes changes to the UI representation as appropriate.

It's kind of an odd distinction to write out, but I found that there was a point when I was learning React when it "clicked" in my head.

Separation of UI Definition and Output

Beyond the simplicity of declaring the UI, one of the big things that the SwiftUI presentations drive home is how one UI definition can be used across all of Apple's platforms. A list is a list is a list, and the fact that it looks one way on a watch and another way on a TV isn't inherently important.

Seeing this made me feel pretty good, since it reminded me of a post I wrote years ago that dealt with the component/renderer split in XPages at a conceptual level. Though renderers in XPages don't go so far as spitting out an AppKit application at runtime, the concept of an abstractly-defined UI that is then rendered differently based on the target is very important.

What remains to be seen with SwiftUI is how clean this distinction can remain. In my XPage example, I used some of the ExtLib's semantic form components, but real XPage applications are almost always mucked up with inline CSS classes and styles, client and server JavaScript, meaningless HTML tags like <div>, and so forth. It's one thing to say "oh, this just means a checkbox on macOS and a toggle on iOS", but another to handle a complex, crafted user interface that may be radically different in different contexts.

DSLs

SwiftUI is written in the full Swift language, but it's best described as a domain-specific language written in Swift. The term "DSL" originally meant a whole language dedicated to a small task, but nowadays usually refers instead to a style of writing an API that, when paired with a general-purpose language, acts like it's a language dedicated to the task. This is usually a feature of "scripting"-type languages, where the syntax is clean and flexible enough to not interfere.

In the Java world, the go-to language for this has long been Groovy. Darwino uses Groovy for the database adapter DSL, and SmartNSF does as well for defining services. The most common use nowadays is probably Gradle, the Maven-competing build system. It uses Groovy's closures and parentheses-less method calls to make a build script that looks more like a configuration file than a script:

plugins {
    id 'java'
    id 'application'
}

repositories {
    jcenter() 
}

dependencies {
    implementation 'com.google.guava:guava:26.0-jre' 

    testImplementation 'junit:junit:4.12' 
}

mainClassName = 'demo.App' 

The key reason why these DSLs are useful (other than not having to write a new parser) is that you have the full abilities of the underlying language at your disposal. In SwiftUI, you can do your "hide-when" logic by using the normal old if structure, rather than having a specialized "when should this show up?" property. That also means that you can bring in whatever other logic you have, third-party libraries, and so forth, without having to have specialized support in the API.

Combine

Combine, in addition to having an ominous name, is a less-flashy addition than SwiftUI, but is nonetheless important and also shows the integration of growing themes in programming elsewhere, specifically reactive programming. It's one of those topics where you can very easily fall off a conceptual cliff, and I found even Apple's introductory session to make it sound more daunting than it is by focusing on the specific Swift protocols in practice.

I've found that the best way to learn something like this is to set aside the "asynchronous" aspect at first to focus on the "data flow" part. For this, we're in luck, since this is what Java 8 streams are. Streams are all about starting with some source of data - often just a List implementation, but it's intentionally arbitrary - and changing, filtering, sorting, and otherwise manipulating it to get a result. So a prototypical example from Java can be:

Stream.of(10, 1, -3, 5)
	.filter(i -> i > 0)                  // [10, 1, 5]
	.map(i -> i * 2)                     // [20, 2, 10]
	.sorted()                            // [2, 10, 20]
	.map(String::valueOf)                // ["2", "10", "20"]
	.collect(Collectors.joining(", "));  // "2, 10, 20"

Most of the time in practice, the actual implementation will pretty much match the sequential reading of this path and so will be roughly similar to if you wrote it out with for loops, if statements, and so forth. However, because your code is describing what you want done rather than how to do it, there's room here for short-circuits, multithreading, and optimizations for different collection types.

If you look at a snippet of an example from Combine, you can see near-identical syntax in Swift, with the same idiomatic indentation:

let p4 = p1
	.merge(with: p2)
	.append(5) // add 5 to the end of the sequence
	.allSatisfy { $0 >= 1 } // check if all values are bigger than 0
	.count() // how many values: 1

Importantly, the same style of syntax applies when the incoming data is asynchronous and when the final output is similarly out-of-band. This is what all of Apple's Combine examples start with, which is why I think it can seem a bit daunting. But I think the only real switch to make in your head is to slot in the term "Publisher" for the starting provider of your data (originally an array here, but it could be a keyboard or network resource) and "Subscriber" for the code that deals with it.

Admittedly, things can get more complicated there when you need to add in error handling, value binding, and other practical considerations, but the simplicity of the core concept remains.

Overall

For Domino needs specifically and web development generally, SwiftUI and Combine don't themselves matter much. Still, I think it's useful to take times like this to see larger trends and, when you can, bask in the satisfaction of having seen the concepts elsewhere.

My Slides From Engage 2019 - De04. Java With Domino After XPages

May 20, 2019 9:24 AM

Tags: xpages engage

Engage 2019 has come and gone, and I had an excellent time. I also quite enjoyed presenting my "group therapy" session on some options that XPages developers have for the future. In a lot of ways, it was similar to my presentation at CollabSphere last year, mixed with the various new developments I've talked about on here since then:

Engage 2019 - De04. Java with Domino After XPages

Engage 2019

May 6, 2019 11:10 AM

Tags: engage

Engage is just around the corner in Brussels, and I'll be joining everyone there, presentation in hand. Specifically:

Dev04. Java With Domino After XPages
Tuesday, May 14 at 16:00 in room E. Mahy

XPages guided Domino web development out of a world of archaic proprietary hacks and into the realm of Java server development and something approximating Java EE. Now that the XPages framework is moribund, the question is: what's next? If you've built up Java skills over the years, you have a direct path to use them in the new modern world, whether via OSGi plugins and new XPages extensions on Domino or standalone Java web projects. This session will discuss the lessons we learned from XPages, how they correspond to newer technologies, and how to bring your existing apps and plugins forward.

If you attended my session at CollabSphere last year (by the way, you should go to CollabSphere this year too), you may note that the title is pretty similar to that one, albeit changing "In Domino" to "With Domino". This massive change in preposition reflects how things have only gotten more awkward for us in the intervening year.

So, if you're attending Engage, join me as we work through our sorrows together!


Java Grab Bag 2

May 3, 2019 3:37 PM

Tags: java
  1. That Java Thing, Part 1: The Java Problem in the Community
  2. That Java Thing, Part 2: Intro to OSGi
  3. That Java Thing, Part 3: Eclipse Prep
  4. That Java Thing, Part 4: Creating the Plugin
  5. That Java Thing, Part 5: Expanding the Plugin
  6. That Java Thing, Part 6: Creating the Feature and Update Site
  7. That Java Thing, Part 7: Adding a Managed Bean to the Plugin
  8. That Java Thing, Part 8: Source Bundles
  9. That Java Thing, Part 9: Expanding the Plugin - Jars
  10. That Java Thing, Part 10: Expanding the Plugin - Serving Resources
  11. That Java Thing, Interlude: Effective Java
  12. That Java Thing, Part 11: Diagnostics
  13. That Java Thing, Part 12: Expanding the Plugin - JAX-RS
  14. That Java Thing, Part 13: Introduction to Maven
  15. That Java Thing, Part 14: Maven Environment Setup
  16. That Java Thing, Part 15: Converting the Projects
  17. That Java Thing, Part 16: Maven Fallout
  18. That Java Thing, Part 17: My Current XPages Plug-in Dev Environment
  19. Java Hiccups
  20. Bitwise Operators
  21. Java Grab Bag 2

Following in the vein of "Java Hiccups", I've had a couple things floating around my head lately that I think collectively make for a good post for Java developers, particularly those working in the Domino arena.

Without further ado:

Map#computeIfAbsent

This is a method that was added in Java 8 and, while it's not as big a deal as the addition of streams, it's one of my favorite additions and something I use very frequently. To give a point of reference, consider this common idiom from an imagined XPages app:

Map<String, Object> applicationScope = ExtLibUtil.getApplicationScope();
if(!applicationScope.containsKey("someVal")) {
  applicationScope.put("someVal", someExpensiveOperation());
}
String someVal = (String)applicationScope.get("someVal");

Essentially, using a Map as a cache for a complex computed value. Java 8 added the #computeIfAbsent method (alongside several similar ones) to do this in one go:

String someVal = (String)ExtLibUtil.getApplicationScope().computeIfAbsent("someVal", key -> someExpensiveOperation());

The second parameter is (usually) a lambda, like the ones used in streams, that takes the provided key as an argument and is only executed if the value does not already exist. Due to the way this was added, most implementations do pretty much the same thing as the first block of code, but you don't have to care about that. Your code gets a bit smaller, the intent is much clearer, and it's less prone to small bugs like changing the key and forgetting to change it in all three places.

Arrays Are Weird

Java's built-in array type is loosely based on C's, and that's reflected in the syntax:

int[] foo = new int[4];
foo[0] = 1;
foo[1] = 2;
foo[2] = 3;
foo[3] = 4;

Like C, they are zero-based, declared with the capacity and not the max index, and cannot be resized. Unlike C, arrays aren't just syntactical sugar on top of pointers, and this manifests immediately in bounds checking. Take this line:

foo[4] = 10;

In C, this will (famously) just write an integer 10 value into whatever memory happens to be just beyond the bounds of your array. In Java, you'll get an ArrayIndexOutOfBoundsException, saving you from the insidious bug. But, since Java arrays are (probably) implemented internally very similarly to C's - they're likely contiguous blocks of memory sized to the type - they're still extremely efficient, and so they show up in a lot of speed-critical code.

As speedy and safe as they are, though, they're still pretty unfriendly. For starters, they can't be resized. When you do new int[4] (or use literal syntax like new int[] { 1, 2, 3, 4 }), you carve out that much memory and can't shrink or expand it in-place. You can change the values inside an array, just not its size. To "resize" efficiently, you have to make a new array and then use System.arraycopy to populate the new array with the contents of the old.
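
In practice, that dance looks something like this minimal sketch - and Arrays.copyOf will do the same work for you in one call:

int[] foo = new int[] { 1, 2, 3, 4 };
// Allocate a larger array and copy the old contents into the start of it
int[] bigger = new int[8];
System.arraycopy(foo, 0, bigger, 0, foo.length);
foo = bigger;
// Or let the utility method handle the allocation and copying
foo = Arrays.copyOf(foo, 16);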

This is all why the List interface (with its predecessor class Vector) exists: they serve the same function of "ordered collection of stuff", but allow for dynamic resizing. Because these objects are usually "efficient enough" (ArrayList uses "true" arrays under the covers) while having numerous additional benefits, you should use them as your first go-to and only use arrays if you have a reason.

That's in part because the weirdness of arrays doesn't end with their inconvenience. Array types are actually implicitly-created classes, even when they contain primitive types. So:

int foo = 3; // primitive value
foo = null; // compiler error - a primitive can't hold null
int[] bar = new int[] { 3 }; // Object
bar = null; // legal!

List<int> fooList = new ArrayList<>(); // compiler error - primitives can't be used in generics
List<int[]> barList = new ArrayList<>(); // legal!

Object.class == Object[].class; // false
int[].class.isArray(); // not only legal, but true

Java provides two main utility classes for working with arrays: java.util.Arrays (extremely useful for Arrays.asList) and java.lang.reflect.Array (usually only useful in edge cases).

Java Has No Library Versioning System

If you've worked with Java in Designer or Eclipse, you've likely run across this preferences pane or its per-project version:

Eclipse Java compiler settings

These settings affect two things:

  • The syntax allowed in your source (e.g. new ArrayList<>() requires 1.7 or higher)
  • The class file format version (you can think of this like an NSF ODS)

You've likely seen the latter in play by receiving an UnsupportedClassVersionError when trying to run Java 7 or 8 code on a pre-9.0.1FP8 Domino server.

Conspicuously absent from this short list is anything to do with classes or methods added to the runtime in newer versions. For example, the String class gained the static method String.join in Java 8 to conveniently concatenate strings with a given delimiter. If you're targeting an older Java version but compiling against a newer Java library (as Designer 9.0.1FP10+ does, with a default target of 1.5 and a JVM of 1.8), you can write a line of code using that method without issue - the syntax doesn't require anything above 1.5, so all is clear as far as the compiler is concerned. But if you then try to run that code on an older JVM (such as an older Domino server), you'll get a NoSuchMethodError at runtime, since the method doesn't exist there.
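
To make that concrete, here's a sketch of the failure mode - this compiles cleanly with a 1.5 source target as long as a Java 8 class library is on the compiler's classpath:

// Legal 1.5 syntax, but String#join only exists as of Java 8
String joined = String.join(", ", "foo", "bar", "baz");
// On a Java 8 JVM: "foo, bar, baz"
// On a Java 6 JVM: java.lang.NoSuchMethodError at this line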

Unfortunately, the only true answer to this is to tell your IDE about a JRE for each specific Java version you're targeting, something that doesn't happen by default, and which will gradually get more difficult as Java 6 becomes harder and harder to come across.

This is one of the things that OSGi aims to fix - you could, for example, have many versions of Guava installed, and you could declare that your plugin works specifically with version 18. Then, when loading, the runtime will either bind to a matching version or give you an error that no version could be resolved. No mystery involved. Unfortunately, OSGi is a niche thing losing ground, and the module system introduced in Java 9 consciously does not address this.
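
For reference, that sort of declaration lives in the plugin's MANIFEST.MF and looks something like this (the Guava package and version range here are purely illustrative):

Import-Package: com.google.common.collect;version="[18.0,19.0)"

If no installed bundle exports that package within that range, the plugin won't resolve at all - a much friendlier failure than a surprise at runtime.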

In a pinch, you can use the file tool on most Unix systems to check the version of an individual class file:

$ file Foo.class
Foo.class: compiled Java class data, version 52.0 (Java 1.8)

Licensing

A little while ago, Oracle raised a bit of a stink by declaring that, as of this year, commercial use of their Java runtimes would require paid licensing. Historically, you could get support for Java for money, and certain additional components had their own licensing requirements, but it was pretty normal otherwise to install Java from java.sun.com and not give it a second thought.

This naturally caused a few questions when it comes to Domino and other Java-incorporating IBM products, and IBM released a statement that basically amounted to "you don't have to worry about it". IBM maintains its own variant of the JVM (called J9 for Smalltalk-related reasons, not to be confused with Java 9) and, in any case, has always had arrangements with Sun/Oracle such that IBM's customers don't have to deal with Oracle directly.

But what about using Java outside of a licensed product, such as if you just want to run Tomcat on some server? The short answer is that you're still fine, but you have to know a little about the difference between "a JDK" and "Oracle's JDK". Java and the surrounding JDK have been progressively open-sourced in fits and starts over the years, and are now at a point where the OpenJDK project is basically the real Java environment, and Oracle's JDK is just one implementation of it. It's similar to Linux: the core parts are open-source, and many distributions are entirely free, but commercial, for-pay variants exist as well.

So, if you want to run a Java stack, you can do so without putting forth a single cent by using an OpenJDK build. Oracle, IBM, and others will still be happy to take your money if you want a commercially-supported Java environment, of course.

For a longer explanation, this blog post from @javachampions is pretty much the definitive word.

 

Bitwise Operators

Apr 26, 2019 4:11 PM

Tags: javaee
  1. That Java Thing, Part 1: The Java Problem in the Community
  2. That Java Thing, Part 2: Intro to OSGi
  3. That Java Thing, Part 3: Eclipse Prep
  4. That Java Thing, Part 4: Creating the Plugin
  5. That Java Thing, Part 5: Expanding the Plugin
  6. That Java Thing, Part 6: Creating the Feature and Update Site
  7. That Java Thing, Part 7: Adding a Managed Bean to the Plugin
  8. That Java Thing, Part 8: Source Bundles
  9. That Java Thing, Part 9: Expanding the Plugin - Jars
  10. That Java Thing, Part 10: Expanding the Plugin - Serving Resources
  11. That Java Thing, Interlude: Effective Java
  12. That Java Thing, Part 11: Diagnostics
  13. That Java Thing, Part 12: Expanding the Plugin - JAX-RS
  14. That Java Thing, Part 13: Introduction to Maven
  15. That Java Thing, Part 14: Maven Environment Setup
  16. That Java Thing, Part 15: Converting the Projects
  17. That Java Thing, Part 16: Maven Fallout
  18. That Java Thing, Part 17: My Current XPages Plug-in Dev Environment
  19. Java Hiccups
  20. Bitwise Operators
  21. Java Grab Bag 2

In the "Java Hiccups" post, I briefly mentioned this little tidbit:

(short)Integer.MAX_VALUE == -1   // true

This fact betrays a lot about how numbers are stored internally in most languages. Some of those details, like the fact that the specific method is called two's complement, are fully in the range of "computer science" stuff - but it can be handy to know about a couple ways you can put this sort of thing to use.

For our purposes, let's work with the short data type in Java, because it's painfully important to Notes. "Short" is called such because it's smaller than a standard integer type in a given language. In some languages, like C, the length of core data types like this is sometimes tied to the bitness of the processor, but in Java they're consistent across everything. Specifically, short in Java is 16 bits long. Let's take the number 21,038, which is represented in binary as:

0101 0010 0010 1110

That's two bytes, and each byte is broken up into two groups of four bits each because it turns out to be helpful visually and because each "nibble" there can be matched to a single hexadecimal character (522E in this case). This is the most common way you'll see binary written when you start getting into this sort of thing.

So I mentioned casting in the previous post, and what casting does with primitive integer types like this is chop off the bits that don't fit. If you cast the above value to a byte, it'll chop off all but the trailing 8 bits, giving you:

0010 1110

Which is 46. If you cast it to a larger type, such as int (32 bits in Java), it just adds some zeros to the front:

0000 0000 0000 0000 0101 0010 0010 1110

Which is still 21,038, but now it takes up twice as much memory.
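
You can watch the chopping and padding happen with a couple of casts - a small sketch, using the values from above:

short s = 21038;       // 0101 0010 0010 1110
byte b = (byte)s;      // keeps only the trailing 8 bits: 0010 1110
System.out.println(b); // 46
int i = s;             // widening happens implicitly, no cast needed
System.out.println(i); // 21038
System.out.println((short)Integer.MAX_VALUE); // -1, as in the opening example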

For our uses, this stuff matters immediately in two ways: it represents the limits of some data storage and it allows us to manipulate numbers with bitwise operators.

Storage Limits

If you max out the bits in a 16-bit integer, you get 1111 1111 1111 1111, and this value is either -1 or 65,535 (64k) depending on whether or not the value is considered "signed" by the code using it. When it's "unsigned", a 16-bit integer can represent between 0 and 65,535; when it's "signed", its range is -32,768 to 32,767 (32k). Java only has "signed" types, which is why Short.MAX_VALUE is 32k and not 64k. C, on the other hand, lets you choose in your code which type you want to use.
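
A quick sketch of what that looks like in Java:

short allBitsSet = (short)0xFFFF;    // 1111 1111 1111 1111
System.out.println(allBitsSet);      // -1, since Java's types are signed
System.out.println(Short.MAX_VALUE); // 32767
System.out.println(0xFFFF);          // 65535 - the same 16 bits read as unsigned, in an int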

The Notes C API uses a couple type names to represent consistent sizes across different platforms, and the one that's important here is WORD: a 16-bit unsigned integer. This is used in, for example, the function used to set the value of a text item in a note:

NSFItemSetText(
  NOTEHANDLE  hNote,
  const char far *ItemName,
  const char far *ItemText,
  WORD  TextLength)

That WORD at the end there is where you tell the API how much text you're providing, and is thus capped at 64k - and is why you can store 60k of text in a non-summary text field but not 70k. As to why summary data is limited to 32k and not 64k, I'm guessing that either there's a signed value in there somewhere or that last bit was needed for something else.

If you look over the list of Notes limits, you can see this reflected all over the place, along with a few 8-bit (0-255) limits for good measure.

Bitwise Operators (for real this time)

The term "bitwise" refers to a handful of operators that deal directly on the bits of a number. Java takes a cue from C here and provides &, |, ^, ~, <<, >>, and >>>. These all have their uses, primarily for really-low-level stuff and efficiency, but we mainly care about & and |. These two may have bitten you in the past because of their similarity to && and || - and in particular because Formula Language uses the single-character versions to mean what Java means by the double-character ones. They also kind of work the same way, but aren't really the same.

To see how they work, let's take our original starting number, 21,038, and pair it up with another value, 5,000:

0101 0010 0010 1110
0001 0011 1000 1000

Visualizing the numbers in binary and stacked like this comes in extremely handy when dealing with bitwise operators. We'll start with &, or "bitwise AND". This operator takes two numbers and returns a number with a 1 in each slot where both inputs have a 1. So, in this case, it results in this internal math:

0101 0010 0010 1110
0001 0011 1000 1000
===================
0001 0010 0000 1000

Which corresponds to 4,616.

The | operator, or "bitwise OR", produces a result with a 1 in each slot where either of the original numbers has a 1. In this case:

0101 0010 0010 1110
0001 0011 1000 1000
===================
0101 0011 1010 1110

Or 21,422.
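
If you'd like to check my math, Java will happily do it for you:

int a = 21038; // 0101 0010 0010 1110
int b = 5000;  // 0001 0011 1000 1000
System.out.println(a & b); // 4616
System.out.println(a | b); // 21422
System.out.println(Integer.toBinaryString(a & b)); // 1001000001000, minus leading zeros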

It's pretty rare that you want these operators for the actual numerical values, though. What they're pretty much always used for is "bit fields", an extremely efficient way to store a set of related flags. Going back to the Notes API, the NSFItemAppend function (a generic way to add an item value) has a WORD Flags parameter that matches up to a number of properties you can assign to an item. In the C source, they're written as hexadecimal values, but I've added the binary versions here:

#define	ITEM_SIGN          0x0001   // 0000 0000 0000 0001
#define	ITEM_SEAL          0x0002   // 0000 0000 0000 0010
#define	ITEM_SUMMARY       0x0004   // 0000 0000 0000 0100
#define	ITEM_READWRITERS   0x0020   // 0000 0000 0010 0000
#define	ITEM_NAMES         0x0040   // 0000 0000 0100 0000
#define	ITEM_PLACEHOLDER   0x0100   // 0000 0001 0000 0000
#define	ITEM_PROTECTED     0x0200   // 0000 0010 0000 0000
#define	ITEM_READERS       0x0400   // 0000 0100 0000 0000
#define ITEM_UNCHANGED     0x1000   // 0001 0000 0000 0000

So you can see that each flag there fills in a distinct slot that none of the others uses (you can also see, like missing teeth, the bits that are probably obsolete or undocumented features). If you want to create an item that is a signed, sealed, summary, readers item, you'd do this:

WORD flags = ITEM_SIGN | ITEM_SEAL | ITEM_SUMMARY | ITEM_READERS;

…which will result in this bit of internal math:

0000 0000 0000 0001
0000 0000 0000 0010
0000 0000 0000 0100
0000 0100 0000 0000
===================
0000 0100 0000 0111

This is extremely efficient, both because you can store a bunch of information in just 16 bits and because working with these bit fields is baked into processors at the lowest level. There's not much you can do on a computer that's faster, in fact.

When you have a value like this, you can query it for whether or not a given value is set by using the AND operator:

flags & ITEM_SUMMARY

This will result in:

0000 0000 0000 0100

…which is the same as the value of ITEM_SUMMARY itself, since that one slot is the only bit in common. Conversely, if you check:

flags & ITEM_PROTECTED

…you'll end up with zero, since the flag doesn't match.

Use in Java

Most of the time, you don't have to think too much about the bits that make up numbers in Java, but they creep up from time to time. Generally, the main situations where they arise are when you're dealing with a low-level API or with code written by someone who spent a lot of time with C and deeply internalized its efficiencies.

For an example of the former, take a look at the com.ibm.designer.domino.napi.NotesNoteItem class in IBM's NAPI. It contains a method called int getFlags(), which returns exactly the value we were talking about above. If you get your hands on an item this way, you can do this to tell whether it's a summary item:

(item.getFlags() & 0x0004) != 0

Note the extra != 0. While the operation involved is the same as in C, C lets you treat any non-zero integer value as a boolean true, while Java forces you to perform an actual boolean operation.
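
The same pattern covers the rest of the flag manipulation you're likely to need, pulling in the other operators from the list above. A sketch using hypothetical Java equivalents of the C constants from earlier:

int ITEM_SEAL    = 0x0002;
int ITEM_SUMMARY = 0x0004;
int ITEM_READERS = 0x0400;

int flags = ITEM_SUMMARY | ITEM_READERS;       // set two flags
boolean summary = (flags & ITEM_SUMMARY) != 0; // test: true
flags |= ITEM_SEAL;                            // set another flag later
flags &= ~ITEM_SEAL;                           // clear a flag by ANDing with its complement
flags ^= ITEM_READERS;                         // toggle a flag with XOR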

For an example of the latter, turn your eyes to the DefaultColumnDef class used in xe:dynamicViewPanel generation and customization, a snippet of which is:

public static class DefaultColumnDef implements ColumnDef {
    public static final int FLAG_HIDDEN  = 0x000001;
    public static final int FLAG_LINK    = 0x000002;
    public static final int FLAG_ONCLICK = 0x000004;

    public int flags;

    public boolean isHidden() {
        return (flags & FLAG_HIDDEN) != 0;
    }
    public boolean isLink() {
        return (flags & FLAG_LINK) != 0;
    }
    public boolean isOnClick() {
        return (flags & FLAG_ONCLICK) != 0;
    }
}

This could also be represented as a bunch of boolean properties on the object or, most idiomatically, an EnumSet, but using a bitmask is slightly more space- and CPU-cycle-efficient. Slightly. The difference isn't enough to even think about normally, but can become worth it in cases where the code is called so often that tiny efficiencies can make a difference.
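
For comparison, a rough sketch of what an EnumSet version might look like (the enum and its names are invented to mirror the constants above):

enum ColumnFlag { HIDDEN, LINK, ONCLICK }

Set<ColumnFlag> flags = EnumSet.of(ColumnFlag.HIDDEN, ColumnFlag.LINK);
boolean hidden = flags.contains(ColumnFlag.HIDDEN); // true

Amusingly, EnumSet is itself backed by a bit field internally, so you keep most of the efficiency while gaining type safety.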

This whole topic is the sort of thing that is normally so many layers of abstraction below where we work that it's basically invisible, but it's definitely good to at least know the basics for the times when it does come up.