Speaking at Social Connections Toronto

  • May 30, 2016

By virtue of one of the original speakers having to cancel, I will be presenting at Social Connections in Toronto next week! Specifically, it will be on the 7th at 2:30, and the topic will be OpenNTF's new tooling and initiatives:

OpenNTF has long been a central hub for community and open-source development in the Notes/Domino world. In the past year, however, it has also expanded its capabilities with new tooling and a broadening scope to the larger IBM portfolio. This presentation will discuss OpenNTF’s history and its plans for the future, including a look at its newly-launched suite of source control, issue tracking, and continuous-delivery tools.

So, if you're attending Social Connections, drop on by the Toronto Room on Tuesday afternoon.

Old App Idea: App Manager

  • May 24, 2016

Years back, I took a small whack at an idea that had been percolating in my head: a "app manager" application that would assist with the XPage-specific portions of running a Domino server. It would cover a lot of ground that Administrator really doesn't touch, such as inspecting NSFs to see which contain XPage artifacts, highlighting potential problems with them, and assisting with app-design backup and deployment.

It never really got too far, since there are always a hundred other things to work on, but I always liked the idea. I decided to toss what code I had laid down up on GitHub:

https://github.com/jesse-gallagher/app-manager

I don't imagine I'll really have time to build on that unless I'm suddenly really struck with inspiration. But either way, the code is there for anyone interested in taking a look.

The Cleansing Flame of Null Analysis

  • May 21, 2016

Though most of my work lately has been on sprawling, platform-level stuff or other large existing codebases, part of it has involved a new small app. I decided to take this opportunity to dive more aggressively than previously into automated null analysis and other potential-bugs tools.

What I mean by "null analysis" is letting the IDE or compiler try to help you avoid NullPointerExceptions. Though there are plenty of other programming mistakes you could still make, these are among the most common, and so a little extra work up front to avoid them should pay dividends. Eclipse has some handy options in its Java → Compiler → Errors/Warnings preferences to assist with this:

The first option will pick up on some pretty basic instances, such as:

Object foo = null;
System.out.println(foo.hashCode());

Since this is clearly going to always cause an NPE, Eclipse is able to point this out as an error. The next level gets a little more nebulous: "potential" null pointer access. This crops up when Eclipse can't reliably determine whether a value will be null, either because there is no way to know at compile time (say, database access) or because the compiler's tooling is too limited. Here's a contrived example:

Object foo = Math.random() > 0.5 ? new Object() : null;
System.out.println(foo.hashCode());

This situation is clearly untenable, but there are other situations where you as a programmer can be very confident that the value will not be null (say, if you swap out the > 0.5 for >= 0.0), but the compiler doesn't know that. That's why it often makes sense to leave that as a warning instead of an error.

That's all stuff I've done before, but now I've decided to dive into annotation-based null analysis as well. Unfortunately, in stock Java, this is something of a hot mess (that list even leaves out Eclipse's home-grown version). Since Java didn't grow up with this sort of capability, it's been shoehorned in by various parties over the years. There are other tools to assist you in Java 8, but, unfortunately, I can only target 7 as the highest. For now, I've settled on the "sort-of standard" javax.validation.constraints package. It wasn't really intended for this specific purpose, but it's flexible enough to suit and can be used in Eclipse and FindBugs (though I have my reservations about the choice).

In Eclipse, this type of analysis can be enabled by checking "Enable annotation-based null analysis" below the other options and, unless you're using Eclipse's known annotations, adjusting the "Configure" options next to "Use default annotations for null specifications":

In any event, regardless of the choice of tooling, the "this shouldn't be null" annotations work the same way: you use them to decorate things that you either require not be null when provided to you (method parameters) or you promise to not be null when providing to others (method return values). For example:

public @NotNull Object doSomething(@NotNull Object otherObject) {
	return otherObject.toString();
}

This highlights three things, two good and one bad:

  • Good: The @NotNull in the method parameter means that, as long as the calling code is also checked for null use, the method can be confident that there won't be a NullPointerException when calling a method on otherObject.
  • Good: The @NotNull on the return value means that other code calling this method can be confident that they will not get a null value from it, and so can skip extra null checks.
  • Bad: Eclipse flags otherObject.toString() as a potential problem because it doesn't know for sure that Object#toString doesn't return null, because it has no nullability annotations. As programmers (or as a compiled-code analysis tool), we can be fairly confident that it will be non-null because any object returning null for that is essentially broken on its own.

That last one is a common problem when adopting annotation-based null analysis, at least in Eclipse (I hear it may be better in IntelliJ): its logic doesn't go very deep. If everything is gussied up with these annotations, you're clear - but as soon as you step outside of the project you're working on, you have to add in likely-unnecessary checks. Fortunately, these checks don't realistically hurt (a null check at runtime in a normal app is negligible performance-wise), but they can grate to have to add in.

Glutton for punishment that I am, I decided to go a step further and enable FindBugs processing as an integral step of my build. Though FindBugs can be very picky about the types of things it complains about, it is blessedly more thorough in its analysis than Eclipse, so you generally end up conceding that it is correct when it yells at you. Since the project is Maven-based, I added the check in the project's pom file:

<plugin>
	<groupId>org.codehaus.mojo</groupId>
	<artifactId>findbugs-maven-plugin</artifactId>
	<version>3.0.3</version>
	<configuration>
		<includeTests>true</includeTests>
	</configuration>
	<executions>
		<execution>
			<phase>compile</phase>
			<goals>
				<goal>check</goal>
			</goals>
		</execution>
		<execution>
			<id>findbugs-test-compile</id>
			<phase>test-compile</phase>
			<goals>
				<goal>check</goal>
			</goals>
		</execution>
	</executions>
</plugin>

For most uses, that's all that's required. Now, when the project is compiled, FindBugs will give it a once-over and halt the build if it finds anything it doesn't like. This can be tweaked a great deal - for example, changing the checks to run or the severity of the problem needed to fail the build - but the defaults will likely suit.

Adding these extra checks involves a lot of plusses and minuses. The big minus is that you may end up spending a lot of time "fixing" bugs that don't really exist, time that you could instead spend actually writing your application (and writing new bugs that the tools won't find anyway). There's really nothing to be gained by carefully explaining to Eclipse for the hundredth time that toString always returns non-null.

Still, particularly when tested out in a small, low-surface-area app, this can be a good practice to learn and refine. Eventually, a move to Java 8 will help this more, and it certainly doesn't hurt to add in nullability annotations in the mean time. Overall, I think having the tooling help you avoid a whole suite of common "brain fart" bugs like this is worthwhile.

Quick, Short-Notice Announcement: Delaware Valley Soc-Biz UG Meetup

  • May 17, 2016

Granted, this is short notice, but for anyone in the southeast-PA area, there's a meetup at IBM's offices this Thursday (the 19th) from 11 to 12:30. I'll be there, giving a presentation about OpenNTF and some of the ways that I've used the projects we work on there on my customer projects. Better still, there's lunch! The signup form is over on Greenhouse here:

https://greenhouse.lotus.com/forms/landing/org/app/80846239-6f7a-4483-8ace-9e5e02b0a661/launch/index.html?form=F_Form1

If you're able to make it, great! Otherwise, I imagine there should be more of this sort of thing in the future.

Darwino for Domino: Domino-side Configuration

  • May 16, 2016

In my last post, I mentioned that a big part of the job of the Darwino-Domino replicator is converting the data one way or another to better suit the likely programming model Darwino-side or to clean up old data. The way this is done is via a configuration database on the Domino side (an XPages application), which allows you to specify Database Adapters that configure the translation. While it is possible to write these in Java, the primary way is to use a Groovy-based DSL script.

The simplest form of this script may be one line just defining where the NSF is:

nsfName "foo.nsf"

With that, the replicator will open foo.nsf and attempt to replicate all documents with "best fit" translations of each field it comes across. Things can get a little more complex, though:

form("SomeForm") {
	excludeField "foo"
	excludeFields "IgnoreMe"
	
	field "foo_(.*)", nameRegex: true
	arrayField "bar_(.*)", delimiter: "\$", zeroBased: true, initialIndexed: false, prefix: false, compact: true,
		userDataFormatName: "ODAChunk", nameRegex: true
	field ~"impliedfoo_(.*)"
}

form("ArrayTest") {
	arrayField "FirstName", type:TEXT, delimiter: "_"
	arrayField "LastName", type:TEXT, delimiter: "\$", zeroBased: true, initialIndexed: false, prefix: true, compact: true
	
	// Similar to the ODA style
	arrayField "UserData", type:USERDATA, delimiter: "\$", zeroBased: true, initialIndexed: false, prefix: false, compact: true,
		userDataFormatName: "ODAChunk",
		toDarwino: { value ->
			// The value will be an array of byte arrays, which should be strings
			println "testrep: called toDarwino!"
			value.collect { bytes ->
				new String((byte[])bytes, "UTF-8");
			}
		},
		toDomino: { value ->
			// We should provide an array of byte arrays
			println "testrep: called toDomino!"
			println "value is ${value}"
			def result = value.collect { string ->
				string.getBytes("UTF-8")
			}
			println "result is ${result}"
			return result
		}
}

form("RestrictedFields") {
	restrictToDefinedFields true
	field "KnownField"
	field "KnownField2"
}

The specifics of what's going on in this example (pulled from unit test data) aren't too important, but it demonstrates the customizability that an in-language DSL brings. Since the code is executed in Groovy, it has access to the full Java runtime (including, if you deliver it in a plugin, your own custom classes and dependencies), as well as Groovy's nice abilities like closures. In fact, a great many properties, like the converters above, can be specified as closures and executed in the context of each document as it's replicating, allowing for pretty fine-grained translation.

And personally, adapting Groovy into the process was an interesting exercise. Since Groovy was designed explicitly as a "scripting" variant of Java, the process of working it in to an existing Java code base is very smooth, and there aren't too many gotchas. I wrote some Java classes that provide the context for the root and individual "form" blocks, wired up the interpreter, and then they call each other seamlessly. Other languages could probably suit the job well too - JRuby, Rhino, etc. - but Groovy is mature, purpose-built, and largely syntax-compatible with Java itself, making it a very comfortable fit.

Darwino for Domino: Replication and Data Format

  • May 11, 2016

One of the key points of interest in Darwino for Domino developers is its two-way replication. Darwino's replication system was built in such a way that, in addition to its own internal needs, you can also write a replicator to connect to an entirely-unrelated system, as long as that replicator translates the foreign data to and from JSON documents. Domino is a perfect case for this, since the data model is already very similar, and its replicator ships with Darwino and has been a focus of my attention for a while.

Behind the scenes, this replication takes advantage of a lot of the same kinds of things that Domino's replication always has: UNIDs, sequence IDs, original-vs.-in-file modification/creation, and deletion stubs. These details are transparent to the developer: the Darwino adapter knows how to fetch the appropriate data from the NSF and to convince Domino that the Darwino DB is (almost) like a remote Domino server, at least as far as the stored data is concerned.

The primary differences show up in the way data is formatted for storage. Darwino uses JSON for its document format, which has a couple key advantages and disadvantages compared to NSF's "bag of items" approach. The best approach is probably to provide an example and highlight the pertinent differences. Say you have a Domino document that (conceptually) looks like this:

FirstName
	Type: TEXT
	Value: Foo
LastName
	Type: TEXT
	Value: Fooson
Username
	Type: TEXT
	Flags: NAMES READWRITERS
	Value: CN=Joe Schmoe/O=SomeOrg
Birthday
	Type: TIME
	Value: 1970/2/1
Vacations
	Type: TIME_RANGE
	Value: 2016/1/1-2016/1/5, 2016/3/3-2016/3/5
IsAdmin
	Type: TEXT
	Value: Y

On the Darwino side, that would potentially look like this:

{
	"_writers": {
		"username": ["cn=Joe Schmoe,o=SomeOrg"]
	}
	"firstname": "Foo",
	"lastname": "Fooson",
	"birthday": "1970-02-01",
	"vacations": [
		"2016-01-01/2016-01-05",
		"2016-03-03/2016-03-05"
	]
	"admin": true
}

There are certainly a few things to take note of here. First and foremost is the structure of the authors field. Because JSON doesn't have field metadata, the way Darwino does its readers/writers security is by using specially-named and -structured properties within the JSON, and so the converter moves all readers and writers fields into that. This has internal-implementation reasons, but I think it's also conceptually preferable to, say, having a multi-level object within each field to declare its flags separate from the value. The name also happens to be stored in LDAP style, because Darwino is more at home with standard LDAP conventions for that sort of thing.

Another thing to note is the format of the date fields. Since JSON doesn't have a real date/time type of its own, these values are converted according to ISO 8601 and stored as strings. That means that your Darwino application will need to know that those string values represent dates, but they're reasonable to deal with.

The multi-value date field leads to another important aspect: arrays. In Domino, most items are conceptually presented as arrays, regardless of whether they contain single or multiple values, leading to code that requires either explicitly asking for the first element or jumping through hoops to deal with single or multiple values. Since that's a drag to worry about with JSON, the default behavior for single-value items transferred to Darwino is to store them as "bare" values. When configuring the translation (which will be fodder for a future post), you are able to specify that you want a field to be always stored as an array, which will allow the Darwino-side code to be simpler.

The last field shows off an outright advantage of the JSON format: boolean storage. When defining the conversion, you can specify a field as boolean and provide what the true/false values will be, and they will be sent over to JSON as true and false explicitly. That's not a night-and-day change, but it is a nice help.

Finally, there's the matter of rich text, which is unsurprisingly a nontrivial problem. This is handled in an XPage-alike way: MIME rich text is transferred with only minor adjustments, while Composite Data is converted to HTML and cleaned up a bit before transfer. Darwino supports the concept of attachments natively, and so Domino attachments are brought over with a naming prefix to match the field they're attached to, plus a delimiter to indicate whether they're normal attachments or embedded images. The way it is presented on the user end is dependent on the application, but Darwino has some routines to translate storage-safe inline image refs to app-relative URLs.

Later, I will go into the process of how the Domino-Darwino adapters are set up. The short of it is that you create scripts that can run the gamut from just telling the server where to find the NSF to customizing the translation of each field encountered. This allows you to either transfer the data back and forth in the default "best approximation" approach or use the opportunity to enforce a bit of a schema on old data.