Carving Out A Workspace On Apple Silicon

Feb 17, 2021, 11:24 AM

Last month, I mentioned my particular computer trouble, in that my trusty iMac Pro has been afflicted by an ever-worsening fan noise problem. I'd just been toughing it out, since there's never a good time to lose your main machine for a week or two, and my traveler MacBook Escape wasn't up to the task of being a full replacement.

After about a month's delay, my fresh new M1 MacBook Air arrived a few weeks ago and I've been putting it through its paces.

The Basics

As pretty much anyone who has one of these computers has said, the performance is outstanding. For the most part, even with emulation, most of the tasks I do during the day feel the same as they did on my wildly-more-expensive iMac Pro. On top of that, the fact that this thing doesn't even have a fan is both a technical marvel and a godsend as far as ambient room noise is concerned.

For continuity's sake, I used Migration Assistant to bring over my iMac's environment, and everything there went swimmingly. The good-citizen apps I use like MarsEdit and Tower were already ported to ARM, while the laggards (unsurprisingly, the ones made by larger companies with more resources) remain Intel-only but run just fine in emulation.

Hardware

For a good while now, I've had the iMac screen flanked by a pair of similarly-sized but far-inferior Asus screens. With the iMac's lovely screen out of the setup for now, I've switched to using those two Asus screens as my primary ones, with the pretty-but-tiny laptop screen sitting beneath them. It works well enough, though I do miss the retina resolution and general brightness of the iMac.

The second external screen itself was a bit of an issue. On their own, these M1 Macs, either for good reason or to mark them as low end, support only two screens total, the laptop screen included. So I ended up ordering one of the StarTech DisplayLink adapters. I expected it to be a crappy experience overall, with noticeable lag, but it actually works much more smoothly than I'd anticipated. Other than the fact that it doesn't support Night Shift and some wake-from-sleep slowness that I attribute to it, it feels just like a normally-attached monitor.

Also, in order to regain my precious Ethernet connection and (sort of) clean up the dongle situation, I got one of these Anker USB-C docks. I've only had it for a day, but it seems to be working like you'd want so far. So that's nice.

Eclipse and Java

Here's where I've hit my first bout of jankiness, though it's not too surprising. In general, Eclipse and Java work just fine through emulation, and I can even keep running tests and web servers using the libnotes.dylib from the Notes client as I want.

I've found times where tests lag or fail now when they didn't before, though, and that's a little ominous. Compiling locally with NSF ODP, which spawns a sub-process that loads the Notes libraries, usually works, though now I've set up another Domino server on my network to handle that reliably.

I've also noticed some trouble in one of my Eclipse workspaces where it periodically spends a long time (10+ minutes) "Building" without explaining what exactly it's doing, and this is new behavior since the switch. I can't say what the core trouble is there. It's my largest active workspace, so it could be that file polling or other system-call-intensive work is just slower, or it could be an artifact of moving it from machine to machine. I'll probably scrap it and make a new workspace with the same projects to see if it alleviates it.

This all should improve in time, though, when Eclipse, AdoptOpenJDK, and HCL all release macOS ARM ports. IntelliJ has an experimental ARM port out, and I'm curious how that does its thing. I'll probably spend some time kicking the tires on that, though I still find Eclipse's UI much more conducive to the "lots of semi-related projects" working style I have. Visual Studio Code is in a similar boat, so that'll be good for the JavaScript development I do (under protest).

In the meantime, I've done some tinkering with how I could get a fully-native Eclipse environment running and showing up on my Mac, including firing up the venerable XQuartz to run Eclipse as an X client from a Linux VM in the basement. While that technically works, the experience is... well, I'll charitably call it "not Mac-like". Still, it's kind of neat and would in theory push aside any number of concerns.

Docker

Here's the real trouble I'm butting my head against. I've taken to using Docker more and more for various reasons: running app servers with a Domino runtime, running Domino outright, and (where my trouble is now) performing cross-compilation and other native-specific compilation tasks. For example, for one of my clients, I have a script that mounts the project directory to a Docker container to perform a full Maven build with NSF compilation and compile-time tests, without having to worry about the user's particular Notes or Domino installation.

However, while Docker is doing Herculean work to smooth the process, most of the work I do ends up hitting one of the crashing snags in poor qemu, which crop up particularly with Java compilation tasks. Since compiling Java is basically all I do all day, that leaves me hoping either for improvements in future versions or a Linux/aarch64 port of Domino (or at least libnotes.so).

In the meantime, I'm making use of Docker's network transparency to run Docker on an x64 VM and set DOCKER_HOST locally to point to it. For about half of what I need, this works great: I can run Domino servers and Notes-enabled webapps this way, and I just change which address I'm pointing to to interact with them. However, it naturally removes the possibility of connecting with the local filesystem, at least without pairing it with some file-share jankiness, so it's not a replacement all around. It also topples quickly into the bizarre inner Docker world: for example, I wanted to set up Codewind to work remotely, but the instructions I found for getting started with your own server were not helpful.

Future Use

Still, despite the warts, I'd say this laptop is performing admirably, and better than one would normally expect. Plus, it's a useful exercise in finding more ways to make my workflow less machine-specific. Though I still bristle at the thought of going full Eclipse Che and working out of a web browser, at least moving some more aspects of my workspace to float above the rough seas is just good practice.

I'll probably go back to using the iMac Pro as my main machine once I get it fixed, even if only for the display, but this humble, low-end M1 has planted its flag more firmly than a MacBook Air normally has any right to.

Java Travelogue: The Care and Feeding of Locales

Feb 14, 2021, 1:37 PM

Tags: java

Over time, people using the NSF ODP Tooling project have periodically hit troubles with files using non-ASCII filenames, as well as some related encoding issues.

Now, I know what you're thinking: why don't people hitting this trouble just be Americans and not use languages with accents? And yes, obviously, that's the optimal solution. However, given that, apparently, most people on the planet are not American, it's for the best to not write software that completely falls apart when encountering an umlaut.

When working to fix this, I found some areas where the fix was pretty obvious, and others where the trouble was a bit more insidious. I figure it'll be potentially useful to write these down, either for others running into similar trouble or my own future self next time I write overly-American code.

Early Encounters: ZIP Files

The earliest place people encountered trouble was with the handling of ZIP files when transferring packages around. When compiling remotely, the local Maven plugin ZIPs up the ODP and related support files (OSGi sites, etc.) for transfer to the server, which then unzips them. This led to a problem wherein the handling of file names in ZIP files is wildly inconsistent across platforms and locales.

Fortunately, this one has a clean fix: when using ZipOutputStream and ZipInputStream (which were my preferred mechanisms), you can specify your encoding:

try(OutputStream fos = Files.newOutputStream(packageZip, StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)) {
    try(ZipOutputStream zos = new ZipOutputStream(fos, StandardCharsets.UTF_8)) {
        // Add entries to the ZIP here
    }
}

// And to read:
try(InputStream is = Files.newInputStream(zipFilePath)) {
    try(ZipInputStream zis = new ZipInputStream(is, StandardCharsets.UTF_8)) {
        // Iterate over entries here
    }
}

Since I control both sides of the operation in this case, I can then be confident that it will use UTF-8 across the board.
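
To make that concrete, here's a minimal round-trip sketch: it writes a ZIP containing a non-ASCII entry name and reads the names back, with UTF-8 specified on both sides. The class and method names are hypothetical stand-ins for illustration, not code from NSF ODP Tooling, and the entry name is made up.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; none of these names come from NSF ODP Tooling
class ZipUtf8RoundTrip {
    /** Writes one entry per name into the ZIP, encoding entry names as UTF-8. */
    static void writeZip(Path zip, List<String> names) {
        try(OutputStream os = Files.newOutputStream(zip);
            ZipOutputStream zos = new ZipOutputStream(os, StandardCharsets.UTF_8)) {
            for(String name : names) {
                zos.putNextEntry(new ZipEntry(name));
                zos.write(name.getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        } catch(IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the entry names back, decoding them as UTF-8. */
    static List<String> readNames(Path zip) {
        List<String> result = new ArrayList<>();
        try(InputStream is = Files.newInputStream(zip);
            ZipInputStream zis = new ZipInputStream(is, StandardCharsets.UTF_8)) {
            ZipEntry entry;
            while((entry = zis.getNextEntry()) != null) {
                result.add(entry.getName());
            }
        } catch(IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    /** Round-trips the given names through a temporary ZIP file. */
    static List<String> roundTrip(List<String> names) {
        try {
            Path zip = Files.createTempFile("utf8-names", ".zip");
            try {
                writeZip(zip, names);
                return readNames(zip);
            } finally {
                Files.deleteIfExists(zip);
            }
        } catch(IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // \u00dc is a capital U with umlaut; escaped so the source file's own encoding doesn't matter
        System.out.println(roundTrip(Arrays.asList("Code/Agents/\u00dcbung Agent.fa")));
    }
}
```

The Unicode escape in the string literal is deliberate: it keeps the example from depending on how the compiler decodes the source file, which is the same class of problem in miniature.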

Next Problem: Filesystem Restrictions

The next problem I ran into actually happened when I was setting up a compiler server in a Docker container. One of the design elements in the example projects is an agent containing umlauts, based on a reported problem. When I tried compiling this project in a Docker-housed Domino server, I ran into this trouble:

java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: Code/Agents/Example Agent with ref?r?ns.fa
    at sun.nio.fs.UnixPath.encode(UnixPath.java:147)
    at sun.nio.fs.UnixPath.<init>(UnixPath.java:71)
    at sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:281)
    at sun.nio.fs.AbstractPath.resolve(AbstractPath.java:53)
    at org.openntf.nsfodp.compiler.servlet.ODPCompilerServlet.expandZip(ODPCompilerServlet.java:241)

Basically, it was trying to write out what it considered an illegal filename and choked on it.

I first spent some time double-checking my ZIP handling, since I was assuming that the trouble was that the name it got out of the ZIP file was corrupted, hence the "?" instead of "ë". This search brought me to this Stack Overflow question, which is asking about the same exception and which talks about the locale of the underlying system. The gist of it is that Java uses a semi-standard property (sun.jnu.encoding) to interpret a lot of things, filename mapping included, and it derives this from the system locale.

I hopped into the Domino container to see what locale it uses (by way of echo $LANG) and saw that it's "C.utf8". I like the sound of that "utf8" part, but the "C" part is different from the comfy "en_US" that I'm used to, and likely causes Java to be more restrictive. Uncharacteristically, the typical "en_US" setup actually avoids this trouble, causing Java NIO to allow all sorts of characters in filenames.
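
If you want to see what a given JVM has concluded from its environment, a tiny diagnostic along these lines can help. This is just a sketch: the class name is made up, and sun.jnu.encoding is an internal property, so I'm assuming it may be absent on some JVMs and printing "null" in that case.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic class for illustration
class EncodingReport {
    /** Collects the locale- and encoding-related settings the JVM picked up. */
    static Map<String, String> report() {
        Map<String, String> result = new LinkedHashMap<>();
        // The environment variable checked with "echo $LANG"
        result.put("LANG", String.valueOf(System.getenv("LANG")));
        // Internal property used for filename mapping; may be null on some JVMs
        result.put("sun.jnu.encoding", String.valueOf(System.getProperty("sun.jnu.encoding")));
        // Property traditionally behind the default charset for stream content
        result.put("file.encoding", String.valueOf(System.getProperty("file.encoding")));
        // What no-charset constructors and methods will actually use
        result.put("defaultCharset", Charset.defaultCharset().name());
        return result;
    }

    public static void main(String[] args) {
        report().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Running this both on the host and inside the container makes it easy to compare the two environments side by side.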

So I started seeing what I could do by way of setting ENV variables as part of the Dockerfile, but then realized that it'd be better to fix this in a way that doesn't depend on external configuration like that.

Java NIO

Here I realized that I didn't actually need to write these files out to the filesystem at all. Over a year ago, I wrote part 1 of an unfinished series talking about the Java NIO filesystem API from Java 7. That API exists for a number of reasons, and the best way to dive into it is to replace your uses of java.io.File, java.io.FileInputStream, etc. with it, which I did in the NSF ODP Tooling a while ago.

What struck me, then, was that this earlier work also separated out the specifics of filesystem access. And, critically, Java ships with a ZIP file system provider that lets you point at a ZIP or JAR file and treat it like any old filesystem. The on-disk project representation I wrote for the compiler uses this NIO API as its entrypoint. By skipping the step of extracting the ODP from the ZIP to the filesystem, I could remove that entire problem from my view.

The Fiddly Parts

This process was mostly smooth, but there are a few fiddly parts that I had to account for:

  1. You have to use newFileSystem when you crack open a ZIP this way, rather than trying to open it by "jar:file" URL directly. Additionally, you have to pass a Map of options including "create":"true" to make it work.
  2. Paths.get, which is a common mechanism for creating either a full or relative path, is a bit insidious. Since those paths are created using the default system filesystem, you can't just pass them to methods like resolve for paths created from another filesystem type. Accordingly, I replaced uses of that with methods based on a context filesystem.
  3. Nested ZIPs aren't supported. That is, they exist like other files in there, but you can't reach further inside of them with a "jar:jar:file" URL. So, when building the classpath for compilation, I have to extract them. I suppose this part is technically a bug if those files have non-ASCII names, but that's rare enough to hopefully not be an issue.
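
The first two points can be sketched like this. It's a minimal illustration rather than the actual compiler code: the class, method, and file names are hypothetical, and I'm pointing it at a ZIP path that doesn't exist yet so that the "create" option has something to do.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; not taken from NSF ODP Tooling
class ZipFileSystemDemo {
    /** Opens the ZIP as a filesystem via newFileSystem, creating it if it doesn't exist. */
    static FileSystem openZip(Path zip) throws IOException {
        URI uri = URI.create("jar:" + zip.toUri());
        Map<String, String> env = Collections.singletonMap("create", "true");
        return FileSystems.newFileSystem(uri, env);
    }

    /** Writes a small text file into the ZIP under the given directory. */
    static void writeEntry(Path zip, String dir, String fileName, String content) {
        try(FileSystem fs = openZip(zip)) {
            // Build paths from the ZIP's own filesystem instead of Paths.get,
            // since a default-filesystem Path can't be resolved against a ZIP path
            Path parent = fs.getPath("/", dir);
            Files.createDirectories(parent);
            Files.write(parent.resolve(fileName), content.getBytes(StandardCharsets.UTF_8));
        } catch(IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Lists all regular files inside the ZIP, as paths within the ZIP. */
    static List<String> listFiles(Path zip) {
        try(FileSystem fs = openZip(zip);
            Stream<Path> walk = Files.walk(fs.getPath("/"))) {
            return walk.filter(Files::isRegularFile)
                .map(Path::toString)
                .sorted()
                .collect(Collectors.toList());
        } catch(IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a temporary ZIP with one non-ASCII file name and lists its contents. */
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("zipfs-demo");
            Path zip = dir.resolve("odp.zip"); // must not already exist as an empty file
            try {
                writeEntry(zip, "Code/Agents", "\u00dcbung Agent.fa", "agent body");
                return listFiles(zip);
            } finally {
                Files.deleteIfExists(zip);
                Files.deleteIfExists(dir);
            }
        } catch(IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that the non-ASCII name never touches the host filesystem here: the only real file involved is odp.zip, so the container's locale doesn't get a vote on the names inside it.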

Once I dealt with those, though, things went surprisingly smoothly. I even refactored earlier code to use this, replacing more-complicated streaming logic with conceptually-simpler file-copying logic. My guess is that this new route is slower, but the difference is negligible for my needs, so I'll take the higher abstraction here.

Stream Locales

Unfortunately, while that helped a bit and is definitely conceptually neat, it didn't solve all my trouble. If I recall correctly, at this point, I was able to get the file imported, but the agent name itself was mangled in Notes, something that didn't happen when I compiled it locally.

This brought me to looking into locales used when reading and writing XML from the ZIP or filesystem. Hypothetically, I had done this cleanly. My file-reading utility methods were very similar, just opening up an InputStream (which is too low-level to care about encoding) and passing it along to IBM Commons utilities to interpret it:

public static String readFile(Path path) {
    try(InputStream is = Files.newInputStream(path)) {
        return StreamUtil.readString(is);
    } catch(IOException e) {
        throw new RuntimeException(e);
    }
}

public static Document readXml(Path file) {
    try(InputStream is = Files.newInputStream(file)) {
        return DOMUtil.createDocument(is);
    } catch(IOException | XMLException e) {
        throw new RuntimeException(e);
    }
}

However, I realized that these were insidious traps, too. By not handling encoding on my side, I was leaving it up to the internals to pick a default encoding, which isn't guaranteed to be UTF-8 (even though it really should be for XML). StreamUtil.readString there has a variant that takes an encoding as the second argument, but I decided to instead handle this one step earlier. Rather than using InputStream, which deals with bytes directly, I decided to switch to Readers, which are more specialized for dealing with character sequences. The Files class provides methods to do this:

public static String readFile(Path path) {
    try(Reader r = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
        return StreamUtil.readString(r);
    } catch(IOException e) {
        throw new RuntimeException(e);
    }
}

public static Document readXml(Path file) {
    try(Reader r = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
        return DOMUtil.createDocument(r);
    } catch(IOException | XMLException e) {
        throw new RuntimeException(e);
    }
}

This way, it's explicit what I'm doing, and it allows for extra optimization at the NIO level if possible.
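
As a small illustration of what's at stake, this sketch writes a UTF-8 file and reads it back twice: once with the right charset and once with ISO-8859-1 standing in for an unlucky platform default. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class for illustration
class ExplicitCharsetRead {
    /** Reads the whole file as text using the given charset. */
    static String readAs(Path file, Charset charset) {
        try(BufferedReader r = Files.newBufferedReader(file, charset)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch(IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the same UTF-8 file decoded as UTF-8 and as ISO-8859-1. */
    static String[] demo() {
        try {
            Path file = Files.createTempFile("charset-demo", ".txt");
            try {
                // \u00eb is an e with diaeresis, which UTF-8 encodes as two bytes
                Files.write(file, "refer\u00ebns".getBytes(StandardCharsets.UTF_8));
                return new String[] {
                    readAs(file, StandardCharsets.UTF_8),
                    readAs(file, StandardCharsets.ISO_8859_1)
                };
            } finally {
                Files.deleteIfExists(file);
            }
        } catch(IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println("UTF-8:      " + result[0]);
        System.out.println("ISO-8859-1: " + result[1]);
    }
}
```

The second read comes back with two characters where the one accented letter should be, which is the same flavor of mangling I was seeing in the agent name.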

Writing Back Out

These rules also applied to writing back out. For the most part, Files.newBufferedWriter(..., StandardCharsets.UTF_8) was the way to go, though I did find one extra insidious bit:

try(PrintWriter writer = new PrintWriter(os)) {
    // ...
}

Here, the PrintWriter constructors that wrap an OutputStream don't take a character-set argument at all (at least not before Java 10), and so one could be forgiven (hopefully) for just kind of assuming it'll use Unicode. However, delving into the implementation, it uses OutputStreamWriter's no-charset constructor, which in turn falls back to Charset.defaultCharset(), and there's your potential bug. Since I didn't actually need PrintWriter as such, I replaced this with a charset-specific call and all was well:

try(Writer writer = new OutputStreamWriter(os, StandardCharsets.UTF_8)) {
    // ...
}
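
For cases where you do want PrintWriter's print and println methods, one option is to wrap an explicitly-UTF-8 OutputStreamWriter instead of handing PrintWriter the raw stream. A minimal sketch, with a hypothetical class name and a ByteArrayOutputStream standing in for whatever stream you're really writing to:

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration
class ExplicitCharsetWrite {
    /** Prints the text through a PrintWriter that is pinned to UTF-8. */
    static byte[] write(String text) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        // The charset is chosen by the OutputStreamWriter, not by PrintWriter
        try(PrintWriter writer = new PrintWriter(new OutputStreamWriter(os, StandardCharsets.UTF_8))) {
            writer.print(text);
        }
        return os.toByteArray();
    }

    public static void main(String[] args) {
        // \u00eb should come out as the two bytes C3 AB regardless of the platform default
        for(byte b : write("\u00eb")) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```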

Overall

I felt that this was a pretty good exercise to perform, not just because it'll be immediately useful for NSF ODP, but also because it's a good reminder to be more diligent about character encoding. And it's also just a good lesson for two critical parts of programming: take the higher abstraction when you can and be as explicit as possible in your intent.

By switching to using the ZIP filesystem implementation, I was able to remove an entire step and problem domain from my plate. Now, the code that reads and writes filenames server-side should be able to run on basically any locale setting, without concern for the restrictions of the filesystem (within reason). The code is simpler, the operations are the same whether it's working with the filesystem directly or not, and reading the ZIP'ed ODP should actually be slightly more efficient.

And for the rest, explicitly picking your character set is just good practice. Even in a case where the documentation says that it will default to UTF-8, I think it's better to do it this way, so anyone reading your code can see what you're doing without resting on implied behavior. Certainly, you can be too explicit in places where relying on natural behavior makes sense, but this highlighted that character sets aren't one of those cases.

A Simpler Load-Balancing Setup With HAProxy

Feb 5, 2021, 3:32 PM

...where by "simpler" I mean relative to the setup I detailed six years ago.

For a good long time now, I've had a reverse-proxy + load balancer setup that uses nginx for the main front end and HAProxy as an intermediary to do the actual load balancing. The reason I set it up this way was that I was constrained by two limitations:

  • nginx's built-in load balancing didn't do sticky sessions like I needed, which would break server-side-state frameworks like XPages
  • HAProxy didn't do HTTPS

In the intervening half-decade, things have improved. I haven't checked on nginx's load balancing, but HAProxy sprouted splendid HTTPS capabilities. So, for the new servers I've been setting up, I decided to take a swing at it with HAProxy alone.

Disclaimer

Before I go any further, I should point out that this is only a viable solution because I would otherwise use nginx only for being the HTTPS frontend. In other cases, I've used it to host files directly, run CGI scripts, etc., and it'd be best to keep it around if you want to do similar things.

Basic SSL Config

The "global" section of haproxy.cfg contains settings for your TLS ciphers and related parameters, and Mozilla's config generator is your friend here. Today, I ended up with this (slightly tweaked to generate dhparams locally):

global
	#
	# SNIP: a bunch of default stuff
	#
	crt-base /etc/ssl/private

	# See: https://ssl-config.mozilla.org/#server=haproxy&version=2.0.13&config=intermediate&openssl=1.1.1d&guideline=5.6
	ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
	ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
	ssl-default-bind-options prefer-client-ciphers no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets

	ssl-default-server-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
	ssl-default-server-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
	ssl-default-server-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets
	
	# sudo openssl dhparam -out /etc/haproxy/dhparams.pem 2048
	ssl-dh-param-file /etc/haproxy/dhparams.pem

Though arcane, that's fairly standard stuff for TLS configuration.

Frontend Config

Years ago, my original config put everything in a listen block, but it's properly split up into frontend and backend now. The frontend block is pretty simple:

frontend frontend1
	bind *:80
	bind *:443 ssl crt star.clientdomain1.com.pem crt star.clientdomain2.com.pem alpn h2,http/1.1
	http-request redirect scheme https unless { ssl_fc }
	default_backend domino

HAProxy's configuration file is almost painfully terse, but at least this part ends up readable enough. I bind to ports 80 and 443 on all IP addresses, and then provide multiple certificate files to be picked based on SNI. Conveniently, HAProxy does a nice job of just picking the right one, and you don't have to explicitly match them up with incoming host names.

One oddity here is the particular format for those ".pem" files. HAProxy expects the actual certificate, its chain, and the private key to all be concatenated together. This is as opposed to nginx, where the chain and private key are two files, or Apache's split into cert+chain+key files. It's also very explicitly not a PKCS#12 file, which is the more-common way to package a key in with the certs: there's no encryption and no password assigned for this.

Additionally, I just put the base names for the files there because they're in /etc/ssl/private, as configured in global.
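
Building that combined file is just concatenation, which in practice is a one-line shell command. To keep with the Java used elsewhere on this blog, here's the same idea as a minimal sketch; the class and file names are hypothetical, the demo uses placeholder text rather than real PEM data, and the cert-chain-key order simply follows the description above.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration
class HaproxyPemBundle {
    /** Concatenates the given files into one, making sure each part ends with a newline. */
    static void bundle(Path out, Path... parts) {
        try(OutputStream os = Files.newOutputStream(out)) {
            for(Path part : parts) {
                byte[] data = Files.readAllBytes(part);
                os.write(data);
                if(data.length > 0 && data[data.length - 1] != '\n') {
                    os.write('\n');
                }
            }
        } catch(IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Bundles three placeholder files and returns the combined text. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("pem-demo");
            Path cert = Files.write(dir.resolve("cert.pem"), "CERT".getBytes(StandardCharsets.US_ASCII));
            Path chain = Files.write(dir.resolve("chain.pem"), "CHAIN\n".getBytes(StandardCharsets.US_ASCII));
            Path key = Files.write(dir.resolve("key.pem"), "KEY".getBytes(StandardCharsets.US_ASCII));
            Path out = dir.resolve("combined.pem");
            try {
                bundle(out, cert, chain, key);
                return new String(Files.readAllBytes(out), StandardCharsets.US_ASCII);
            } finally {
                for(Path p : new Path[] { out, cert, chain, key, dir }) {
                    Files.deleteIfExists(p);
                }
            }
        } catch(IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Since the result contains an unencrypted private key, the real file should be readable only by the account HAProxy runs as.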

Back to the rest of the configuration: the http-request line does the work of auto-redirecting from HTTP to HTTPS. Again, very terse, and it's using the ssl_fc fetch to check whether the incoming connection came in over SSL.

Finally, default_backend domino ties in to the next section.

Backend Config

The backend configuration is the meat of it:

backend domino
	balance roundrobin
	cookie backend insert httponly secure
	option httpchk HEAD /names.nsf?login HTTP/1.0
	http-request add-header $WSRA %[src]
	http-request add-header $WSRH %[src]
	http-request add-header X-ConnectorHeaders-Secret 12345
	
	# "cookie d*" = set and use a cookie to tie to the backend
	# "check" = Run health checks against this server (the "option httpchk" request above)
	# "ssl" = Connect to the backend with SSL
	# "verify none" = Don't bother with SSL verification checks
	# "sni ssl_fc_sni" = Use the incoming SNI hint when connecting to the backend
	server domino-1 domino-1.client.com:443 cookie d1 check ssl verify none sni ssl_fc_sni
	server domino-2 domino-2.client.com:443 cookie d2 check ssl verify none sni ssl_fc_sni

The balance roundrobin and cookie ... lines tell HAProxy to cycle through the backends for incoming connections, but to stick the client with a specific backend server based on the value of the backend cookie, if present, and then to set it in the response. That covers our sticky sessions.
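
Conceptually, the selection logic looks something like the following. To be clear, this is a toy model of the behavior described above and not how HAProxy implements it: the class name is made up, and it leaves out health checks entirely, so it never falls back when the cookie's server is down.

```java
import java.util.Arrays;
import java.util.List;

// Toy model for illustration only; not HAProxy's actual implementation
class StickyRoundRobin {
    private final List<String> serverIds;
    private int next = 0;

    StickyRoundRobin(List<String> serverIds) {
        this.serverIds = serverIds;
    }

    /**
     * Picks a server for a request.
     *
     * @param cookieValue the value of the "backend" cookie, or null if absent
     * @return the server ID to use, which is also the value to set in the cookie
     */
    synchronized String pick(String cookieValue) {
        // A known cookie value pins the client to that server
        if(cookieValue != null && serverIds.contains(cookieValue)) {
            return cookieValue;
        }
        // Otherwise, take the next server in rotation
        String chosen = serverIds.get(next);
        next = (next + 1) % serverIds.size();
        return chosen;
    }

    public static void main(String[] args) {
        StickyRoundRobin balancer = new StickyRoundRobin(Arrays.asList("d1", "d2"));
        System.out.println(balancer.pick(null)); // first new client
        System.out.println(balancer.pick(null)); // second new client
        System.out.println(balancer.pick("d2")); // returning client with a cookie
    }
}
```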

The next line, option httpchk HEAD /names.nsf?login HTTP/1.0, tells HAProxy how to check the health of the servers. This should be something very inexpensive that's also a reliable way to tell if the server is working. I went with asking for headers for the default login page - something all Domino servers (with session auth) will have and which doesn't risk running application code like / might.

The next three lines are my beloved Domino connector headers, plus the shared secret from my locking-down DSAPI filter (I mean, it's not the actual shared secret, but that's where it goes). Note that I don't need to include $WSSN to denote the requested Host value, since HAProxy passes that along by default.

Finally, there are the actual backend configuration lines. Because the load balancer is communicating with Domino via SSL, I tell it to do so and to not bother validating the certificates. Additionally, I tell it to pass along the incoming SNI hint to Domino, which, since Domino finally supports SNI, routes the request to the correct web site on the Domino side.

If you were to connect to the Domino servers via HTTP, you could snip off a bit from those lines and add http-request add-header $WSIS True above.

Conclusion

I haven't actually put this into production yet, so the details may change, but I'm thoroughly pleased that I can simplify the configuration a good deal. I've found learning about how to configure HAProxy a little less pleasant than learning about nginx, but part of that is just learning some of the terminology and how to navigate the documentation - it's all there; it's just a little arcane.

XPages: Dealing With "Cookie name X is a reserved token"

Feb 3, 2021, 10:49 AM

Tags: xpages

The other day, John Dalsgaard asked a question in the XPages Slack Community to do with an exception that a client was seeing when going to any XPage:

java.lang.IllegalArgumentException: Cookie name ""categories":"[\"performance\",\"unclassified\",\"targeting\",\"functionality\"]"" is a reserved token
	at javax.servlet.http.Cookie.<init>(Cookie.java:144)
	at com.ibm.domino.xsp.bridge.http.servlet.XspCmdHttpServletRequest.parseCookieString(XspCmdHttpServletRequest.java:349)
	at com.ibm.domino.xsp.bridge.http.servlet.XspCmdHttpServletRequest.getCookies(XspCmdHttpServletRequest.java:283)
	at com.ibm.domino.xsp.bridge.http.servlet.XspCmdHttpServletRequest.readSessionId(XspCmdHttpServletRequest.java:185)
	at com.ibm.domino.xsp.bridge.http.servlet.XspCmdHttpServletRequest.<init>(XspCmdHttpServletRequest.java:156)
	at com.ibm.domino.xsp.bridge.http.engine.XspCmdManager.service(XspCmdManager.java:256)

As the uncharacteristically-short stack trace implies, this happens long before any actual XPage code in an NSF. What's going on here is that something - possibly a too-clever-for-its-own-good script - set a cookie using a JSON value so that it can store structured data. However, this is kind of an illegal thing to do: by the spec, commas are reserved in the Set-Cookie header and, by virtue of the shared cookie-octet part of the spec, are also illegal in the client-sent Cookie header.

Who Is Wrong Here?

And actually, as I type, I'm starting to blame XPages less for this: commas in HTTP headers indicate multiple wholly-distinct values. For example, take an Accept header, like:

text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

The commas there indicate distinct values according to the HTTP spec itself, while the semicolons are just an idiom used by the Accept header.

The Cookie header doesn't make use of this meaning of the comma, instead relying entirely on semicolons for some reason. Still, HTTP-wise, it seems that a server should treat this:

Cookie: foo=[bar,baz];othercookie=hi

...as equivalent to this conceptual version:

Cookie: foo=[bar
Cookie: baz];othercookie=hi

... which then should break, as "baz];othercookie" is wildly illegal in the rules for tokens because "]" and ";" show up in the separators list.

Long story short, unencoded JSON is extremely likely to run afoul of all sorts of rules here, and ideally the browser wouldn't send a header like that in the first place.
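
If you control the code that sets the cookie, one way to stay inside the rules is to encode the JSON before storing it. Here's a minimal sketch using URL-encoding, with a check against the characters that the cookie-octet rule excludes. The class and method names are hypothetical, and URL-encoding is just one choice here; Base64 would do the job too.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration
class CookieValueEncoding {
    /** URL-encodes a raw value so it can be stored as a cookie value. */
    static String encode(String raw) {
        try {
            return URLEncoder.encode(raw, StandardCharsets.UTF_8.name());
        } catch(UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Reverses encode(String). */
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, StandardCharsets.UTF_8.name());
        } catch(UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Checks that a value avoids control characters, whitespace, non-ASCII, and " , ; \ */
    static boolean isCookieSafe(String value) {
        for(int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if(c <= 0x20 || c >= 0x7F || c == '"' || c == ',' || c == ';' || c == '\\') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String json = "{\"categories\":[\"performance\",\"unclassified\"]}";
        System.out.println(isCookieSafe(json));         // raw JSON has quotes and commas
        System.out.println(isCookieSafe(encode(json))); // encoded form avoids them
        System.out.println(decode(encode(json)).equals(json));
    }
}
```

The reading side then has to decode the value before parsing it as JSON, so this only works when both ends are yours to change.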

The Workaround

The XPages developers were aware of this, but made the fix an opt-in thing at the server filesystem level. To avoid this specific trouble, go to the "xsp" directory in your Domino program directory (not the data directory), create a file named "bootstrap.properties", and set its contents to:

xsp.commas.not.delimiters.in.cookie=true

To my knowledge, the only "documentation" that exists for this is an incidental mention in the XPages Portable Command Guide, where the property being false by default shows up in the sample output from running tell http xsp show settings on the console. Fortunately, once you know that it exists, the name is pretty self-documenting, and it does just what it says on the tin.

As with other server configuration options, I think this should be configurable at the NSF level, and should at the very least be something configurable in the data directory. Doing anything in the program directory only gives me the willies. The stack should also give a better error earlier, rather than relying on the Servlet Cookie class to balk at the malformed name.

In any event, if you have a case where you're using a library or same-domain-server app that sets a header like this, this property should help.