Java Community Antipatterns, an Ongoing Series

Published at 17:27 on 1 September 2023

To give you an idea of the general pathetic hilarity of the situation, I was reviewing some code at work today. It reads in a message from Kafka, obtains a validator object, and calls that object’s isValid() method on the message it receives. That method in turn returns a ValidationResult object, whose valid() method is then called inside an if statement.

This immediately strikes me as odd. When you validate something, it either turns out to be valid or invalid. That’s it. Two options, no more, no less. Yes/no. Black/white. On/off. There is no need to create a new data type to represent a validation result, because a perfectly appropriate data type already exists, built in to Java: the Boolean. Just use that. Far simpler and cleaner.

Maybe the ValidationResult object does something special and has extended features beyond those of a Boolean? Yes, it has a message field! But wait, that field is never accessed. The only thing that is ever done with that object is to call the valid() method, whose purpose is to return the Boolean value that should have been used in the first place.

And what of the validator object? Its class definition is very simple, just one short method that makes some basic checks. If its argument turns out to be invalid, the message part of the result is set to the string “Data is not valid.” No, I am not making this up. Of course the data is not valid, you moron! That is why the valid flag is set to false! This field conveys exactly zero meaningful information.

What other code uses this validating logic? None of it, it turns out! So there was no need for the validator class, either. Could have just added a private isValid() method inside the one (short) source file where this logic is used. Would have been a whole lot clearer, because the person reading the code wouldn’t have had to open another file to determine just what the validation logic is.

So three classes, and three source files, are being used where just one would have sufficed.
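
For illustration, here is roughly the shape I have in mind, sketched in Python rather than Java to keep it short. The names and the specific checks are hypothetical (the real checks are not reproduced in this post); the point is only that the helper lives next to its single caller and returns a plain Boolean.

```python
def _is_valid(message):
    """Private helper: hypothetical basic checks, returning a plain bool."""
    return isinstance(message, dict) and bool(message.get("id"))


def handle(message):
    """The one place the validation logic is used."""
    if not _is_valid(message):
        return "rejected"
    return "processed"
```

One file, one short function, and the reader never has to go hunting for what "valid" means.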

Now, this was a particularly egregious example, but this sort of crap-ola happens over and over (and over) again in Java code. Needless complexity everywhere.

Cutting Over to LaTeX

Published at 15:15 on 20 August 2023

The above use of StUdLy CaPs courtesy of a community somewhat enamored of them. An overly-cute quirk that for a long time made me shy away from that document preparation system. It’s silly, to be sure, and hardly the No. 1 reason. The bigger reasons were that, by that time:

  • I had already learned troff, which provides the same general functionality, and
  • I often did not have access to a laser printer, and troff shares an input language with nroff, which can produce passable output on a simpler and at the time much more common typewriter-like printer. TeX and LaTeX, by contrast, are useless if you don’t have access to a laser printer, phototypesetter, or graphics display.

Time passes. LaTeX grows to be way more popular than poor old troff. Users develop and share all sorts of macros to do just about anything you want. Those same users have many online forums that can be searched if you get stuck or puzzled. troff gains nothing equivalent. I get frustrated by troff’s inability to set text that wraps around illustrations (a standard book publishing technique).

So yeah, it was time. The final push that made me spend most of a day reading Knuth’s definitive description of his program was my resumed job search, the desire to make a résumé that breaks a few of the rules (using some of my favourite fonts), and the difficulty of getting good font support for troff. By contrast, there are modern versions of TeX that can read standard font files.

(Yes, I know there are modern versions of troff that can apparently read standard font files. The rub is, they lack certain extensions to standard troff in the version I have been using, extensions that I am making use of, so I would have to rewrite anyhow. Plus, troff still can’t format output by wrapping around illustrations. Why not rewrite in the more powerful and full-featured alternative?)

Anyhow, I’ve now learned enough of LaTeX to make it do what I want to do… for now, which was the goal.

The Go Programming Language: Too Minimalist

Published at 18:55 on 15 August 2023

I first ran into this many years ago, when Go was first released by Google and I played with it a bit. I ran into a situation where a ternary (aka conditional) operator would be handy, only to discover that it doesn’t exist in Go. OK, then, maybe I can use an if statement as an expression? Sorry, in Go if is strictly a statement and not an expression. So I sighed, introduced an extra variable into my code that otherwise would not be needed, and coded an if / then / else statement.
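
For contrast, this is what the missing feature looks like in Python, where a conditional expression picks the value inline. The function name and strings are made up for illustration; the Go version of the same thing needs a separately declared variable plus an if/else statement to assign it.

```python
def describe(count):
    # Conditional expression: the value is chosen inline, no extra statement needed.
    noun = "file" if count == 1 else "files"
    return f"{count} {noun} scanned"
```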

Then earlier this year I ran into an issue with Helm, which is written in Go. I was trying to use it to generate a YAML file from a template, and was getting tripped up because Helm was sometimes inserting newlines into the output. This can happen in Python, too, but there is a way to disable the feature. Surprisingly, there was no way to disable it in Helm, because Helm of course uses the Go YAML library, and that feature is absent there. You will get your gratuitous newlines, and you will shut up and learn to like it. Lovely.

Maybe those two experiences should have been sufficient to warn me, but no. I was determined to give Go another chance. So I decide to revisit the language by rewriting a Python program in it.

That program’s purpose is to detect and report characters and byte sequences in source code files that are likely to be troublesome. I wrote it after once wasting most of a day trying to find a Java bug caused by a Unicode zero-width space in a source file. So it ends up being a pretty good exerciser of a language’s character set conversion libraries.

Again, Go fell short in the feature department. One of the Unicode character classes is Cn, unassigned code points. Such things, being undefined in behavior by the Unicode standard, are nothing but trouble and so should be reported as troublesome. But wait! Go does have a library that defines Unicode character classes, but for some reason the Cn class is missing. It’s possible to code around this (with some lossage) so I do.
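
For comparison, here is a minimal sketch of how the same check can look in Python, using the standard unicodedata module. This is not the actual program; the function name and the choice to flag exactly the Cn and Cf categories are assumptions made for the example. Note that what counts as unassigned depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata


def troublesome(text):
    """Yield (index, char, category) for characters in suspect categories.

    Cn = unassigned code points; Cf = format characters, which is where
    the zero-width space (U+200B) lives. The category set is illustrative.
    """
    for index, char in enumerate(text):
        category = unicodedata.category(char)
        if category in ("Cn", "Cf"):
            yield index, char, category
```

Calling list(troublesome(line)) on each line of a source file is enough to pinpoint both kinds of character by position.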

Then I run into another problem: detecting invalid byte sequences in an input file. Go simply replaces these with the Unicode replacement character, code point U+FFFD. But what if a source file has such a character in it, then what? In Go, it is impossible to tell if the character is actually there (and sometimes I might want to allow it, and let it silently be accepted), or if it was inserted by the appropriate encoding.Decoder method in response to encountering an invalid byte sequence (which should definitely always be reported).

In Python or Java (and I assume most other languages), it is no problem: one can tell the decoder to signal an error on an invalid byte sequence if returning replacement characters is somehow unacceptable. Not so in Go. You will get replacement characters, and you will shut up and learn to like it.
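
A small sketch of what that looks like in Python, assuming UTF-8 input. The function name and the returned strings are invented for the example; the relevant part is errors="strict", which makes the decoder raise UnicodeDecodeError instead of substituting U+FFFD.

```python
REPLACEMENT = "\ufffd"


def check_bytes(data, encoding="utf-8"):
    """Distinguish invalid byte sequences from genuine U+FFFD characters."""
    try:
        # Strict mode: an invalid sequence raises instead of being replaced.
        text = data.decode(encoding, errors="strict")
    except UnicodeDecodeError as err:
        return f"invalid byte sequence at offset {err.start}"
    if REPLACEMENT in text:
        # Decoding succeeded, so this U+FFFD was really present in the file.
        return "genuine U+FFFD present"
    return "clean"
```

Because the two cases take different code paths, a genuine replacement character can be silently allowed while a bad byte sequence is always reported.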

It’s still possible to write the program, of course. It just ends up being unnecessarily quirky, failing sometimes to detect unassigned characters, and raising false alarms about bad byte sequences if a file contains completely valid replacement characters in it.

But why? Why choose a programming language that breeds quirks and unexpected surprises? Such things are bad, whether they arise from excessive complexity or excessive minimalism.

And these are not the only examples I have run across (both in the last few days and earlier), either; they are merely the ones that fit best into the story I have related here.

It all makes me appreciate more just what a good job Guido did in striking a good compromise between complexity and minimalism in the Python programming language.

Java Community Antipatterns, an Ongoing Series

Published at 08:35 on 12 August 2023

So I’m starting to play with the Go programming language again, mainly because in many ways it’s the anti-Java (it is not even a fully object-oriented language, by design). It was written mainly by Rob Pike, who is part of the old Bell Labs software culture that is generally skeptical of object-oriented programming, for many of the same reasons that I have come to be skeptical of it. One of those reasons is that excessive reliance on the object paradigm tends to breed unnecessary complexity.

Generally, when I have a question, I can find an answer in the documentation fairly easily. This morning, I had a build system question I could not easily find an answer to. So I decided to figure it out by looking at what the developers of a well-known, open-source software project in Go had done. I chose Helm.

It only took a few minutes of examining the source code to figure it all out. With Java, I would have pissed away half a day, easy. Instead of a Makefile, there would have been pom.xml (Maven) or build.gradle (Gradle). Both are incredibly complicated compared to Make, and both inevitably involve the use of multiple plugins that are also incredibly complicated. I would have been combing through documentation and scratching my head for hours.

Instead, boom! Answer obtained, in the space of a few minutes. The way it should be.

And Helm is not a small or simple project. In other words, despite its simplicity, Go seems to be every bit as powerful and useful a language as Java. More powerful and more useful, in fact, since it is easier to use, and one can spend time focusing on designing software instead of battling the (unnecessary) complexity of the overall programming environment.

But wait, there’s more! Just like the gratuitous complexity of the standard Java class library has proven contagious, the clean simplicity of the base Go programming environment seems equally contagious. I decided to satisfy a bit of intellectual curiosity about how Helm did something. This is something that frequently takes me hours with a Java project. But not here! Within a minute, I found the relevant bit of source code and my question was answered.

Which, again, is the way it should be.

On (Not) Being a Java Careerist

Published at 07:40 on 29 July 2023

Not to slam Java careerists. One thing they are is very smart and talented. One just has to be, in order to deal with all the gratuitous complexity bred by the traditions of that programming community.

But here’s the thing: I don’t want to devote basically all of my mental effort to doing that. I don’t want to lose my botanical knowledge, or my wide-ranging general scientific knowledge. And I would have to in order to succeed in the Java world. The mental load is just so extreme.

Even if I wanted to, I am not sure I could. I crave knowledge in a diversity of subjects. My mind would rebel, strongly, against being forced to hyperspecialize.

In a sense, this means I’m “lazy” in that I “don’t want to work very hard” at software development. But I don’t see that as necessarily a bad thing. Why should I work harder than necessary? If there is an easier way to do a good job at something, why not choose the easier way?

Is it really intelligent behavior to continue doing something in a difficult way when you are aware that an easier way exists?

This all was, in fact, something I wondered a bit about going into this job. And I decided then that if this was the case, I wouldn’t succeed at the job, wouldn’t want the job, and would end up departing from it. And so here I am.

Why I Hate Java: An Example

Published at 20:46 on 28 July 2023

Building on this entry, let us relate a little story that transpired in the past week.

About a week ago, I make a stupid error and introduce a bug into the code. Shouldn’t be a big problem; one good thing about where I work is that there is a very extensive battery of tests for things.

But this is the Java universe we are talking about. Simplicity is not appreciated as a virtue. Both the test and build frameworks are ginormous and hypercomplex. Somehow, I still do not know why, some feature got triggered that caused the test(s) that would have detected my bug to fail to run.

Because the Java universe does not appreciate simplicity, that code base itself is ginormous and hypercomplex. If the code were not written by members of such a dysfunctional programming culture, it would have been broken up into smaller, more manageable bits that communicated with each other somehow. The test log for the subset of the code I was working with would have been short enough I would have probably noticed something missing. Instead, the missing tests were buried in a little over 80,000 lines of test and build output. Can you read an 80,000 line log file without falling asleep first? I sure can’t. So I didn’t even try. Naturally, the missed tests go unnoticed.

The check-ins get rejected for other reasons, so I get to work on addressing them. Meanwhile, the whatever-it-was that caused the critical tests to get suppressed ceases to do so. So my first attempt to test the recode fails for this out-of-the-blue, off-the-wall reason. I look at my recent changes and see nothing that could cause this issue to manifest.

Not much can be done but to attempt to instrument the daylights out of the code with debug log statements and try to figure out what the heck is going on with the data as it gets operated on.

My first attempt to do so fails because the company’s network infrastructure suffers a hiccup and causes my build to fail. The company’s infrastructure is super-complex, poorly-documented, and unreliable. (Everyone else just basically accepts it because everyone else is a Java programmer and thus used to unnecessary complexity and the resulting unreliability.)

In my second attempt, I discover that for some reason the test framework suppresses all log messages. So I recode to use writes to standard output, figuring (correctly) that it won’t “intelligently” suppress those. The instrumenting turns out to be insufficient, so I add more.

Each of these iterations takes way, way longer than it should, because the code is big and bloated and complex and so takes 30–45 minutes to build. If everything was factored into smaller units, I doubt builds would take longer than 5 minutes (if that). So each iteration takes roughly 6 to 10 times longer than it should.

Finally, after at least 4 hours of effort, I locate the bug.

And this is why I hate Java. Not because of the core language itself (dated, but still not bad considering it was designed in the ’90s) or because of its runtime (still one of the best virtual machines out there), but because of the traditions of the community that uses it. A minor bug, that would have been resolved in half an hour easily, instead almost makes it into production and takes half a day to resolve.

And this happens everywhere, all the time. Everything is more difficult, more tedious, and more error prone than it should be, with a lot more busy work than there should be.

Those dysfunctional traditions are such an irritant that I have developed my own special term for them: Java community antipatterns, or JCA’s for short.

I have recently learned that I am on my way out where I work, mainly because I can’t cope with the JCA’s as well as the Java careerists. And frankly, I can’t wait till I move on. I’m already looking for another position, and it will be as far from the enterprise Java world as it can be.

Java Annoyances

Published at 07:22 on 29 May 2023

When Java first came out in the 1990’s, I gave it a try, then turned away from it. My reason was not the core language itself but its standard library, which impressed me as something of a poorly-organized and overcomplex mess.

Decades later, with some professional coding experience in that language ecosystem under my belt, that is still basically my takeaway conclusion. The worst that can be said about the core language is that it’s a bit dated (understandable, as the design is now decades old). But the overall pattern of the standard class library being awkward has extended to the language ecosystem as a whole.

Just about every third-party library for Java tends to be a special combination of big, awkward, and given that size and ponderousness surprisingly feature deficient. Take the Jackson JSON library for instance. Its current release totals just shy of 800 classes (yes, 800, I am not making this up). Yet when I tried to do something as simple and basic as generate pretty-printed output (nicely indented and formatted, with all keys in JSON objects sorted alphabetically), I couldn’t do it out of the box. (There is an ORDER_MAP_ENTRIES_BY_KEYS option, but it fails to act as advertised in all — in fact, in most — cases.) I had to write helper methods to get my output formatted as desired.

And this was after blowing most of a day poring over documentation and trying experiment after experiment attempting to get my output correct. The configuration settings in Jackson are split up amongst at least three classes, and of course the documentation for one configuration class does not mention the others. It is left as an exercise for the programmer to discover the others.

Contrast with Python, which has a simple JSON serializer and deserializer built into the language’s core library. (Jackson is a third-party library, because in Java you must use a third-party library if you want to read or write JSON; the standard Java library lacks JSON support. This, despite the standard Java library being much larger in terms of number of classes than Python’s library.) And there is no hunting through the documentation in Python: right out there in the documentation for the json module (one module, one class, one HTML page of documentation to read, that’s it) the indent and sort_keys options to json.dump are described. And the options work as advertised in all cases! What takes over a day to code in Java can be accomplished in under a minute in Python.
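
To make that concrete, here is the whole job in Python. The sample record is made up; json.dumps takes the same indent and sort_keys options as json.dump and returns a string instead of writing to a file.

```python
import json

# Made-up sample data with keys deliberately out of order at two levels.
record = {"name": "example", "tags": ["b", "a"], "meta": {"z": 1, "a": 2}}

# indent=2 pretty-prints; sort_keys=True orders keys at every nesting level.
pretty = json.dumps(record, indent=2, sort_keys=True)
print(pretty)
```

The keys come out alphabetized in both the outer and the nested object, while the list keeps its original order, since only object keys are sorted.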

Yes, Jackson can do deserialization into objects, with schema checking, and the built-in Python library cannot. That’s nice, dear. The basic functionality of being able to generate pretty-printed output out of the box seems to be missing. It’s like driving a luxury car with heated seats and a fancy entertainment system but no factory headlights or taillights, so you must add those if you want to drive it after dark.

And I run into this sort of thing over and over and over again. In the Java world, I am literally always encountering this or that use of some giant, cumbersome, poorly-documented third-party package, that compels me to waste multiple hours understanding it. Or, in most cases, just partially understanding it and still making a huge number of educated guesses about it. And because those packages also tend to be surprisingly limited in functionality, one either has to pull in more huge, cumbersome, weak libraries to make up the deficiency, or add more lines and complexity to the code base.

It all ends up sending the cognitive complexity of understanding what a Java program does into another whole universe of mental difficulty.

It’s a real shame, because as I said the core Java language really isn’t too bad at all. And the core Java runtime environment is, by any objective measure, great: garbage-collected, platform-independent, with full support for preemptive multi-threading, and with a portable graphical user interface that (with a little programmer effort) manages to replicate the native look and feel on all three of Windows, Macintosh, and Linux.

But oh, those library antipatterns. They do so much to take away from the overall experience.

And We’re Back

Published at 17:06 on 22 May 2023

The Ubuntu Linux package manager badly botched a routine upgrade and hosed my database. Thankfully I take routine backups. It just was a matter of time until I could perform the necessary restore.

Unix, the Alarm System Call, and Power Management

Published at 20:37 on 25 April 2023

And by “Unix” I include Unix-like operating systems like Linux and MacOS. In fact, my experience is limited to Linux and MacOS in this regard, but I would be surprised if the various BSD and System V Unix systems out there with automatic power management differ much.

I have a simple alarm clock/reminder script I wrote in Python. The heart of it was the following logic:

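import time

def sleep_until(then):
    # Sleep until the wall-clock time `then` (seconds since the epoch).
    delta = then - time.time()
    if delta > 0.0:
        time.sleep(delta)
        return True
    # `then` is already in the past; nothing to wait for.
    return False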

Now, the time.sleep call in Python is implemented as a call to sleep in the C standard library, which in turn is implemented via the alarm system call. All of these accept an offset in seconds, which in the former case specifies the amount of time to sleep, and in the latter the amount of time before an alarm signal is delivered to the process.

The logic above is simplicity itself, yet from time to time my reminders would come in late! Eventually, I linked it to those times when the system suspended itself due to lack of activity for a while; and my alerts were late by an amount that corresponded with the time the system was suspended. Apparently, when Unix and Unix-like systems suspend themselves, time as specified to alarm ceases to pass; that system call only counts seconds that transpire when the system is awake.

The cure is to break up the sleeping into chunks, and to repeatedly check the system clock:

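import time

MAX_SLEEP = 60.0  # longest single sleep, in seconds

def sleep_until(then):
    delta = then - time.time()
    if delta <= 0.0:
        return False
    # Sleep in short chunks, rechecking the wall clock after each one,
    # so time spent suspended is noticed within MAX_SLEEP seconds of waking.
    while delta > 0.0:
        time.sleep(min(MAX_SLEEP, delta))
        delta = then - time.time()
    return True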

At least, this seems to work. I implemented the change yesterday and alerts that spanned times when my computer was asleep got raised at the correct time. It’s a little ugly to replace a blocking call with busy-waiting like this, but although the above logic busy-waits, it still spends most of its time blocked.
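
For anyone who wants to convince themselves without suspending a laptop, here is a sketch that runs the same loop against a simulated clock. The clock and sleep parameters and the FakeClock class are additions for this illustration only, not part of my script. The simulation assumes what I observed above: a sleep only counts awake time, so a suspend makes the wall clock jump ahead during one of the sleeps.

```python
import time

MAX_SLEEP = 60.0


def sleep_until(then, clock=time.time, sleep=time.sleep):
    """Chunked wait; clock and sleep are injectable so the loop can be simulated."""
    delta = then - clock()
    if delta <= 0.0:
        return False
    while delta > 0.0:
        sleep(min(MAX_SLEEP, delta))
        delta = then - clock()
    return True


class FakeClock:
    """Simulated wall clock that jumps forward once, mimicking a suspend."""

    def __init__(self, suspend_at, suspend_for):
        self.now = 0.0
        self.suspend_at = suspend_at
        self.suspend_for = suspend_for
        self.suspended = False

    def time(self):
        return self.now

    def sleep(self, seconds):
        # The requested awake time passes...
        self.now += seconds
        # ...and, once, a suspend adds wall-clock time the sleep never counted.
        if not self.suspended and self.now >= self.suspend_at:
            self.now += self.suspend_for
            self.suspended = True


chunked = FakeClock(suspend_at=120.0, suspend_for=300.0)
sleep_until(600.0, clock=chunked.time, sleep=chunked.sleep)

naive = FakeClock(suspend_at=120.0, suspend_for=300.0)
naive.sleep(600.0)  # one long sleep, as in the original version
```

In this simulated run the chunked loop finishes with the clock at the target time, while the single long sleep finishes late by exactly the length of the suspend.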

Note that this seems to affect other programs as well. In fact, one of my motives for writing this script was the frequent failure of the Gnome clock app to issue alarms at the proper time.

Note also that this assumes the computer will be in an awake state at the time the alert is scheduled. If the computer goes to sleep and stays asleep, it will issue no alerts during the time it is asleep. Remedying this state of affairs requires privileged system calls that one must be careful making. I decided that the safety of having a nonprivileged program was worth the slight chance of a missed alert; in my case, the problem almost always happens as a result of a system suspending itself on lunch break, with the alert time being while I am at my desk in the afternoon.

Where the Rust Language Makes Sense

Published at 19:48 on 19 April 2023

Per this, I think Rust makes the most sense for things you would have otherwise written in C or C++. It is a more modern, relatively low-level, language than either of these two (and is much cleaner than C++, which was an attempt to add all sorts of extra features onto C, and which suffered as a result of having to be a proper superset of that earlier language).

If you were not going to write it in C/C++, in other words if computing resource limitations are not a constraining factor, then writing it in Rust just doesn’t make sense. Use some other programming language with automatic garbage collection, so you don’t have to worry so much about memory management.

Which means that, for anything other than embedded systems, it is generally stupid to use Rust from the ground up. Use a higher-level language like Python. If the higher-level language proves too slow or too memory-inefficient, do some profiling, find the weak links in the chain, and rewrite those in Rust instead of rewriting them in C/C++. There are already libraries out there to facilitate doing so.
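
The profiling step is cheap to do in Python. Here is a minimal sketch using the standard cProfile and pstats modules; the workload functions are stand-ins invented for the example, and profile_report is a hypothetical helper, not a library function.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Stand-in for a hot spot: a deliberately plain Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    """Stand-in for the program being profiled."""
    return slow_sum_of_squares(200_000)


def profile_report(func, limit=5):
    """Run func under the profiler and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


report = profile_report(main)
print(report)
```

Whatever dominates the cumulative-time column is the candidate for a rewrite in a lower-level language; everything else can stay in Python.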

And that is why I can’t feel much love for Rust: because I am right now not running up against any resource constraints that make Python, Java, or Kotlin impractical.