A Tantrum Is Not a Software Design Principle

Published at 09:11 on 16 June 2022

Or rather, it should not be considered one. Alas, sometimes it is.

I first ran into this when I encountered the Git revision control system. It was just so disheveled, haphazard, counterintuitive, and disorganized compared to the older revision control systems I was used to. It was even so compared to other “third generation” systems like Mercurial. It still strikes me as such, and is the main reason I self-host my personal projects using Mercurial.

Then I read that a major inspiration for Linus Torvalds to write it the way he did was just to be different from CVS, because he hated CVS. Design by tantrum, in other words.

Around a decade ago I became aware of the then-experimental Go programming language. A lot of it sounded like precisely what is needed: a more modern language than C++, without all the design cruft caused by attempting to graft a modern object-oriented language onto a non-OO systems programming language from the 1970s, that compiles down to machine code for fast execution.

But, wait. No exceptions, despite how useful they were. No generics (a deficiency that has been fixed). No conditional expression. No support for functional programming. And, due to overall design choices, no easy way to add such support. It all forced one into a very staid, unimaginative, and limited-productivity imperative programming style.

Now I read that one of the motives for making it all that way was that the language’s designers all hated C++ (I can’t blame them) so decided to make the language as unlike C++ as they could whenever faced with a decision they couldn’t sort out via alternative principles. More design by tantrum.

Not everything about CVS was bad. Just because CVS organized its commands and concepts the way it did does not make this organization bad. CVS’s problems lay in a centralized design and poor support for branching and merging. Its overall paradigm organization was actually pretty good. Unfortunately, a temper tantrum about the bad parts of CVS prevented Git’s author from appreciating its good aspects.

Likewise, not everything about C++ is bad. Just because C++ has some feature whose absence can be coded around does not mean said feature is harmful and should be deleted. C++ was, after all, motivated by a very real desire to make C into a more powerful and expressive programming language.

The problem is that choosing to focus on how bad something is can easily get one into a mindset that prevents appreciating the good aspects of it.

Resolved: while disgust at the inadequacies of some existing software system will always serve as the inspiration for attempting to create something better, a tantrum should not be allowed to be a design principle.

Making GUI Apps for Linux, Macintosh, and Windows

Published at 21:02 on 3 April 2022

Foreword

I collected this information about two years ago, intending to publish it here, yet never did. Before I lost track of it, I decided to do so this evening. Note that this information was correct as of two years ago; so there is a chance that some things have changed since then. (Though I am at present revisiting these topics, and so far I haven’t found any changes in what follows.)

Introduction

I’ve been mostly a back-end programmer and command-line guy. That, plus inertia, has caused me to not bother with supporting graphical user interfaces in the code I write.

Until recently, that is. There are a few ideas I’ve had rattling around in my head that would be useful for others, but many of those other people are not computer geeks and would not be interested in opening a Terminal or Command Prompt app just to run my command-line programs.

As a result, I’ve been learning how to make normal, “clickable” apps that a normal person would be able to run without extensive training and hand-holding. Might as well share that knowledge with others.

What follows is likely to be particularly useful to those who, like me, are using something other than the normal, approved tools to code their apps. In my case, I’m using Kotlin, because it’s a modern language with a powerful, expressive syntax and it runs under the Java virtual machine, making it much easier to port my code to different systems.

Yes, there are “standard” programs to do all of the following. With very few exceptions, I have found them to be poorly documented and geared to C/C++ developers. I found that attempting to bend these tools to my will was sufficiently difficult and painful that it was easier to forget about them and just do it all myself, mostly from scratch.

Linux (Gnome on Ubuntu)

To have an application that comes up as a clickable icon like all the other normal Gnome apps, one must install files in a number of places, primarily under /usr/share. It’s something of a mess, as the files that define the presence of a given app are scattered here and there, not collected in one place as they are on a Mac.

The easiest way to cope with this state of affairs is to do what everyone else does: make a Debian package (Ubuntu is Debian under the hood). Thankfully, .deb files are pretty simple: an ar archive with three members:

debian-binary
This consists of the string “2.0” followed by a newline. That’s it. It must be the first member of the archive.
control.tar.gz
Files that control the installation. The control (yes, control.tar.gz contains a file named control) file is the only mandatory one, though an md5sums one is highly recommended. This must be the second member of the archive.
data.tar.gz
The files to install. They all are relative paths (to root). I.e. if your package has an executable to be installed as /usr/share/program it will be in here as usr/share/program. This must be the third and final member of the archive.

Note that all files must be stored with the correct ownership information (usually as root, the superuser).

In order for a Linux application to look like a normal, clickable Gnome app, and for lintian to not complain a lot about your package, you need to have a number of files installed (i.e. present in data.tar.gz). For the purpose of this list, name refers to your application’s name, folded to lower case:

usr/bin/name
Most clickable apps on Ubuntu are also installed so that they may be invoked via the command line.
usr/share/applications/name.desktop
This is the main file whose presence enables your application to show up on the desktop.
usr/share/doc/name/changelog.gz
A description of the changes made to the package. This is compressed from a file that has a nasty, column-sensitive format; it is described in detail below.
usr/share/doc/name/copyright
A copyright message. If it refers to one of the standard licenses described in /usr/share/common-licenses, you should not include the full license terms in this file but instead end it with a reference to the common license.
usr/share/icons/hicolor/48x48/apps/name.png
At a minimum, you must define a 48 × 48 icon in the hicolor theme. The name of this icon file must match the name of the icon described in the .desktop file; for sanity’s sake, just use your application name.
usr/share/name
If your program has any data files, these go in this directory. For example, the .jar and .class files for a Java application will live here.
usr/share/man/man1/name.1.gz
There should be a manual page for your application, which should be in compressed form.
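As a rough illustration of the .desktop file in the list above, here is a minimal Python sketch that generates one. The key names come from the freedesktop.org Desktop Entry format; the function name desktop_entry, the example application "myapp", and the Utility category are placeholders invented for this example.

```python
def desktop_entry(name, display_name, comment=""):
    """Return the text of a minimal .desktop file for an application.

    `name` is the lower-case application name used for the executable
    (usr/bin/name) and the icon (name.png), as in the list above.
    """
    lines = [
        "[Desktop Entry]",
        "Type=Application",
        f"Name={display_name}",
        f"Exec={name}",         # looked up on PATH, i.e. usr/bin/name
        f"Icon={name}",         # icon name without the .png extension
        "Terminal=false",       # a clickable app needs no terminal window
        "Categories=Utility;",  # placeholder category; pick one that fits
    ]
    if comment:
        lines.insert(3, f"Comment={comment}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical application called "myapp".
    print(desktop_entry("myapp", "My App", "An example application"), end="")
```

The returned text would be saved as usr/share/applications/name.desktop inside the data tree.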

It is best to use dpkg to inspect a few .deb files for programs similar to yours to get an idea of what you need to define. A good source of such files can be the system package cache, /var/cache/apt/archives.

Once you have a directory tree with the files you need, it is a simple enough matter to use tar and ar to create a .deb file:

echo 2.0 > debian-binary
cd data
find * -type f -print0 | xargs -0 md5sum > ../control/md5sums
tar -c -z --owner=0 --group=0 -f ../data.tar.gz *
cd ../control
tar -c -z --owner=0 --group=0 -f ../control.tar.gz *
cd ..
ar r name.deb debian-binary control.tar.gz data.tar.gz
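For build setups where tar and ar are not handy, the same steps can be sketched in Python using only the standard library. This is an illustration under assumptions, not a replacement for the commands above: the function names are invented here, the ar container is written by hand from its well-known 60-byte member header layout, and the rule "files under usr/bin/ are executable" is a simplification chosen for the example. Directory entries, which real packages normally include in the tar files, are omitted for brevity.

```python
import hashlib
import io
import tarfile


def ar_member(name, data):
    """Return one ar archive member: a 60-byte header followed by the data."""
    header = (
        f"{name:<16}"       # member name, space padded
        f"{0:<12}"          # modification time (0 keeps the output reproducible)
        f"{0:<6}"           # owner uid
        f"{0:<6}"           # group gid
        f"{'100644':<8}"    # file mode, octal
        f"{len(data):<10}"  # size of the data in bytes
        "`\n"               # header terminator
    ).encode("ascii")
    padding = b"\n" if len(data) % 2 else b""  # members are 2-byte aligned
    return header + data + padding


def tar_gz(files):
    """Pack {path: bytes} into a gzip-compressed tar owned by root."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for path, content in sorted(files.items()):
            info = tarfile.TarInfo(path)
            info.size = len(content)
            info.uid = info.gid = 0
            info.uname = info.gname = "root"
            # Simplifying assumption: only files under usr/bin/ are executable.
            info.mode = 0o755 if path.startswith("usr/bin/") else 0o644
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def build_deb(control_text, data_files):
    """Return the bytes of a .deb holding the given control text and files."""
    md5sums = "".join(
        f"{hashlib.md5(content).hexdigest()}  {path}\n"
        for path, content in sorted(data_files.items())
    )
    control_files = {
        "control": control_text.encode("utf-8"),
        "md5sums": md5sums.encode("utf-8"),
    }
    members = [
        ("debian-binary", b"2.0\n"),                # must come first
        ("control.tar.gz", tar_gz(control_files)),  # must come second
        ("data.tar.gz", tar_gz(data_files)),        # must come last
    ]
    return b"!<arch>\n" + b"".join(ar_member(n, d) for n, d in members)
```

The result of build_deb can be written to name.deb in binary mode and then checked with lintian like any other package.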

After creating your .deb package, it is strongly recommended that you use lintian --info to check it. In general, you should be concerned about anything flagged at the E (error) level, and at least make an effort to reduce the number of W (warning) level messages that lintian reports.

One thing I don’t worry about is complaints from lintian about the manual pages not being compressed to the maximum level: I use an Apache Ant task, not the gzip utility, to generate my compressed files, and it has no option to select maximum compression. It makes no difference to the system, and the amount of space saved by using maximum compression over the normal level is insignificant.

The Changelog File

Welcome to the year 1967. You are at a keypunch machine preparing a data deck for an IBM 360 program. Be sure to follow the correct rules for which records get punched starting in which column, or you will be rewarded with lots of error codes IEBLINTIAN later!

Your changelog deck consists of a series of cards, which contain groups of records that describe changes made to your program. Each group of records must begin with a card punched starting in column 1, of the following format:

name (version) distribution; urgency=urgency

Things in bold should be typed verbatim; things in italics should be replaced with something appropriate:

name
The name of this package, consistent with the name used elsewhere
version
Pretty obvious. The version number.
distribution
Until and unless your package becomes a well-established part of the core distribution, this should probably be unstable.
urgency
This will usually be low.

Following this card, you may optionally have a blank card. The details of the change are introduced by a card with an asterisk punched in column three (columns one and two must be blank); the remaining columns on this card describe the change. If describing the change takes more than a single card, subsequent continuation cards are punched starting in column five.

After the description comes an optional blank card, followed by the card defining the programmer and date. This is done by punching hyphens in columns 2 and 3; the card has the following overall format:

 -- Joe Coder <joe@coder.com>  Thu, 30 Apr 2020 13:16:43 -0700

Note that the e-mail address must be in angle brackets, and there must be two blanks separating it from the date, which must be punched in RFC2822 format (the same as reported by the date -R command).
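Rather than counting columns by hand, the rules above can be captured in a few lines of code. The following Python sketch is one way to do it; the function name changelog_entry and its parameters are invented for this example, and it assumes every change description is a non-empty string whose continuation lines are separated by newlines. The date is produced by the standard library's email.utils.format_datetime, which emits the same RFC 2822 form as date -R.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def changelog_entry(name, version, changes, maintainer, email, when,
                    distribution="unstable", urgency="low"):
    """Format one changelog entry using the column rules described above."""
    lines = [f"{name} ({version}) {distribution}; urgency={urgency}", ""]
    for change in changes:
        first, *rest = change.splitlines()
        lines.append(f"  * {first}")                  # asterisk in column 3
        lines.extend(f"    {line}" for line in rest)  # continuations start in column 5
    lines.append("")
    # Hyphens in columns 2 and 3; two blanks between the address and the date.
    lines.append(f" -- {maintainer} <{email}>  {format_datetime(when)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical package and maintainer, with a fixed time zone of UTC-7.
    when = datetime(2020, 4, 30, 13, 16, 43, tzinfo=timezone(timedelta(hours=-7)))
    print(changelog_entry("myapp", "1.0", ["Initial release."],
                          "Joe Coder", "joe@coder.com", when), end="")
```

The datetime passed in should carry a time zone, so that the numeric offset at the end of the line is correct.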

Macintosh

A Macintosh application (technically, an “application bundle”) is actually a directory whose name ends in .app and which contains but a single subdirectory, Contents. That subdirectory in turn must contain a number of files:

Contents/Info.plist
This is an XML document in a specific form, which is described here. If you are developing a Java application, see “Info.plist Java Notes” below. Apple provides a command-line program, /usr/libexec/PlistBuddy, which can be useful when generating or reading this file.
Contents/MacOS/something
This is the executable file for your program. It is OK for it to be an executable script in one of the standard MacOS scripting languages (e.g. a bash script). Its name must match whatever something you chose to associate with the CFBundleExecutable key in Info.plist.
Contents/PkgInfo
I have not been able to find much information on this file, but I believe it helps the Mac associate this application with certain file types. In the general case (no special associations required) it suffices to set the contents of this file to APPL????.
Contents/Resources/something.icns
The application’s icons. See “Preparing Icons” below for more details on how to create this file. This file’s name must match whatever something you chose to associate with the CFBundleIconFile key in Info.plist.

That’s it for the mandatory contents. Any directory whose name ends in .app and contains the above structure should be recognized as a clickable application by the Macintosh. It is, of course, common for applications to contain read-only data, which is also contained inside the app bundle. For example, .jar and .class files for Java applications can be stored in Contents/Java. Applications can also use non-reserved keys in Info.plist to store configuration information and other data.
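To make the layout concrete, here is a small Python sketch that creates the skeleton of an application bundle. It is only an outline under assumptions: make_app_bundle and its parameters are names made up for this example, the executable and icon are both simply named after the application, and the .icns file itself is not created (it would have to be copied into Contents/Resources separately).

```python
import os
import plistlib
import tempfile


def make_app_bundle(parent_dir, app_name, launcher_script):
    """Create parent_dir/app_name.app and return its Contents directory."""
    contents = os.path.join(parent_dir, f"{app_name}.app", "Contents")
    os.makedirs(os.path.join(contents, "MacOS"))
    os.makedirs(os.path.join(contents, "Resources"))

    info = {
        "CFBundleName": app_name,
        "CFBundleExecutable": app_name,          # must match the file in MacOS/
        "CFBundleIconFile": f"{app_name}.icns",  # must match the file in Resources/
        "CFBundlePackageType": "APPL",
    }
    with open(os.path.join(contents, "Info.plist"), "wb") as f:
        plistlib.dump(info, f)  # plistlib writes the XML format by default

    with open(os.path.join(contents, "PkgInfo"), "w") as f:
        f.write("APPL????")

    executable = os.path.join(contents, "MacOS", app_name)
    with open(executable, "w") as f:
        f.write(launcher_script)
    os.chmod(executable, 0o755)  # the launcher must be executable
    return contents


if __name__ == "__main__":
    # Build a throwaway bundle in a temporary directory and show its files.
    with tempfile.TemporaryDirectory() as tmp:
        make_app_bundle(tmp, "MyApp", "#!/bin/bash\necho hello\n")
        for root, _dirs, files in os.walk(tmp):
            for file_name in files:
                print(os.path.relpath(os.path.join(root, file_name), tmp))
```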

Info.plist Java Notes

Apple has a standard way of storing Java-specific information in Info.plist, under a Java element in the top-level dictionary. Unfortunately, using it will cause a Mac to attempt to run the app using a very old, Apple-customized Java runtime that isn’t even present on most Macs. Your users will see a dialog with a bunch of blather about the “legacy Java runtime” being needed, and even if they follow Apple’s suggestion and download that runtime, it is likely that they won’t be able to run your application, because it is so obsolete.

Therefore, as Groucho Marx said about the doctor’s response to the patient who complained “It hurts when I do this,” don’t do that. I follow the Oracle convention of putting the entry-point class in JVMMainClassName, a package-relative path to my Jar file in JVMClassPath, a Java version specification in JVMVersion, and any extra options to pass to the Java environment in a JVMOptions array.

Actually, it doesn’t much matter. I could have come up with unique keys of my own (so long as they didn’t clash with any official Apple ones), and it would have worked just as well. MacOS is blissfully unaware of the significance of those JVM… tags, and simply ignores them. They are meaningful only to my defined CFBundleExecutable script, which has been coded to look in Info.plist for some of its options.
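To show how a launcher might turn those keys into a command line, here is a hedged Python sketch. It is not the actual launcher script, which is not shown here; java_command is an invented name, the class path is assumed to be relative to the Contents directory, and JVMVersion is ignored entirely because checking the installed Java version is outside the scope of the example.

```python
import posixpath


def java_command(info, contents_dir, args=()):
    """Build a java command line from the JVM... keys of an Info.plist dict.

    `info` is the dictionary obtained by loading Info.plist (for instance
    with plistlib.load); `contents_dir` is the bundle's Contents directory.
    """
    command = ["java"]
    command += info.get("JVMOptions", [])  # extra JVM options, if any
    # Assumption: JVMClassPath is relative to the Contents directory.
    command += ["-cp", posixpath.join(contents_dir, info["JVMClassPath"])]
    command.append(info["JVMMainClassName"])  # entry-point class
    command += list(args)                     # the application's own arguments
    return command


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    example = {
        "JVMMainClassName": "com.example.Main",
        "JVMClassPath": "Java/app.jar",
        "JVMOptions": ["-Xmx256m"],
    }
    print(" ".join(java_command(example, "/Applications/MyApp.app/Contents")))
```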

Preparing Icons

Mac application icons are stored in .icns files. These are actually a collection of multiple icons, defined for a variety of sizes. To create such a file, you must create a directory whose name ends in .iconset, and populate it with PNG images containing your icon in various sizes, as described here. Then use iconutil to generate a .icns file; assuming your directory was named name.iconset, you would type:

iconutil -c icns name.iconset
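The PNG files inside the .iconset directory follow a fixed naming pattern: five base sizes, each with a double-resolution "@2x" variant. As far as I know these are the names iconutil looks for; the short Python sketch below (iconset_files is a name invented for this example) simply lists each expected file name together with the pixel size the image should have.

```python
def iconset_files():
    """Return (file name, size in pixels) pairs for a complete .iconset."""
    files = []
    for points in (16, 32, 128, 256, 512):
        files.append((f"icon_{points}x{points}.png", points))
        # The @2x variant has the same nominal size but twice the pixels.
        files.append((f"icon_{points}x{points}@2x.png", points * 2))
    return files


if __name__ == "__main__":
    for file_name, pixels in iconset_files():
        print(f"{file_name}: {pixels} x {pixels} pixels")
```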

Windows

A clickable app on Windows is simply an executable (.EXE) file that contains an embedded icon which Windows will recognize and display.

If you create a jar file and set Main-Class in its manifest to point to the class containing your application’s entry point, Windows considers it to be an “executable jar” and will launch your application when you click on the jar file. That’s almost as good as having a proper executable with an embedded icon, but it doesn’t have an embedded icon, so your app will display using the default icon that all jar files get.

The solution is to install Launch4j and use it to create a Windows executable with your icon of choice embedded in it. This is a free program, and I have found it to be well-documented.

If your application is sufficiently complex as to require a bunch of support files under Windows, then you will need to create a Microsoft Installer file. My apps have so far been simple enough not to require this, so I don’t have much help to offer in this regard yet.

Creating an Executable JAR for Unix and Linux

Published at 20:46 on 21 March 2022

One of the annoying things about any JVM language is that to run the result of compiling your code, you have to type something like:

java -cp somefile.jar domain.name.name2.SomeClass arg1 arg2 …

Or at best:

java -jar somefile.jar arg1 arg2 …

Wouldn’t it be great if you could just type the command name followed by arguments, like you can do with a compiled C or C++ program? The normal way to do this is to write a shell script and make it executable, but this is a tad clunky (now there are two files, the shell script and the JAR that it invokes). It would be nicer to have just a single executable.

Well, you can!

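# Single quotes keep an interactive shell from treating "!" as history expansion.
# The -S option makes env split "java -jar" into separate words, which Linux needs.
echo '#!/usr/bin/env -S java -jar' > somename
cat somefile.jar >> somename
chmod +x somename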

And that is it! You now have an executable binary that is an archive of Java bytecode instead of native machine code. (Of course, it requires a suitable java interpreter to exist on your PATH.)

Best of all, while all of this sounds hackish, it is not just luck that a JAR file with some leading junk tacked on to it is still treated as a valid JAR file. No, this is basically guaranteed to work. You see, JAR files contain their header data at the end, not the beginning, and Java simply ignores all data earlier than what is described in the header.

And since the Macintosh is just a UNIX system under the hood, this trick works for Macs.
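The claim about leading junk is easy to check, since a JAR file is a ZIP archive. The Python sketch below uses the standard zipfile module rather than the JVM's own JAR reader, so it only illustrates the property of the format, not Java's behaviour; the helper names and the sample manifest are invented for the example.

```python
import io
import zipfile


def make_archive(files):
    """Return the bytes of a ZIP archive holding {name: data}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buf.getvalue()


def prepend_bytes(prefix, archive_bytes):
    """Return the archive with arbitrary bytes tacked on the front."""
    return prefix + archive_bytes


def list_names(archive_bytes):
    """List member names; the directory is located from the end of the data."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return archive.namelist()


if __name__ == "__main__":
    jar = make_archive({"META-INF/MANIFEST.MF": b"Main-Class: Demo\n"})
    combined = prepend_bytes(b"#!some launcher line\n", jar)
    print(list_names(combined))
```

The archive's directory lives at the end of the file, so the reader still finds every member even though the data no longer starts with a ZIP signature.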

Revisiting the Eclipse IDE

Published at 12:21 on 7 January 2022

It’s the officially recommended IDE of choice where I work, so I decide to give Eclipse another try, despite my history of bad experiences with it.

Fairly early on, it hangs. Hard. I kill it, and relaunch. Eclipse proudly announces its workspace is now corrupted, and exits.

So I use IDEA (the allowed alternate) instead. As a bonus, it is more familiar to me, due to sharing a code base with Android Studio. A few days later, I learn that’s what most developers use here. The official encouragement to use Eclipse is mostly a show to keep licensing costs for IDEA down.

Finally Cut the Cord to Leaseweb

Published at 20:12 on 23 October 2021

The subject says it all. Yesterday, I finally cut the cord to Leaseweb. I had moved off their servers some months ago, but was still using their DNS resolution services. As of yesterday, no more.

Leaseweb is neither better nor worse than most shared hosting services. It shares the same obnoxious feature of all of them, namely, a laughably obsolete (as in at least five and typically closer to ten years behind the state of the art) software platform. Unless you want to run the most vanilla PHP-based frameworks (and even those typically plead with you to upgrade, which you can’t, because you don’t control that aspect of your service), forget it.

In my experience, if you want the freedom to be the least bit creative, you really need at least your own virtual host, i.e. bare metal or emulated bare metal where you have absolute control over all software from the operating system on up. Anything else leaves you hostage to someone else’s ambition, or should I say the lack of it. Why should they upgrade anything merely for your sake?

So long as their creaky old shared hosts can run a semi-recent version of WordPress, they don’t care. And apparently, neither do most of their customers. Most of them are probably only faintly aware that Python or Java frameworks, or newer PHP frameworks, even exist.

It is made all the worse by how painful (and thus costly to the service provider) it can be to upgrade Linux by a major revision. The last time I tried doing that, the upgrader made such a mess of things that I ended up wiping it all and starting again from a blank slate.

For all these reasons, shared hosting just seems to inevitably trend downmarket.

Careful on Social Media

Published at 14:57 on 6 October 2021

I have no disagreement with the contention that today’s social media monopolies represent a menace to society. Indeed they do.

Rather, I wish to offer a word of caution about the response to this menace. If Congress is not careful, it might well do something that makes it easier for the Federal government to micromanage pretty much any firm (or individual) with an Internet presence. Pair that with the likelihood of a second Trump term, and you have a disaster in the making: a fascist government being handed a new tool for repression of political speech.

Whatever is done about the likes of Facebook and Twitter needs to be done in a way that keeps the government out of the job of deciding what sort of speech gets allowed online.

So Much Stupidity

Published at 09:56 on 30 July 2021

Trees Are Stupid

Not the living beings (those are amazing), the data structures. The things my undergrad CS teachers were obsessed with assigning tedious programming exercises to implement.

I am reviewing how to pass those stupid coding tests most interviewers seem to be so fond of these days, and one of the things that has become clear is just why I never much liked trees in the first place. In short, if you use trees, you are virtually always stuck with two choices: a simple, logical, easy-to-understand tree that is vulnerable to pathological behavior, or a complex, quirky, difficult-to-understand one that literally makes buggy code inevitable.

Yet my professors pissed away so much time blathering about trees and how to code them. How many times have I used, I mean actually used, such knowledge professionally? Maybe once or twice in my decades-long career. Not surprising, given their many disadvantages.

Consider associative arrays (a.k.a. dictionaries or hash maps), which all modern programming languages support to some degree. These are versatile general-purpose data structures that eliminate much low-level grunt work. A huge part of the reason I so seldom use trees is that I just use associative arrays instead.

Ah, but don’t these use trees under the hood? No, generally they do not. They use hashing functions, arrays, and linked lists. Why? Because you typically get faster access that way, and they are simpler to code (and therefore have fewer chances of implementation bugs), that’s why. The people who code language runtimes and standard libraries are not stupid.
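For the curious, here is roughly what "hashing functions, arrays, and linked lists" amounts to, as a toy Python sketch. It is a teaching illustration only: ChainedHashMap is an invented name, ordinary Python lists stand in for the linked lists, and real runtimes differ in the details (some use other collision strategies altogether).

```python
class ChainedHashMap:
    """A toy associative array: an array of buckets, each a chain of pairs."""

    def __init__(self, bucket_count=8):
        self._buckets = [[] for _ in range(bucket_count)]
        self._size = 0

    def _bucket(self, key):
        # The hash function picks which bucket (array slot) a key lives in.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # replace the existing entry
                return
        bucket.append((key, value))
        self._size += 1
        if self._size > 2 * len(self._buckets):
            self._resize()  # keep the chains short

    def get(self, key, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default

    def _resize(self):
        pairs = [pair for bucket in self._buckets for pair in bucket]
        self._buckets = [[] for _ in range(2 * len(self._buckets))]
        for key, value in pairs:
            self._bucket(key).append((key, value))

    def __len__(self):
        return self._size
```

There is no rebalancing step anywhere; the only maintenance is the occasional resize.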

Yes, even in what is alleged to be the canonical “trees excel at this” case, they actually don’t. Not really.

Sure, trees have some genuine uses in databases. But here’s the thing about databases: very few people write them. This is because few people need to write them. The rest of us simply use them, and there is already a very nice set of databases (well debugged) out there ready to be used. Why re-invent the wheel, particularly when it would involve so much tedious, bug-prone coding?

So it’s not that trees are useless, it’s just that they are far less useful than their prominence in the undergraduate computer science curriculum would indicate.

Why are they so common, then? One must consider the general purpose of higher education: to furnish to the economic system individuals that are screened and graded for both intelligence and obedience to authority. Seeing who is motivated to perform meaningless, tedious programming exercises is a great way to do this. As a bonus, many of the more sophisticated tree structures have the advantage of requiring algorithms that are both non-trivial enough to give students a good coding exercise yet trivial enough for them to be expected to code in the first place, thus making them a good source of busywork.

Job Interview Coding Tests Are Stupid

They are stupid precisely because they inevitably want you to show off your tree-coding knowledge. But as we have seen, in the real world, you don’t directly touch trees. You use a database or a hash map, and are done with it. This goes for a surprising amount of the generalized stuff they teach you in college, in fact.

Seeing the world in terms of general-purpose principles can often be positively harmful in programming. Some of the most useful code I have written for people is code that was proclaimed by others to be impossible to write, because it was an instance of a generally impossible problem to efficiently solve. While this was true in the general case, a little introspection into the particular instance of the problem at hand would reveal it had specific attributes that made a solution possible. The general case would break my code, but that was irrelevant because my code was not running against the general case. A solution was possible.

In one case, there was so much resistance from others in the team that I had to sneak my code into the system. It was only some weeks later that I observed how things were “mysteriously” behaving better, and then pointed out the reason to my incredulous teammates, who were at that point compelled to concede that the problem was solvable after all.

Success in real-world coding is based on being able to determine the unique and specific characteristics of a particular problem, how these differ from the more general cases of problems, and how to leverage these particulars to craft specific solutions that are significantly more efficient than any stock algorithms or data structures could ever be.

I will make an exception here for the coding test that got me hired at the best job I have ever had (at least for the initial several years, until both the job and the company changed to the point that I was no longer well-suited to the position). It was crafted to have such realistic special characteristics, and it was my quick spotting of these that impressed my interviewers.

But that was the exception that proved the general rule: interview coding tests are stupid.

Done?

Published at 11:16 on 6 July 2021

Is the process of cutover to my new hosting solution (i.e. self-hosted) done? We shall see.

One wrinkle is that my self-hosted email server seems to be DNS blackholed. Hopefully I can resolve that. This is a virtual host, and the IP address it possesses may have been used by an incautious or abusive site in the past. Unfortunately, it is not possible for me to preserve my old, known-reputable IP address. This is yet another instance of a problem where abusive Internet users cause headaches for the vast majority of non-abusive users.

Update. Almost done, it turns out. The emails from the new server are being rejected by both Apple and Google, because my new static IP address is for some reason on a blacklist. Guilty until proven innocent, oh joy. Now I must argue to have my address un-blacklisted. Mostly I blame spammers and not Apple or Google; I have used such blacklists myself in the past and may well do so again in the future. Abusers of the Internet have ruined so much of it for honest users.

Done?

Published at 19:03 on 19 June 2021

I might (finally) be (almost) done with this blog upgrade. We shall see.

I will say that WordPress does not make it easy to move a blog to a new server. There is a defined process for purportedly doing this. Pity it does not work very well:

  1. It has a rather silly 2 megabyte limit, one which the recommended process does not remove. I eventually found a post from another user who had fought the same battle that explained how to do it, but wait, there’s more.
  2. The pathetic P.O.S. does not preserve permalinks! Congratulations, every internal link to another blog post is now broken.
  3. For some reason, it gratuitously replaces paragraph breaks with line breaks. WTF?

This is more than a little bit disappointing, because one would expect WordPress to be able to properly import from another site running the exact same software. I’m lucky, because I’m a computer geek. My solution was to go into the MySQL command prompt and do some database surgery:

  1. Make sure the two blogs are running the exact same version of WordPress.
  2. Drop all existing tables.
  3. Restore from a database dump taken on the old system.
  4. Go into the wp_options table and set siteurl and home to reflect the blog URL on the new host.
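Step 4 boils down to a single UPDATE statement. The Python sketch below shows its shape; to stay self-contained it runs against an in-memory SQLite table that merely stands in for the real MySQL wp_options table, and the function name and URLs are placeholders. With an actual MySQL driver the parameter marker would typically be %s rather than ?.

```python
import sqlite3

# option_name and option_value are the relevant wp_options columns.
UPDATE_SITE_URL = (
    "UPDATE wp_options SET option_value = ? "
    "WHERE option_name IN ('siteurl', 'home')"
)


def point_blog_at(connection, new_url):
    """Set the siteurl and home options to new_url; return the rows changed."""
    cursor = connection.execute(UPDATE_SITE_URL, (new_url,))
    connection.commit()
    return cursor.rowcount


if __name__ == "__main__":
    # Stand-in table with made-up contents, for demonstration only.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE wp_options (option_name TEXT, option_value TEXT)")
    db.executemany(
        "INSERT INTO wp_options VALUES (?, ?)",
        [("siteurl", "http://old.example"), ("home", "http://old.example")],
    )
    print(point_blog_at(db, "https://new.example/blog"), "rows updated")
```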

The average user, however, would be S.O.L. Most people don’t even know what SQL is, let alone know enough of both it and general database design principles to be able to engage in the sort of hackery I just did. They would be stuck not being able to move their blog, stuck with a broken, damaged new blog, or stuck with the lengthy and painful job of repairing their damaged new blog completely by hand.

Ubuntu 20.04 LTS: Installing MySQL

Published at 17:56 on 15 June 2021

This is probably going to be part of a series about the curve balls Ubuntu 20.04 LTS throws at the veteran Ubuntu Linux user.

When you install MySQL with:

apt-get install mysql-client mysql-server

You will get an oddly-configured MySQL server that uses a newfangled thing called auth_socket authentication for the root user. The upshot is that you will not be able to log in to mysql as root unless you are already the root Linux user, and in the latter case you will always be able to log in, regardless of what password you supply, or even if you supply a password at all.

If, like me, you are logged in as the root Linux user (and why wouldn’t you be, if you are doing a system install), then it appears as if authentication is completely disabled, and your mysql server’s root account is wide-open. At that point, you will try doing Internet searches to uncover the cause of the problem, and if you are like me, you will spend hours trying different keyword variations and finding exactly nothing pertinent.

The fix is to change the root user to use caching_sha2_password authentication and set a password for it, e.g.:

ALTER USER 'root'@'localhost' IDENTIFIED WITH 'caching_sha2_password' BY 'iSpQ7U9c8kGz';
FLUSH PRIVILEGES;

(And no, that is not my actual root password.)