Take That, iPhone!

Published at 15:01 on 11 April 2024

After a recent update, my iPhone started letting me control my headphone volume from threshold-of-pain loud to insane instant-deafness loud. I guess some aging hipster who ruined his hearing going to too many rock concerts without hearing protection got appointed to a QC position at Apple.

Based on what I had read about human sound perception, I guessed I needed about 6 dB of attenuation to tame the thing. A simple matter of adding four resistors to the picture (two for each channel, one in series to cut the voltage in half, and another in parallel to restore the impedance the audio amplifier sees to what used to be pre-attenuator).
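The resistor arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 32-ohm headphone impedance is a made-up figure (the post does not give one), real headphone impedance varies with frequency, and the function names are hypothetical.

```python
import math

def l_pad(load_ohms, voltage_ratio=0.5):
    """Return (series, shunt) resistor values for one channel.

    series goes between the amplifier and the headphone; shunt goes
    across the amplifier output, in parallel with series + headphone.
    """
    # Voltage divider: voltage_ratio = load / (series + load)
    series = load_ohms * (1 / voltage_ratio - 1)
    # The amplifier would now see series + load. A shunt resistor in
    # parallel with that branch brings the total back down to load_ohms.
    branch = series + load_ohms
    shunt = branch * load_ohms / (branch - load_ohms)
    return series, shunt

def attenuation_db(voltage_ratio):
    """Convert a voltage ratio to decibels."""
    return 20 * math.log10(voltage_ratio)

HEADPHONE_OHMS = 32.0  # assumption, not a figure from the post

series, shunt = l_pad(HEADPHONE_OHMS)
print(f"series: {series:.0f} ohms, shunt: {shunt:.0f} ohms")
print(f"attenuation: {attenuation_db(0.5):.2f} dB")
```

Halving the voltage works out to just over 6 dB, and two resistors per channel gives the four mentioned above.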

The worst part about it was all the fiddly soldering (those connectors have some tiny terminals). But it works, and 6 dB was indeed the correct amount of attenuation needed to restore sanity to the device.

Spreadsheets Suck, Here’s Why

Published at 23:31 on 2 April 2024

It’s the math.

Specifically, they all (at least all the leading ones: Microsoft Excel, Apple Numbers, and LibreOffice Calc) use floating point numbers and arithmetic.

Fractional radix digits can only represent exactly those fractions whose denominators have no prime factors other than the prime factors of the radix. For the sort of base 10 numbers we are familiar with, this means that any fraction whose denominator is a product of 2’s and 5’s can be represented exactly. As an example, 8 factors to 2³, so eighths can be represented with complete accuracy as decimal fractions. It takes three digits, of course, because 10³ (2³ ✕ 5³) is the lowest power of 10 that is an exact multiple of 8, but you can do it. And really, three digits isn’t that bad.

If you use a denominator that cannot so be represented, then you get an infinitely-long repeating fractional part. The canonical example of this is ⅓ turning into 0.3333333….

But computers use base 2, not base 10, and this creates a problem. 2 is itself prime, so fractional radix digits in binary notation can only represent denominators that are powers of 2 accurately, and nothing else. Everything else turns into a number with an infinitely-long repeating fractional part.
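The rule is easy to check mechanically. The following is a minimal Python sketch (the function name terminates is made up for illustration): it strips out of the denominator every factor it shares with the radix, and the fraction has a finite expansion only if nothing is left over.

```python
from math import gcd

def terminates(denominator, radix):
    """True if 1/denominator has a finite expansion in the given radix."""
    d = denominator
    g = gcd(d, radix)
    while g > 1:
        # Remove every copy of the shared factor from the denominator.
        while d % g == 0:
            d //= g
        g = gcd(d, radix)
    # Anything left over is a prime factor the radix does not have.
    return d == 1

for d in (3, 4, 5, 8, 10, 100):
    print(d, "base 10:", terminates(d, 10), "base 2:", terminates(d, 2))
```

Eighths terminate in both bases, thirds in neither, and tenths and hundredths terminate in base 10 but not in base 2.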

This is a big problem, because one of the most common uses of spreadsheets is financial calculations, and binary floating point numbers can only represent fractions of a dollar accurately in steps of 25 cents. If a financial quantity does not end in .00, .25, .50, or .75, your spreadsheet is representing it wrong! Only slightly wrong, of course, but still wrong. And if you add and subtract enough numbers, eventually the result will be wrong by a penny or two.
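The claim about quarters can be verified with Python’s fractions module, which can recover the exact value a float really holds. This is a small sketch (exact_in_binary is a hypothetical helper name), and it assumes the usual IEEE 754 double-precision floats that Python uses on essentially every platform.

```python
from fractions import Fraction

def exact_in_binary(cents):
    """True if cents/100 dollars is stored exactly as a Python float."""
    # Fraction(float) yields the exact binary value held by the float;
    # Fraction(cents, 100) is the true decimal amount.
    return Fraction(cents / 100) == Fraction(cents, 100)

exact = [c for c in range(100) if exact_in_binary(c)]
print(exact)  # [0, 25, 50, 75]
```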

It is for this reason that banks use decimal arithmetic, not a processor’s built-in floating point arithmetic, for their financial calculations. They don’t want their customers’ balances to drift from reality by a few pennies per year. Banks have done this since just about forever. COBOL, one of the oldest high-level programming languages out there, and designed for business computing, uses decimal arithmetic by default, and this is why.
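To make the drift concrete, here is a minimal Python sketch that adds ten dimes both ways. It uses Python’s decimal module as a stand-in for the decimal arithmetic described above; it is not meant to show how any bank or COBOL runtime actually implements it.

```python
from decimal import Decimal

float_total = 0.0
decimal_total = Decimal("0.00")

# Add ten dimes, once with binary floats and once with decimal numbers.
for _ in range(10):
    float_total += 0.10
    decimal_total += Decimal("0.10")

print(float_total == 1.0)                 # False
print(decimal_total == Decimal("1.00"))   # True
```

The float total is off by only a tiny amount, but it is already not equal to one dollar after just ten additions.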

The rationale for spreadsheets not doing likewise is for “performance” reasons, but frankly, that is a load of horse hockey. Yes, built-in floating point calculations are faster. But the performance hit from using decimal arithmetic is far from a deal-killer. COBOL dates from around 1960, when computers had only a tiny fraction of the computing power they do today, yet COBOL programs ran just fine way back then, and cranked out accurate results without gratuitous rounding errors. (Plus, your average spreadsheet is a lot smaller than your average batch of bank transactions to process.)

I was going to make more use of spreadsheets in figuring my income taxes this year, but after learning the above I am mostly sticking with good old dc, which uses decimal arithmetic. (Actually, it uses base 100, but when it comes to avoiding rounding errors, base 100 works identically to base 10, since the former is a power of the latter.)

Why Swift Is Not My Favourite Programming Language

Published at 22:44 on 7 February 2024

It’s the libraries, stupid.

The standard Swift library is laughably in-comprehensive. Things you can do in the standard libraries for Java, Python, PHP, C#, Ruby, and most other common modern programming languages just aren’t in there.

What you are supposed to do, from what I gather, is to use the Apple Foundation framework. There are several problems with that:

  • The framework is a hot mess. It got its start back in the 1990’s as part of the NeXT operating system, and has been incrementally hacked on ever since. The documentation is likewise a mess: incomplete, cryptic, and poorly-organized. It is fully part of the pattern that Apple products tend to be as programmer-hostile as they are user-friendly.
  • The framework is still incomplete. Support for tasks as basic as doing buffered reads from an arbitrary text file on a line-by-line basis is absent from it. (At least I think it is absent; review the part about the documentation being a hot mess above.)
  • The framework is an Apple-only thing. There is an ongoing effort to open-source the Foundation framework so that Swift programs can be more portable, but it is a work in progress.

The bottom line is that Swift is not, in fact, the general-purpose programming language it is claimed to be. Unless one is writing native-mode GUI applications for Apple products, Swift really doesn’t make much sense.

It’s a shame, as the core Swift language looks to be fairly well-designed. It could be a great general-purpose programming language if only it came with a decent standard library. Alas, that’s a bit like saying Mr. and Mrs. Lincoln could have had an enjoyable evening at Ford’s Theatre if only that hadn’t happened.

I may eventually delve into Swift for such purposes, but as things currently stand, programming in Kotlin with the Java Swing platform allows me to develop GUI tools that run on my Mac, and I don’t have to deal with all the ugliness that is Apple’s native programming environment. Swing isn’t perfect, and its rough edges sometimes manifest, but it’s been good enough for my personal use.

XeTeX Redux

Published at 23:22 on 27 October 2023

It is a known bug. The workaround is to use fontspec to invoke the feature manually, e.g.:

\fontspec{Baskerville}[Renderer=OpenType, RawFeature={+smcp;-liga}]

Ligature substitution should be disabled when using small caps because the two features tend to be incompatible.

Python Set for World Domination?

Published at 18:03 on 22 October 2023

Python is already sitting at the top of the TIOBE Index of most popular programming languages, and has been for some time. And no wonder: it’s one of the best ones out there.

One big thing that stops it from being close to the best is that it has difficulty walking and chewing gum at the same time. In one project, this caused me no small amount of pain. It’s part of the reason I have used the Java virtual machine (usually via Kotlin, which is a more modern language than Java) on some of my projects.

Over the years, there have been numerous proposals to remove the global interpreter lock (GIL) from Python. These have generally gone nowhere, with the exception of the existence of Python versions that target the Java virtual machine and the .NET common language runtime*. There are a number of valid reasons for this.

But now, there is a very serious proposal to remove the GIL from the reference implementation, and the Python Steering Council has indicated they will almost certainly accept it.

Once this happens, expect Python’s dominance to increase further.

* Alas, these implementations tend to lag (sometimes seriously) behind the reference implementation, plus they are not compatible with many of the third-party Python libraries out there. (The latter issue is also why it has been so difficult to remove the GIL, as doing so in ways that are both a) not ruinously inefficient and b) compatible with existing libraries has proven exceptionally difficult.)

XeTeX (Modern TeX) Disappoints

Published at 15:27 on 21 October 2023

I held considerable hope for XeTeX (the modern-day TeX). Alas, while arguably better than Groff, it still leaves a lot to be desired in the font department.

Namely, while it can indeed load and use the same standard system fonts that all other programs can (a big win over classic TeX), its support for OpenType font features is quite limited and lacking. For instance, small capitals don’t work. I have tried to use them multiple ways, including directly via RawFeature=+smcp, and either nothing happens and I get normal mixed case, or I get a complaint that Font shape `TU/Baskerville(0)/m/sc' undefined and again I get normal mixed case.

I know the smcp (i.e. set lowercase input in small caps) feature is present in the font I am using, because I can use it from LibreOffice. Apparently, XeTeX just doesn’t get enough use for this sort of thing to get adequately exercised. At this stage, I’m getting to the point of writing the TeX family of text formatters off as hopelessly yesteryear.

The feature works in some other fonts I have installed, so apparently it’s a bug that only affects certain cases. Unfortunately, one of those cases is in the font I most wish to use for this project.

And yes, I know I could extract the small caps myself and create an .otf file whose lowercase is small caps, and load that font. Well, eff that. Not having to do such awkward hackery was my whole motive for installing XeTeX in the first place. If you say your tool supports standard font files, it should support the standard features in those files directly.

It’s a disappointment, as the well-documented plain-text input of formatters like Groff and LaTeX makes them quite useful for formatting automatically-generated output. Modern font support has long been, and apparently continues to be, their Achilles’ heel.

Probably the Textbook Example of Why Java Sucks

Published at 00:22 on 14 October 2023

Probably the textbook example of why Java sucks is the Log4Shell Bug in the log4j library.

It’s just a library and not part of the core Java language, but the core Java language really isn’t the problem. The problem is the dysfunctional culture that surrounds the language.

So the stock logging is both bloated and yet surprisingly limited in functionality, because that is the sort of code the Java community tends to produce. That creates the need for an expanded logging system. Since it fills a void, everyone uses it.

Since the community does not appreciate simplicity, what the hey, let’s stick a general-purpose templating language into the thing. The same community wrote that general-purpose templating language, so what the hey, let’s stick a shell escape with command-output substitution into the templating language. Both the general-purpose templating and the shell-escape feature of the templating are of course enabled by default, because what the hey, why not?

And suddenly, we have a logging library with a shell escape in it.

Just how bad this all is, is underscored by how it sat unnoticed for eight years before it started being exploited. That’s right, so few people actually used all this creeping feature-ism that it took nearly a decade for the vulnerability in it to be discovered!

Look, logging and templating are two different domains. Templating and launching commands in a subprocess are two different domains. A logging package has no business containing templating beyond approximately the String.format level. A templating package has no business containing general-purpose subprocess creation.

If you really have a need to generate a super-long, super-complex log message, long enough and complex enough to require a general-purpose templating system, you should probably rethink what you are doing. Log messages should be relatively short. Now, maybe your case is 1-in-1,000 to 1-in-1,000,000 special and there really is a good excuse for a long message. Maybe. Fine. Do it by hand. Pull in a templating library, use it to generate that message, and feed it to the logger. Don’t bloat up and complicate the code base for everyone else just because of your 1-in-1,000 (or more) special case.

Likewise, if you really need to substitute in the output from a system command into your template, use the subprocess-creation features of Java to run the command, collect its output, and feed it into the template. Again, a 1-in-1,000 (or more) special case, no need to clutter up the code base.
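The “do it by hand” approach is only a few lines in most languages. Here is a minimal sketch in Python rather than Java, purely for illustration: the helper names, the template text, and the command being run are all made up, and Python’s string.Template, subprocess, and logging modules stand in for whatever templating library, process API, and logger a real project would use.

```python
import logging
import subprocess
import sys
from string import Template

def command_output(argv):
    """Run a command directly (no shell involved) and return its stdout."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.strip()

def build_message(template_text, **values):
    """Expand the template explicitly, before the logger ever sees it."""
    return Template(template_text).substitute(**values)

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("example")

# Hypothetical command: have the running Python interpreter print a word.
info = command_output([sys.executable, "-c", "print('hello')"])
message = build_message("worker reported: $info", info=info)

# The finished message is passed as plain data; the logger expands nothing.
log.info("%s", message)
```

Each dangerous step is a visible, deliberate call in the caller’s own code, and the logger itself never interprets anything.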

In short, it’s potentially dangerous. Make the programmer work a bit in order to do it. Not a whole lot (libraries should and do exist to make it easier), but a bit. Enough to get the programmer hopefully thinking about the consequences of the feature. Don’t just stick it in by default.

But who am I kidding? This is the Java community we are talking about. Too much can never be enough!

Java Community Antipatterns, an Ongoing Series

Published at 17:27 on 1 September 2023

To give you an idea of the general pathetic hilarity of the situation, I was reviewing some code at work today. It reads in a message from Kafka, obtains a validator object, and calls that object’s isValid() method on the message it receives. That method in turn returns a ValidationResult object, whose valid() method is then called inside an if statement.

This immediately strikes me as odd. When you validate something, it either turns out to be valid or invalid. That’s it. Two options, no more, no less. Yes/no. Black/white. On/off. There is no need to create a new data type to represent a validation result, because a perfectly appropriate data type already exists, built in to Java: the Boolean. Just use that. Far simpler and cleaner.

Maybe the ValidationResult object does something special and has extended features beyond those of a Boolean? Yes, it has a message field! But wait, that field is never accessed. The only thing that is ever done with that object is to call the valid() method, whose purpose is to return the Boolean value that should have been used in the first place.

And what of the validator object? Its class definition is very simple, just one short method that makes some basic checks. If its argument turns out to be invalid, the message part of the result is set to the string “Data is not valid.” No, I am not making this up. Of course the data is not valid, you moron! That is why the valid flag is set to false! This field conveys exactly zero meaningful information.

What other code uses this validating logic? None of it, it turns out! So there was no need for the validator class, either. Could have just added a private isValid() method inside the one (short) source file where this logic is used. Would have been a whole lot clearer, because the person reading the code wouldn’t have had to open another file to determine just what the validation logic is.

So three classes, and three source files, are being used where just one would have sufficed.

Now, this was a particularly egregious example, but this sort of crap-ola happens over and over (and over) again in Java code. Needless complexity everywhere.

Cutting Over to LaTeX

Published at 15:15 on 20 August 2023

The above use of StUdLy CaPs courtesy of a community somewhat enamored of them. An overly-cute quirk that for a long time made me shy away from that document preparation system. It’s silly, to be sure, and hardly the No. 1 reason, which is that by that time:

  • I had already learned troff, which provides the same general functionality, and
  • I often did not have access to a laser printer, and troff shares an input language with nroff, which can produce passable output on a simpler and at the time much more common typewriter-like printer. TeX and LaTeX, by contrast, are useless if you don’t have access to a laser printer, phototypesetter, or graphics display.

Time passes. LaTeX grows to be way more popular than poor old troff. Users develop and share all sorts of macros to do just about anything you want. Those same users have many online forums that can be searched if you get stuck or puzzled. troff gains nothing equivalent. I get frustrated by troff’s inability to set text that wraps around illustrations (a standard book publishing technique).

So yeah, it was time. The final push that made me spend most of a day reading Knuth’s definitive description of his program was my resumed job search, the desire to make a résumé that breaks a few of the rules (using some of my favourite fonts), and the difficulty of getting good font support for troff. By contrast, there are modern versions of TeX that can read standard font files.

(Yes, I know there are modern versions of troff that can apparently read standard font files. The rub is, they lack certain extensions to standard troff in the version I have been using, extensions that I am making use of, so I would have to rewrite anyhow. Plus, troff still can’t format output by wrapping around illustrations. Why not rewrite in the more powerful and full-featured alternative?)

Anyhow, I’ve now learned enough of LaTeX to make it do what I want to do… for now, which was the goal.

The Go Programming Language: Too Minimalist

Published at 18:55 on 15 August 2023

I first ran into this many years ago, when Go was first released by Google and I played with it a bit. I ran into a situation where a ternary (aka conditional) operator would be handy, only to discover that it doesn’t exist in Go. OK, then, maybe I can use an if statement as an expression? Sorry, in Go if is strictly a statement and not an expression. So I sighed, introduced an extra variable into my code that otherwise would not be needed, and coded an if / then / else statement.

Then earlier this year I ran into an issue with Helm, which is written in Go. I was trying to use it to generate a YAML file from a template, and was getting tripped up because Helm was sometimes inserting newlines into the output. This can happen in Python, too, but there is a way to disable the feature. Surprisingly, there was no way to disable it in Helm, because Helm of course uses the Go YAML library, and that feature is absent there. You will get your gratuitous newlines, and you will shut up and learn to like it. Lovely.

Maybe those two experiences should have been sufficient to warn me, but no. I was determined to give Go another chance. So I decide to revisit the language by rewriting a Python program in it.

That program’s purpose is to detect and report characters and byte sequences in source code files that are likely to be troublesome. I wrote it after once wasting most of a day trying to find a Java bug caused by a Unicode zero-width space in a source file. So it ends up being a pretty good exerciser of a language’s character set conversion libraries.

Again, Go fell short in the feature department. One of the Unicode character classes is Cn, unassigned code points. Such things, being undefined in behavior by the Unicode standard, are nothing but trouble and so should be reported as troublesome. But wait! Go does have a library that defines Unicode character classes, but for some reason the Cn class is missing. It’s possible to code around this (with some lossage) so I do.

Then I run into another problem: detecting invalid byte sequences in an input file. Go simply replaces these with the Unicode replacement character, code point U+FFFD. But what if a source file has such a character in it, then what? In Go, it is impossible to tell if the character is actually there (and sometimes I might want to allow it, and let it silently be accepted), or if it was inserted by the appropriate encoding.Decoder method in response to encountering an invalid byte sequence (which should definitely always be reported).

In Python or Java (and I assume most other languages), it is no problem: one can tell the decoder to signal an error on an invalid byte sequence if returning replacement characters is somehow unacceptable. Not so in Go. You will get replacement characters, and you will shut up and learn to like it.
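For comparison, here is roughly what the two checks look like in Python. This is a minimal sketch and not the actual program described above: the function names are made up, and treating category Cf (which includes the zero-width space) as troublesome is an assumption about what such a tool might flag.

```python
import unicodedata

def decode_strictly(data):
    """Decode UTF-8 bytes, returning (text, None) or (None, error)."""
    try:
        # errors="strict" raises instead of substituting U+FFFD.
        return data.decode("utf-8", errors="strict"), None
    except UnicodeDecodeError as exc:
        return None, exc

def troublesome(text):
    """Yield (index, character, reason) for suspicious characters."""
    for index, ch in enumerate(text):
        category = unicodedata.category(ch)
        if category == "Cn":
            yield index, ch, "unassigned code point"
        elif category == "Cf":
            # Assumption: invisible format characters are worth reporting.
            yield index, ch, "invisible format character"

# A genuine U+FFFD in the input decodes cleanly...
good, error = decode_strictly("ok \ufffd".encode("utf-8"))
# ...while an invalid byte is reported as an error, not silently replaced.
bad, bad_error = decode_strictly(b"broken \xff byte")

print(repr(good), error)
print(bad, bad_error)
print(list(troublesome("a\u200bb\u0378")))
```

Because the strict decoder raises on bad bytes, a U+FFFD that survives decoding is known to have really been in the file.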

It’s still possible to write the program, of course. It just ends up being unnecessarily quirky, failing sometimes to detect unassigned characters, and raising false alarms about bad byte sequences if a file contains completely valid replacement characters in it.

But why? Why choose a programming language that breeds quirks and unexpected surprises? Such things are bad, whether they arise from excessive complexity or excessive minimalism.

And these are not the only examples I have run across (both in the last few days and earlier), either; they are merely the ones that fit best into the story I have related here.

It all makes me appreciate more just what a good job Guido did in striking a good compromise between complexity and minimalism in the Python programming language.