Job Board Enshittification

Published at 09:45 on 12 June 2026

About two months ago, I started experimenting with scraping online job listings. The main motive was just to save myself frustration; I have long been disappointed with the quality of the search results from such boards. Why should I waste time wading through a bunch of crap I am, given my search specifications, self-evidently uninterested in?

The scraper gathers the texts of recent job listings from a variety of different online job boards into a data file. A separate postprocessing step then reads through that data file, doing some rudimentary natural language processing on the texts of the job listings therein. Basically, it does keyword searches, giving points for matches, and bonus points for combinations of matches I view as signs a job is likely to be particularly interesting. Then it sorts the listings by the resulting scores and presents them to me.
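A minimal sketch of that kind of scoring pass, in Python. This is not the scraper's actual code: the keywords, the weights, the bonus combination, and the function names are all made up for illustration.

```python
import re

# Hypothetical weights and bonus combinations; substitute your own.
KEYWORD_POINTS = {"python": 3, "linux": 2, "remote": 2, "c#": 3}
COMBO_BONUSES = {frozenset({"python", "linux"}): 4}


def score_listing(text):
    """Score one listing: points per matched keyword, plus combination bonuses."""
    lowered = text.lower()
    matched = {
        kw
        for kw in KEYWORD_POINTS
        # Lookarounds rather than \b, so a keyword like "c#" still matches.
        if re.search(rf"(?<!\w){re.escape(kw)}(?!\w)", lowered)
    }
    score = sum(KEYWORD_POINTS[kw] for kw in matched)
    # A bonus applies only when every keyword in the combination matched.
    score += sum(bonus for combo, bonus in COMBO_BONUSES.items() if combo <= matched)
    return score


def rank_listings(listings):
    """Return (score, text) pairs, highest score first."""
    scored = [(score_listing(text), text) for text in listings]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

A listing that matches none of the keywords scores zero and sinks to the bottom of the ranking.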

And indeed (pun unintended), it is not unusual for some jobs to end up with a score of zero. A complete goose egg. The only way this is possible is if none of the keywords I am searching for match.

Another thing my scraper does is filter out things it has seen before. I keep track of the primary keys the site uses to store the listings in their database, and when I last saw them. (These primary keys are easy enough to deduce, because they inevitably show up in the HTML link to each job details page.) Today Indeed proudly returned a job I had last seen in the middle of last month as its first result, despite my request to sort the results in descending chronological order.
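The bookkeeping described above can be sketched roughly as follows. The query parameter name ("id"), the dictionary layout, and both function names are assumptions for illustration; each job board embeds its key differently.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse


def job_id_from_link(href, param="id"):
    """Pull the listing's key out of a details-page link (hypothetical parameter name)."""
    values = parse_qs(urlparse(href).query).get(param)
    return values[0] if values else None


def filter_unseen(listings, seen, now=None):
    """Return only listings not seen before; record `now` as last-seen for all of them.

    `seen` maps (site, job_id) to the datetime the listing was last encountered.
    """
    now = now or datetime.now(timezone.utc)
    fresh = []
    for listing in listings:
        key = (listing["site"], listing["job_id"])
        if key not in seen:
            fresh.append(listing)
        seen[key] = now  # refresh the timestamp whether or not it was new
    return fresh
```

Keeping the last-seen timestamp is what makes it possible to notice when a month-old listing is served up again as if it were new.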

Why do this? Why show listings that have nothing whatsoever to do with my search terms? Why show some old listings I saw a month ago to me again? Because money. Indeed is obviously charging its clients extra for these services, so their ads get better reach. It may be bad for job seekers like me, but it is good for their own bottom line. This is textbook enshittification, wherein market forces reward businesses for offering worse services.

Have I mentioned yet that a dozen or so years ago, you didn’t have to do anything backhanded to scrape jobs from Indeed? They had a public API whose purpose was to make it easy for a program to download and filter job listings. They got rid of that, of course, because it is contrary to a business model that now includes shoving old crap or irrelevant crap at users in return for money.

And it’s not just Indeed. It’s most of them. The only job board that seems completely immune from this trend, in fact, is Job Bank, which is a public service run by the Government of Canada. They offer a public API, too. Let capitalism fans and their mythology of private enterprise always offering superior products put that in their pipe and smoke it.

Avalonia Redux

Published at 08:30 on 27 May 2026

After using Avalonia for one project, I can say that it is basically what I thought it was, with one exception.

My previous deadline for the wxPython web site coming back up to something approaching full health blew by, so Avalonia it was. As I knew going in, it’s a little more clumsy and boilerplate-y than I like. The resulting apps don’t look as slick and polished as native ones (because they don’t use native widgets). But it works well enough, and it’s popular enough I don’t have big worries about it turning into abandonware.

The exception was multithreading. Avalonia supports async/await, or as Microsoft calls it task-based asynchronous programming. Because of course it does. Async/await got its start with .NET languages. So it’s been part of C# for a long time, and that feature has been fully absorbed into frameworks developed for that programming environment.

This ends up making Avalonia a clear winner over Qt and its quirky signals and slots and its poorly-implemented memory management. It even makes it a clear winner over the callback-based multithreading that wxPython offers. Async/await effectively compiles under the hood to callbacks, but it lets the programmer write code much more as if it were traditional blocking code, which makes coding far easier.
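To illustrate the style (in Python's asyncio rather than C# and Avalonia, so this is an analogy and not Avalonia code), here is a minimal sketch. The function names are invented; asyncio.to_thread requires Python 3.9 or later.

```python
import asyncio
import time


def slow_word_count(text):
    """Stand-in for blocking, I/O-or-compute-heavy work."""
    time.sleep(0.05)
    return len(text.split())


async def handle_click(text):
    # Reads top to bottom like ordinary blocking code, yet the event loop
    # stays free to process other events while the worker thread runs.
    count = await asyncio.to_thread(slow_word_count, text)
    return f"{count} words"


def main():
    return asyncio.run(handle_click("the quick brown fox"))
```

There is no callback to register and no result object to unpack by hand; the code after the await simply resumes with the result once it is ready.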

Signals and slots were a significant step backwards from Swing’s callbacks. Async/await is a significant step forward. What seemed to be a narrow win now looks like a clear win.

The things that make Avalonia suck still suck. All GUI frameworks suck. Avalonia simply sucks less than the alternatives for my purposes, significantly less.

And yeah, I know: the wxPython web site now seems to be back up. Well, so what? It took most of a fortnight to come back to full health. That speaks loudly as to how little momentum there currently is behind that platform.

Evaluating wxPython

Published at 15:26 on 20 May 2026

So, anyhow, I settled on wxPython as my Python GUI framework of choice.

One would think that, given its generally superior programmer interface (it is not even close), wxWidgets and wxPython came after Qt and PySide. Surprisingly, the exact opposite is the case.

My explanation for the mystery is a mixture of groupthink and functionality. Having to continually think about memory management just comes with the territory in C++. So the Qt team, being C++ programmers, didn’t think much of it when designing their Python bindings. It’s all just a natural part of programming, right? Add to that how important smart phones have become, and how Qt supports those (but wxWidgets does not), and you have your answer right there.

Unfortunately, while wxPython is free from the worst braindamage that pervades PySide and PyQt, the package, despite being around for a while and despite having a fairly recent release, doesn’t seem to be very popular and may be on its way to becoming abandonware. What makes me suspect this is that late last Thursday, the wxPython web site started acting up, and as of my typing this is basically completely down. On top of that, I’ve run into a clear bug in the underlying wxWidgets code that has been sitting around unfixed since last February.

Maybe they’re just having a spot of rough luck, but my feeling is that a popular project with lots of active development would probably have a backup site going by now. If they manage to restore service by Friday afternoon, I will probably keep the faith (and even offer to contribute my own efforts), but things are on hold until then.

If it doesn’t come back up, it will probably be time to chalk up another win for .NET. Avalonia certainly has its issues, but they don’t seem to rise to a Qt level of obnoxiousness, and I don’t want to produce something that’s basically obsolete the instant I finish coding it.

Evaluating Qt

Published at 13:18 on 14 May 2026

I had high hopes for PySide6, the dominant Python binding to the Qt framework. From what I had read, it was clearly the most popular way to do GUI programming in Python, well-supported by an active dev team, and a little checking up revealed this impression was accurate.

All went well until it came time to do some I/O-and-compute-intensive background work. For some reason, the Qt dev team didn’t stop at GUI development; they went on to develop a whole universe that encompasses more than just graphical user interfaces. As an example, instead of using standard Python thread pools, all documentation on worker threads in Qt focuses on using QThreadPool. I was hoping to find some explanation on why things are done this way, and why doing it the standard Python way was or was not a good idea, but I had no luck. Not willing to run risks, QThreadPool it was.

Thankfully, it wasn’t bad. It’s just a typical thread pool that one submits jobs to. The problem happened when my jobs were done and wanted to return some results for the UI thread to display.

In normal UI frameworks, that is simplicity itself. There is a call to submit code for execution in the front-end event loop, you submit a closure containing the result and how to process it, exit the background job, and you are done.
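The pattern can be sketched with a toy event loop built from the Python standard library. ToyEventLoop and everything in it are invented for illustration; the real counterparts, as far as I know, are calls such as wx.CallAfter in wxPython and SwingUtilities.invokeLater in Swing.

```python
import queue
import threading


class ToyEventLoop:
    """Minimal stand-in for a GUI event loop: runs callables posted to it."""

    def __init__(self):
        self._pending = queue.Queue()

    def call_after(self, func):
        """Thread-safe: schedule `func` to run on the loop's own thread."""
        self._pending.put(func)

    def run_until_idle(self, timeout=0.1):
        """Run posted callables until none arrives within `timeout` seconds."""
        while True:
            try:
                func = self._pending.get(timeout=timeout)
            except queue.Empty:
                return
            func()


def background_job(loop, numbers, display):
    total = sum(numbers)
    # Post a closure carrying the result, then just return. The worker
    # does not need to stay alive for the front end to receive it.
    loop.call_after(lambda: display(total))


def demo():
    loop = ToyEventLoop()
    shown = []
    worker = threading.Thread(target=background_job, args=(loop, [1, 2, 3], shown.append))
    worker.start()
    worker.join()          # the worker is gone before the loop runs anything
    loop.run_until_idle()  # the "UI thread" now executes the posted closure
    return shown
```

The closure keeps the result alive on its own, which is why the background job can exit immediately after posting it.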

Qt is not normal. It has these quirky relics of the early/mid 90’s attempts to manage concurrency called signals and slots and it expects you to use them everywhere. That includes worker jobs that need to notify the UI event loop of something.

It turns out that signals and slots are seriously broken by design. They don’t work unless both sender and receiver objects remain active until the signal is received by the slot. Any departure from this requirement and things will randomly fail. Sometimes they fail visibly with an exception (often an oddball one out of left field that doesn’t directly signify the cause of the issue). Sometimes signals will just get silently dropped on the floor.

So, no problem, just keep the sender active. But I’m sending the signal to indicate that I am done! Why should I remain active? My work is done, it is time to exit. It is an onerous requirement, one that I don’t have to bother with in any other framework of which I am aware. In wxPython, Swing, and Avalonia, I just submit the code to do the front-end processing to the event loop and exit. In Qt, I have to wait around for the front end to message me back that it has received my results and I may now exit.

And signals themselves are bizarre and un-Pythonic. You create them by defining class variables of type Signal, but you never use them as such. Instead you create instances of that class and Qt does some backhanded, behind-the-scenes hackery to transmute those class variables into instance variables of type SignalInstance. Just one more sharp edge for the programmer to cut themselves on.
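For the curious, Python's descriptor protocol is one way a class-level declaration can turn into a distinct object per instance. The sketch below is a toy that mimics the pattern only; it is not PySide's implementation, and all of the class names are invented.

```python
class ToySignalInstance:
    """Per-object signal: holds this object's connected slots."""

    def __init__(self):
        self._slots = []

    def connect(self, slot):
        self._slots.append(slot)

    def emit(self, *args):
        for slot in list(self._slots):
            slot(*args)


class ToySignal:
    """Declared once on the class; hands out one ToySignalInstance per object."""

    def __set_name__(self, owner, name):
        self._attr = "_signal_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class: the declaration itself
        inst = obj.__dict__.get(self._attr)
        if inst is None:
            inst = obj.__dict__[self._attr] = ToySignalInstance()
        return inst


class Worker:
    finished = ToySignal()  # a class variable, yet each Worker gets its own
```

Accessing Worker.finished yields the ToySignal declaration, while accessing finished on an instance yields that instance's own ToySignalInstance, which is the same kind of surprise described above.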

It gets worse. It turns out that this is but a particular instance of Qt in Python being for the most part a thin veneer over a library written in C++. There is little or no communication between how C++ manages memory and how Python manages it. Create a control, add it to a parent control, but don’t store it in a global variable or as an instance variable in a long-lived Python object, and Python might garbage-collect it even though Qt still has a C++ pointer to the same memory region. Qt programmers have to follow all sorts of arbitrary rules to avoid memory management issues.

Use Qt and you will be painfully aware of memory management all the time. This cuts to the prime reason I want to use a language like Python, which features automatic memory management, in the first place! Forcing this sort of experience onto the programmer is positively asinine.

This is not a minor nitpick, as it gets in the way of how I like to program. I like to be dynamic and reactive. Just create UI elements on the fly and display them. The logic that manages and displays them will (or, rather, should) hold references to them, keeping them from being garbage collected, until they get closed or removed. Then they vanish, in memory as well as on screen. The number of elements naturally expands and contracts according to the needs of the user. The only constraint is the amount of hardware memory, and possibly quotas on same imposed by the system. Larger systems with more memory can naturally be taken advantage of, as needed. Simple, logical, effective. And needlessly painful to accomplish in Qt.

And this seems to be an only-in-Qt thing. In wxPython, for example, the dev team thought about this, and explicitly coded their glue to the C++ world to avoid these issues. (It’s not very hard to do.)

Qt seems to be the answer to the question: “I want the worst of both worlds. Is there any way I can program in a language as slow as Python and still have the memory-management headaches of C++ to deal with?”

To hell with Qt.

Evaluating Avalonia

Published at 09:49 on 13 May 2026

The suckiness that is the Java build environment finally got to me. That corrupt JAR file (the second time this has happened to me) was the final straw. I should not have to debug common build tools and the files they make. They should just work. The only bugs I should routinely battle with are the ones I just wrote in my own code. That is quite enough work, thank you very much.

So, despite Java in many ways being the best way to write a portable graphical application for the desktop, those JCA’s just are too much. The final straw had broken the camel’s back.

I had sort of been biased against Avalonia since learning that it focuses on pixel-for-pixel consistency between platforms (something I believe is a big mistake for desktop application programming). Then again, ASP.NET Core is so much better than any Python alternative web framework and Avalonia uses a similar code-behind model. Maybe the pixel-for-pixel compatibility doesn’t result in something that bad after all?

Executive summary: The rendering is better than I thought, but it doesn’t matter. There are enough other reasons not to use Avalonia.

File dialogs. In Swing (Java), Qt, and wxWidgets, one can customize these by adding extra controls. In Avalonia, you get the standard file dialog and that is it, no provision for customization. Want to ask for more parameters? Pop up a second dialog. The problem is, that is often a losing user experience; many load and save operations just naturally lend themselves to a need for a little extra information. Better to handle it all in one step, unless one is asking for a lot of extra information (and one usually is not).

File filters in dialogs. In Java and wxWidgets, you filter files by extension. In Qt, you filter by extension or MIME type. In Avalonia, you filter by extension on Windows and Linux. Linux also offers the option of filtering by MIME type. On the Mac, neither of the above work; you must filter on this wonky thing called an “Apple uniform type identifier.” Hello? The whole point of a cross-platform framework is to smooth this sort of stuff out! I shouldn’t have to deal with any weird Apple-specific stuff like this at all; it should just be MIME types and/or extensions everywhere. So stupid and annoying.

Data types in the clipboard. On most frameworks (Java Swing and AWT, Qt, wxWidgets) these are normalized from system to system. On Avalonia, the story is similar to file filters: one way for Linux and Windows, another way (those “Apple uniform type identifiers” again) on Macs.

XAML. This is the declarative, layout-describing language that one codes behind. Unfortunately, unlike Razor templates in ASP.NET (which are powerful and have access to basically all C# control structures), XAML is pretty weak-sauce. It’s a garden-variety XML schema, with little or no provision for imperative coding. This is a pretty big problem, as cross-platform GUI programming often necessitates imperative coding. Consider pull-down menus; the Mac organizes them quite differently than do Windows or Linux. So a user-friendly app will include conditional logic and adapt itself accordingly. This is not exactly easy in XAML. You have to use some brand-new (and poorly-documented) features, some of which have some truly obnoxious misfeatures. The end result ends up being a lot more verbose and repetitive than it needs to be.

MVVM. Avalonia rams the model-view-view-model paradigm down your throat. It’s possible to resist this, but you have to work at it. MVVM was an effort to manage complexity in very large and complex programs. Not all programs are very large and complex (and fewer programs should be), so not all programs need MVVM. Even the inventor of MVVM openly admits this. Worse, the flavour of MVVM that Avalonia pushes is cartoonishly rigid, formulaic, and doctrinaire. Every view gets its own separate view model, even though it often makes more sense to share a view model amongst cooperating views. Even worse yet, by pushing something that makes sense for the largest programs, one pushes the mindset that computer programs should by default be large and complex. This mindset is positively harmful, as it leads to feature bloat.

The documentation. XAML is all standard controls classes under the hood, so one doesn’t have to use it. One can just use those controls classes directly, in C#. Then one gets the full suite of comprehensive imperative programming features C# has. Unfortunately, the documentation on how to do this is extremely incomplete. It’s mostly focused on how to do things in XAML. What documentation there is tends to be something of a mess. Complex classes, with many dozens of properties and methods, and nothing in alphabetical order. Why? Just why? There is simply no excuse for this lack of organization.

The choice of what programming language to use is often dictated by frameworks and libraries, and not the core features of the language itself. So it is here. Everything other than Avalonia in C# is generally worse than Avalonia. So C# simply doesn’t make sense for this project.

It is the exact converse of the situation with back-end Web programming. This time, it is Python that has clearly the best framework.

Why? Culture. Group identification. Avalonia was written by people that came out of the .NET world and wanted to make .NET applications more portable. It was strongly patterned after Windows frameworks like WPF. User-friendly GUI programs have never precisely been Microsoft’s strongest suit; Windows is rightly regarded as inferior to the Mac when it comes to user friendliness. It just turns out that that externally-visible awkwardness is there in no small part due to internal awkwardness bubbling up to the surface. And because the Windows universe is really big, most of those in it are astoundingly ignorant of what other universes have done.

When it came to Web servers, Microsoft had been left behind. They had to do a good job on ASP.NET, else nobody would switch over from the various open source frameworks that had already dominated the landscape. When it comes to the desktop, the exact opposite was the case. Complacency tends to produce inferior outcomes.

That said, I still might use Avalonia someday. If, that is, I ever have occasion to develop another smartphone app. I would want it to be portable, and the main alternatives to Avalonia for such things are Electron and Qt. The former is Javascript-based, and the way Javascript so badly botched modules and imports is reason enough to eschew the language. Qt is, well, Qt, and I will get into the specifics of that particular flavour of awfulness soon.

Portable GUI Frameworks

Published at 22:01 on 11 May 2026

I’ve been experimenting with them, because I have a need to rewrite a tool I use from Kotlin into some other programming language. Not because of any real defects in Kotlin (despite its proximity to the Java world and its associated JCA’s, this is not a show-stopper).

No, the problem is that I need to read HEIC files, and there is no good, easy way to do this in Kotlin, because there are simply no good, open-source, comprehensive, general-purpose image processing libraries available for the JVM. There are limited libraries that support neither HEIC nor the large size of the HEIC images I wish to read (such as the built-in stuff in the standard Java class library). There are esoteric libraries like ImageJ that support all sorts of oddball image formats used in the microscopy and health care fields, but not HEIC. There is no shortage of libraries that are abandonware and haven’t had an update in a decade or more.

By contrast, C# has Magick.NET and Python has Pillow, both of which do what I want. So I have been looking into GUI frameworks for those languages. Let’s just say there are a lot of bad ones and leave it at that for now. I will post some details on the badness later (some of it is pretty mind-blowingly bad).

Suffice it to say that at this point I am reasonably sure wxPython will suit my needs, and it took a disgustingly long period of trial-and-error to get to that conclusion.

Bypassing Cloudflare to Scrape a Web Site

Published at 21:49 on 15 April 2026

It’s supposed to be really hard, and Cloudflare does indeed do a very good job of detecting (and banning) web scrapers. But, it turns out that this is one of those things that, while indeed very difficult to do in the general case, is actually quite simple in a lot of specific cases.

The main trick is to use Playwright (with the playwright_stealth addon) to control a browser, and to use that browser to scrape the web. You could theoretically use some other browser automating tool like Selenium to do this, but the problem with Selenium is that it basically advertises itself with every request (and there is no way to turn this off), and that triggers Cloudflare in short order.

The second trick is to mimic a human user as faithfully as possible. Generally, that means going slow, i.e. no faster than a human would navigate a web page, and inserting randomness into the process so it looks like a human and not an automaton is controlling the browser. This is where things break down for abusive scrapers like AI companies; they can’t take it slow, because if they do, it will literally take millennia to get as much data as they want. But I am not an AI company, and only need to scrape a modest amount of data, so taking it slow is good enough (I just let it run, and the results get collected eventually, not as fast as they might otherwise have been, but it still beats manual cutting and pasting).
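As a rough illustration of pacing with randomness, here is a small sketch. The timing values and the function name are arbitrary assumptions, not the settings actually used; the returned value would be passed to time.sleep between page actions.

```python
import random


def human_delay(base=4.0, jitter=3.0, long_pause_chance=0.1, rng=random):
    """Return how many seconds to wait before the next page action.

    All defaults are illustrative guesses: a base reading time, some
    random jitter, and an occasional much longer pause.
    """
    delay = base + rng.uniform(0, jitter)
    if rng.random() < long_pause_chance:
        delay += rng.uniform(15, 60)  # now and then, wander off for a while
    return delay
```

At several seconds per page, a few hundred listings take well under an hour, which is fine for a modest personal scrape and hopeless for anyone trying to hoover up the whole site.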

The reason I mention this is that, if you try searching on how to do this, the results tend to be pretty useless. They are dominated by either a) techniques that no longer work and which will trigger Cloudflare almost instantly, or b) commercial scraping services interested in taking your money. The latter may actually be a useful service if you want to scrape a lot, but I am not in that category.

That all of this is even necessary to write about is yet another data point in the overall enshittification of the Internet, which by my reckoning peaked in its usefulness circa 2010 and has been heading downhill ever since. The services I now have to use devious techniques to scrape used to be easily scrapable; actually, in many cases, they had free API’s designed to work well with another computer and didn’t need to be scraped at all.

And yes, the abusive scrapers are as much (or more) to blame here as are the sites removing functionality. I have had to implement anti-scraping measures on a site I host, because AI companies were ignoring robots.txt and scraping the daylights out of it. It was so bad that I was getting server crashes from the abuse.

Another Javascript Problem: Libraries

Published at 08:26 on 5 April 2026

One of the nice things about C# and Python is that both come with a robust, comprehensive standard library. Java, too: although its standard library tends to be awkward and clunky to use, it is still comprehensive and well documented.

Javascript? Not so much. As part of its heritage of originally being a relatively small, client-side language for embedded scripting, its built-in library is not nearly so comprehensive. One must instead go to npm for many things that are supported out of the box in many other programming languages. Thankfully, as a result of Javascript’s popularity, npm is very comprehensive.

Unfortunately, the quality of the libraries published there (and the quality of their documentation) is, let me just say, erratic. Plus, there’s just so many of them, and it is not always obvious which libraries are better-supported and higher-quality, often resulting in a fair amount of trial and error before one finds a suitable solution. Then there is the module and import hell I have written of before; sometimes a given module is not available in the module and import configuration you are currently using, causing you to have to deal with all sorts of uniquely Javascript headaches that no decent programming language has.

In short, if you write an application in Javascript, you’re not going to be able to find reasonable, well-documented default library solutions as often as you would were you using a better-designed, better-planned programming language. Finding and using the right libraries is going to be a source of recurring headaches.

What I Hate about Java

Published at 12:32 on 12 February 2026

Consider a common programming task: open a text file for reading with buffering. Let’s go through some of the programming languages I have used, in rough order of my learning them. (Disclaimer: my memory is a little rusty on some of these; they may not all be 100% correct. But they are not that far off the mark.)

First, the non-Java languages.

BASIC-PLUS:
OPEN "FILE.TXT" FOR INPUT AS FILE #1%

FORTRAN:
OPEN(UNIT=1,FILE='FILE.TXT',STATUS='OLD')

Pascal:
assign(file1, 'file.txt'); reset(file1);

C:
FILE *file1 = fopen("file.txt", "r");

Perl:
open(FILE1, '<file.txt');

Python:
file1 = open("file.txt", "r")

C#:
var file1 = new StreamReader("file.txt");

And then we have Java:
var file1 = new BufferedReader(new FileReader("file.txt"));

LOL, what? Why should those internals be exposed? Why should I have to explicitly wrap an unbuffered reader in a buffering one? Why the extra step to do something so common and routine? Why did I just have to spend a half hour studying the documentation, chasing from class to class to class, to figure out how to do something that was almost self-evident in every other language I was learning?

Why can’t Java do out-of-the-box today in one simple step what FORTRAN 77 could do in the late 1970s?

And don’t say “object orientation.” Python and C# are object-oriented, and don’t have this programmer-hostile silliness.

Sure, this seems to be a little thing, and it is just one thing. But it’s not really just one thing: this sort of crap is all over the map in the Java world. Everything is clunkier and more awkward than it should be, everywhere. It’s relentless. It’s wearing.

Re-Discovering the Advantages of .NET

Published at 09:06 on 28 January 2026

Fifteen or so years ago, as an exercise in curiosity, prompted by how often I saw the technology mentioned in job listings, I decided to check out Microsoft’s .NET framework. I was expecting to come away feeling smug about how much better competing technologies more popular in the Linux world were.

Surprise No. 1: I didn’t have to just read about it. .NET is an implementation of an open standard called the Common Language Infrastructure (CLI), and there was what turned out to be a very nice open source implementation of the CLI called Mono. Which I proceeded to install on my Mac and play with.

Surprise No. 2: It (both the C# programming language and the .NET framework) was well designed! This one floored me, given how sucky I generally find things that Microsoft has been heavily involved in. C#’s designers obviously learned from Java’s mistakes, particularly when it came to designing a standard library. And, frankly, they had to do a good job. Microsoft’s operating systems and desktop environments have long been market leaders and could get away with coasting on their well-established momentum. Here, by contrast, Java was the clear market leader in virtual machines that ran byte-compiled code. If Microsoft didn’t do a good job, people would just stick with Java, which runs just fine on Windows.

I ended up writing a bunch of command-line utilities in C# and a web site using ASP.NET. It even led to a job where my history as an individual who knew both .NET and Linux servers was the special sauce that got me hired.

But that job didn’t last forever, and there was still a lot of anti-Microsoft tradition that caused most of the open source world to dismiss .NET and Mono out of hand. I could tell I was probably not going to luck out like that again, so I shelved .NET in favour of technologies more common in the open source universe.

Fast forward 15 years and Microsoft has now open-sourced .NET and merged its codebase with that of Mono, meaning the two formerly separate projects are now effectively one.

I have been struggling in the past few days with how to integrate authentication into a web app I am writing. Rolling your own is generally frowned upon (it’s surprisingly complicated; you have to deal with sign-ups, account deletions, forgotten password resets, perhaps two-factor authentication, etc.). But the off-the-shelf solutions available for Python or Node.js just plain suck.

Mainly, they don’t have the flexibility I need. You see, I need access to the actual password used to log in, because I am using it to derive an encryption (and decryption) key used to protect sensitive per-user data in my database. One of my web app’s selling points will be that even I won’t be able to know your secret data. Most authentication services and libraries simply don’t support this: you never see the user’s password, because you don’t prompt for it yourself.
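To show why the plaintext password has to be available at login time, here is a minimal sketch of password-based key derivation using PBKDF2 from Python's standard library. This is an illustration and not the app's actual scheme; the function names, iteration count, and salt size are assumptions.

```python
import hashlib
import os


def new_salt():
    """Random per-user salt, stored alongside the user's record."""
    return os.urandom(16)


def derive_key(password, salt, iterations=600_000):
    """Derive a 32-byte encryption key from the login password (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )
```

The same password and salt always yield the same key, so the key never needs to be stored. Without the password, which only the user supplies at login, the key cannot be recomputed, and an authentication service that keeps the password to itself rules the whole approach out.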

So I check out what sort of authentication systems the .NET world has to offer, and immediately find one that doesn’t suck: one of its key design principles is in fact to let their clients do the prompting for authentication credentials, because, guess what? They just might want access to them, themselves. Cluefulness, what a concept.

Then I find out that I don’t need that product at all, because ASP.NET comes with a surprisingly capable identity management system built in. Which, while it doesn’t let you do your own prompting for credentials by default, does offer it as an option.

Database access is better, too. Most open source object-relational mappers (ORM’s) are flat-out terrible. They force you to code all sorts of repetitive boilerplate to mirror what’s already in your database schema*. Instead of simple, logical, expressive SQL, you have to use awkward and clunky chains of method invocations. It’s bad enough that I’ve written my own ORM for Python. It wasn’t that hard, and it’s a whole lot nicer to use.

* How utterly asinine this is becomes clear when one realizes that one of the key characteristics of a relational database is the ability to use queries to programmatically deduce the schema of an existing database. Most ORM’s are, in other words, forcing the programmer to do manually what they could do automatically themselves.
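As a small demonstration of that point, here is a sketch that asks a database for its own schema. It uses SQLite because it ships with Python; the table and function names are made up, and other databases expose the same information through their own catalog views.

```python
import sqlite3


def table_columns(conn, table):
    """Return {column_name: declared_type} for `table`, read from the database itself."""
    rows = conn.execute(
        "SELECT name, type FROM pragma_table_info(?)", (table,)
    ).fetchall()
    return {name: col_type for name, col_type in rows}


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, title TEXT, score REAL)")
    return table_columns(conn, "jobs")
```

Everything an ORM needs in order to map a row onto an object is right there for the asking, with no hand-written mirror of the schema required.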

Well, the two most popular ORM’s in the .NET world, Dapper and Entity Framework, are both best of breed. They don’t suck. Entity Framework even has, with C#, query expressions as first-class language constructs.

Then we have file-based routing, where you create a new file and get a new route automatically, something that Apache did 30 years ago (and still does today) but many modern open-source frameworks (particularly in the Python universe) still can’t do. Another win.

Documentation is another big win. .NET has some of the best in the business. Nearly everything is covered by both tutorials and comprehensive API documentation, the latter of which is liberally supplied with examples. It’s not just documentation, either; there is all sorts of help for the programmer in the form of what the .NET world calls “scaffolding,” in which example code can be created for you on request. It’s almost always easier to do something by modifying existing code that comes close to what you want, rather than to start from a completely blank slate.

It’s just generally a better developer experience all around. Normally, you pay for convenience like this, typically in the form of poorer performance. Not this time: ASP.NET sits at the very top of web framework performance benchmarks.

It’s not all roses. .NET is arguably overengineered (just look at function parameters: you have normal parameters, out parameters, keyword parameters, ref parameters, and readonly ref parameters). And there’s at least four different ways to template and generate web pages in ASP.NET.

But while the overengineering is tiring at times, there’s still nothing as bad as the hideous shambolic mess that is the Javascript module and import system. And, arguably, it does make for a lot of choices, choices that I will be taking advantage of to develop exactly the sort of web application that I want.