Spreadsheets Suck, Here’s Why
Published at 23:31 on 2 April 2024
It’s the math.
Specifically, they all (at least all the leading ones: Microsoft Excel, Apple Numbers, and Libre Office Calc) use floating point numbers and arithmetic.
Fractional radix digits are only capable of accurately representing numbers whose prime factors contain only factors of the radix. For the sort of base 10 numbers we are familiar with, this means that any fraction whose denominator can be represented as a product of 2’s and 5’s can be represented. As an example, 8 factors to 2³, so eighths can be represented with complete accuracy as decimal fractions. It takes three digits, of course, because 10³ (2³ ✕ 5³) is the lowest power of 10 that is an even multiple of 8, but you can do it. And really, three digits isn’t that bad.
If you use a denominator that cannot so be represented, then you get an infinitely-long repeating fractional part. The canonical example of this is ⅓ turning into 0.3333333….
But computers use base 2, not base 10, and this creates a problem. 2 is itself prime, so fractional radix digits in binary notation can only represent denominators that are powers of 2 accurately, and nothing else. Everything else turns into a number with an infinitely-long repeating fractional part.
This is a big problem, because one of the most common uses of spreadsheets is financial calculations, and floating point number can only represent monetary amounts as small as 25 cents accurately. If a financial quantity does not end in .00, .25, .50, or .75, your spreadsheet is representing it wrong! Only slightly wrong, of course, but still wrong. And if you are adding and subtracting enough numbers together, eventually the result will be wrong by a penny or two.
It is for this reasons that banks use decimal arithmetic, not a processor’s built-in floating point arithmetic, for their financial calculations. They don’t want their customers’ balances to drift from reality by a few pennies per year. Banks have done this since just about forever. COBOL, one of the oldest high-level programming languages out there, and designed for business computing, uses decimal arithmetic by default, and this is why.
The rationale for spreadsheets not doing likewise is for “performance” reasons, but frankly, that is a load of horse hockey. Yes, built-in floating point calculations are faster. But the performance hit from using decimal arithmetic is far from a deal-killer. COBOL dates from around 1960, when computers had only a tiny fraction of the computing power they do today, yet COBOL programs ran just fine way back then, and cranked out accurate results without gratuitous rounding errors. (Plus, your average spreadsheet is a lot smaller than your average batch of bank transactions to process.)
I was going to make more use of spreadsheets in figuring my income taxes this year, but after learning the above I am mostly sticking with good old dc, which uses decimal arithmetic. (Actually, it uses base 100, but when it comes to avoiding rounding errors, base 100 works identically to base 10, since the latter is a power of the former.)