The Spreadsheet Gap

The essence of the computer’s power in the modern age is the ability to take a previously time-consuming task, and to automate it. Often not just automate, but speed up, and reduce the chance of errors. For decades now this power has been open to all thanks, mainly, to the existence of the spreadsheet. From early data entry systems on mainframes through VisiCalc, Lotus and of course Microsoft Excel then onward to modern web based tools like Google Sheets & Airtable the spreadsheet has granted far more people the ability to record data & process it - magnitudes more than the number of computer programmers out there.

In recent years the spreadsheet has been joined by it’s distributed equivalent, the API toolsets of If That then This, Zapier, Segment etc. that promise to stitch together sets of proprietary tools via their individual interfaces into powerful distributed automation systems for businesses. But for every company that’s discarded their custom software in favour of getting smart non-programmers to stitch together a set of general purpose software, there’s a business creaking under the weight of managing a spreadsheet, trapped between its complexity and the high cost of building it out as custom software.

I call this dissonance between general purpose software and custom software the spreadsheet gap

Ever since the first general purpose software tools were released there’s been a steady convergence of what one could accomplish with a team of programmers versus an off-the-shelf package. Programming has, of course, got easier over the years - new languages have meant programmers can think more about business logic than memory management and CPU cycles; cloud computing has removed the overhead of systems administration and of building & maintaining physical hardware.

At the same time in the general purpose world spreadsheets went from basic lists of numbers to custom formulas, sorting, filtering, linking data between sheets, custom interfaces, VBA macros. The last decade has seen the spreadsheet move online, allowing all of this to be shared, and collaborated on in real time.

The flexibility in custom software comes from its ability to do literally anything you can program a computer to do (which is a lot, and which we don’t really know the limits of even now) but its downfall comes in that every new thing needs to be worked out afresh, developed by someone with a specific set of knowledge of an ever changing landscape. Once written it needs to be maintained, monitored, patched for security holes or bugs. Conversely a very moderate amount of technical knowledge can allow someone to create a powerful tool for organising or automating a business, in a relatively short space of time.

Big balls of mud

However from ease of use comes ease of creating a big ball of mud. In software as languages got more flexible and tried to fix more things silently for the programmer, more bugs crept in - PHP & JavaScript are thought to be some of the easiest languages to start developing in, so true to form they contain the greatest number of pitfalls waiting to derail what seemed to be a completely good piece of software. Indeed, both languages have gradually adopted more guardrails in the form of unit testing, type checking, and static analysis (PHP as part of its own evolution, JavaScript by virtue of the TypeScript sub-language).

The same is true of the general purpose tool - as its name suggests you can do anything logically possible, and that means you can do pretty much every possible wrong thing as well as the narrow range of possible right things. It’s possible to run an entire business on a spreadsheet yes, but it’s also possible to delete that spreadsheet with a single command run by most users on a machine or network. It’s also possible for that entire spreadsheet to have been designed by someone who leaves the business, taking their hard won knowledge with them in the process. Without a lot of consideration during the design and build of the spreadsheet - and who tends to upfront that work? - there can literally be a system that nobody has a chance of reliably working out, a black box that we have to trust because the business has no other choice.

This article isn’t drawn from a hypothetical. During my first startup, building Ground Control Skydiving I migrated a number of skydiving centres away from business management tools built in Excel (or Access). My software often had to consider supporting functionality provided by these tools, which was generally hard to quantify as even when the person who wrote the spreadsheet was still there, they weren’t necessarily sure why that field had that formula - it just did and the business seemed to work because of it. Migrating data ranged from some awkward copy and paste (avoiding inbuilt formulas) through to genuinely impossible because there was no structure to the data beyond “I know that every 5th row in column C might contain a date”.

As an aside, general purpose software is rarely fit for handling sensitive data, as evidenced by the time I was sent a spreadsheet to import that contained not just the details of a list of skydive customers, but their full credit card information as well (insecure at best; possibly in violation of the law, or at least their card provider terms).

Programmer is king?

So for someone who presumes everything should be done using general purpose tools I have a big clod of examples to wave as to why this is a bad idea. But is it really sensible to write custom software for everything?

Software can bypass a lot of the dangers of general purpose tools - code can be written in a way that is “self documenting”, with the code actually explaining what it does. Much like newer general purpose tools allow stitching together various established pieces of software, so code can be built on top of other code, well established frameworks to do things like calculate monetary values so that the individual developer doesn’t need to become an expert in maths to build a system. Sources like GitHub have allowed this to become the de-facto way of writing software - find the library that does the task, and the programmer writes “glue code” to get one part of the system to talk to another part.

Of course none of these are guarantees - whilst there is a slightly higher bar to entry for software development, it’s not much higher and there aren’t a huge number of notable qualifications someone can wave that say they can actually make software. It’s possible to write code that is pretty inscrutable - I once had to debug a piece of software written in Bulgarian, which I’m sure was fine at the time of writing but certainly limited the company (who were not Bulgarian) in who they hired to work on it next.

The Gap

This brings us (at last) to the Spreadsheet Gap - the point at which the spreadsheet is too big, too complex, hard to maintain, at risk of a single staff member moving on, in need of more users working with its data simultaneously etc. Invariably though it’s also the point at which, despite the ongoing costs and creeping inertia of said spreadsheet, it’s still a damn sight more expensive to hire developer(s) or an agency to build the same tool in custom software.

The insidious nature of the Gap gets worse though, because whilst standing at this wall of a cost increase, the wall is growing. It’s growing because the business likely hasn’t stopped generating data that’s populating the spreadsheet (and thus increasing the amount of data that eventually needs to be migrated), and it likely hasn’t stopped evolving or expanding what it does (thus requiring new work to be done to accommodate this in the existing spreadsheet). So the longer the business waits the bigger the cost.

Ultimately what will eventually push them is likely the inertia of their current system - where it’s so big it can’t change at all, or a catastrophic failure (a hidden formula bug makes them realise long term figures were off and they’re in debt, for example). Neither of these are particularly good times to embark on trying to distill all the acquired business knowledge of the general purpose system into a custom built piece of software.

I should add, it’s also often a decent leap prior to the resulting tool being useful for anything other than the very specific business case for which it’s been designed. I’ve been approached twice in the last year by people who had built fairly well accomplished spreadseheets for managing organisations (one a community organisation, one a small business) who wanted to take their idea (codified as a spreadsheet) and turn it into a SaaS platform - providing their own tool as custom software, but with enough flexibility for other organisations to use it, and in doing so make the originators some money. In both cases however I had to point out that the flexibility of the spreadsheet system that would, for example, allow them to easily pivot their spreadsheets would add a magnitude of complexity to a software build - enough to put the cost way beyond the casual “make a bit of extra money on the side” and into the “this is now what we do full time” sphere.

Bridging the Gap

I wrote this article because of thinking about these various cases - of people wanting to turn their clever spreadsheets into a business (and my assessment that it would’t work), of people running businesses on them (and my painful attempts to move them off) and of conversations on social media about the position of developers in businesses of the future.

In some avenues of technology we have seen dramatic shifts away from once widespread professions - the role of a sysadmin used to mean thousands of people managing thousands of servers, but cloud systems, virtualisation, improved fault tolerance has reduced this to tens of people managing thousands of servers.

Will we see the same pattern as improved general purpose tools allow more non-developers to link software platforms together, or will the role of developers just move away even more from the low level of does this value compare with this value and become about managing the irreducible logic problems of handling what that system says versus what this system says.

From where I am standing (with the heavy bias of being a professional software developer) I can see enough blockers to writing software becoming an “everyone” thing that I feel this profession can still support more newcomers as well as evolve for those already in it. At the same time general purpose tools likely need to learn some keys to maintainability from the software world - version control, automated testing, code review - this might allow more entry level technical tasks to be more broadly accomplished without sacrificing reliability.

Who really knows, but it will be interesting to see if or when the Spreadsheet Gap gets bridged.

Adendum

10th March 2021

After writing this article I had a discussion with someone whose own particular route to bridging the Spreadsheet Gap was currently hovering around Wordpress, the PHP-based Content Management System. Originally conceived as a blogging tool, Wordpress modified its internal data structure to allow users to specify an effectively unlimited number of different data storage types in a single application. Through both visual user interfaces and custom plugins Wordpress can therefore be stretched beyond a blog platform. Similar CMS systems such as Drupal offer even more flexibility.

Despite this flexibility I tend to find CMS-based applications can too often end up on both sides of the Spreadsheet Gap, whilst lacking the helpful bridge in between. Without a developer on-hand they are often still difficult to hammer into doing exactly what you need, stacking plugins on plugins and hoping everything is compatible. With a developer on hand they can be developed to do far more but can still become unmanageable, as the developer will often themselves be trying to stitch plugins together to avoid doing a lot of actual development.

I am sure there are examples that prove me wrong, and with the right team of software developers a decent CMS or framework can do a lot of heavy lifting, but they still require specific expertise rather than a generalised skillset.