The COVID-19 pandemic has turned the Zoom video conferencing app into a household name, no longer just the preserve of business people who ask if they can “send you a calendar”. The sudden transformation of social life across much of the world (the western part at least) into an online-only activity places a much-loved tool into a tumultuous new market, where customers have different desires, expectations and understanding.
Needless to say, it has not been a smooth ride for Zoom.
There have been plenty of opinion pieces on the matter; I imagine I’ll not bring anything new to the table. But Zoom right now provides an interesting study into the mechanics of success for a technology startup, and shows how important assumptions are when building a product - especially when considering how the security posture of such a product interacts both with its development and with its adoption.
The issues raised so far include:
- Zoombombing - a practice that has sprung up whereby people attempt to randomly enter the Zoom meetings of others, generally for fun or sometimes to cause harm or offence.
- An older (2019) issue that allowed a call to be initiated without user confirmation, with video enabled by default.
- Confusion around the “attention tracking” feature, designed to let a host know if you’re not actively using the Zoom application whilst on a call.
- Questions around use of the term “end to end encryption” where it turned out that the “end to end” part wasn’t entirely truthful, in some cases.
- Accidental data collection (sending user data on iOS to Facebook).
- Terms of service that hinted Zoom might use meeting content for analytics and marketing purposes.
Zoom have taken steps to remediate these issues by modifying default settings, removing features and amending terms, but this isn’t likely to be the end of it. So how does a company that’s worth so much (market cap ~$30 billion) and trusted by so many businesses make such serious mistakes, which come to light so quickly after a spike in usage?
A common phrase to hear around business startups is to “validate assumptions” - it takes account of the fact that every person involved in creating a business will have assumptions about how their business interacts with the world around it. If I build a business assuming people will pay $100 for a burger cooked in the back of my car, I’m going to have a nasty shock and go bust pretty quickly. Generally the more audacious the business plan the more you need to validate - but the bigger the payout of successfully doing so. Validating assumptions is also not just for proving your business as it stands will work - you may find your assumptions are wrong but spot another opportunity with what you’ve learned.
Zoom have been successful in the business world largely because they made one major assumption: if you solve the key problems of putting people into an online meeting, they don’t really care how you got there. The problem domain has been tackled by many companies previously (Skype, Vonage, GoToMeeting, PowWowNow, Google in various guises) and it is somewhat surprising on the face of it that there wasn’t already a king in this space before Zoom (founded 2011, IPO 2019). But that belies the complexities around video calling:
- A range of devices: personal computers (Windows, Mac, Linux), smartphones, tablets, internet-connected TVs and desk conference phones
- A range of connectivity: WiFi, 3G/4G/5G, Ethernet, phone lines, satellite and no doubt others. Add to this corporate networks and firewalls
- A vast range of technical understanding among users
- The need to get two people to connect on the platform via communications which are, by definition, not on the platform
A phone call is comparatively easy. Phones were able to standardise on a restricted, basic protocol for voice before the internet democratised communications. However, when everybody is free to determine their own way of doing things, convincing two people who want to speak to each other to use your platform is hard. Even if one does, the other might not, and your platform just lost value.
Attempting to find simple answers to hard challenges generally requires something be left out, and in Zoom’s case that nearly always seems to have been security.
Zoombombing has become the practice of typing random 9-digit meeting codes into Zoom and hoping to crash a meeting. The 9-digit code is Zoom’s attempt to solve the “how do we find each other” problem of organising a meeting. Some platforms have URLs with passwords, others require two people to “add” one another as contacts (possibly with an approval step), and phone services have meeting IDs and PINs. Zoom went basic: give someone a 9-digit number that is short enough to be memorable, easy to type and easy to verify, and they can connect to you.
The security assumption hidden here is that 9 digits is still enough complexity to prevent somebody randomly stumbling across your meeting. Nine digits give 1,000,000,000 possible meeting IDs, which seems like a lot given it’s roughly a seventh of the planet’s population. However this assumption doesn’t stand up to scrutiny.
For a start, even if exactly one Zoom meeting were happening in the world, the number of attempts needed to find it isn’t 1,000,000,000; on average it’s about 500,000,000, assuming guesses are random. But in Zoom’s case there isn’t just one meeting happening; there are tens or hundreds of thousands, or even millions. My chance of finding your particular meeting is still very low, but with a million meetings in progress roughly one guess in a thousand lands on some live meeting, so it becomes reasonable to stumble upon one without a prohibitively high number of guesses. And now we have Zoombombing.
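To put rough numbers on that, here is a minimal Python sketch of the arithmetic. It assumes every one of the 1,000,000,000 possible 9-digit IDs is equally likely to be in use and that guesses are uniformly random; the function names and the figure of a million live meetings are illustrative, not anything published by Zoom.

```python
ID_SPACE = 10**9  # assumption: every 9-digit ID (000000000-999999999) is valid


def expected_guesses(live_meetings, id_space=ID_SPACE):
    """Average number of distinct random guesses before hitting any live meeting.

    For m live IDs hidden among n possible IDs, guessing without repeats,
    the mean position of the first hit is (n + 1) / (m + 1).
    """
    return (id_space + 1) / (live_meetings + 1)


def chance_within(guesses, live_meetings, id_space=ID_SPACE):
    """Probability that at least one of `guesses` independent random guesses
    lands on a live meeting."""
    miss = 1 - live_meetings / id_space
    return 1 - miss**guesses


if __name__ == "__main__":
    for live in (1, 10_000, 1_000_000):
        print(f"{live:>9,} live meetings: "
              f"about {expected_guesses(live):,.0f} guesses on average")
    print(f"Chance of a hit within 1,000 guesses at a million meetings: "
          f"{chance_within(1_000, 1_000_000):.0%}")
```

With a single meeting the average is about 500 million guesses, but with a million live meetings it drops to about 1,000, and a script making 1,000 guesses has roughly a 63% chance of landing in somebody’s meeting.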
The 2019 auto-answer issue isn’t part of the recent raft of problems, but it had already shown that Zoom was prepared to solve its business challenges without recourse to good security practice where required. Ask anybody whether they want a piece of software on their computer that automatically turns on their webcam and they’ll say “of course not”.
But ask those same people if they’re tired of video meetings that start late, where half the people have sound or video problems, and they’ll say “of course”. Well, it turns out you can solve your calling woes by just answering automatically and turning on video too!
Security is often a trade off with convenience, and in this case Zoom went all out on the convenience at the expense of sensible security. Furthermore it’s a failure of user education, as nobody would agree to this given a free choice.
The attention tracking feature is actually quite interesting. If I’m using Zoom to share a plan with my team, how do I know they’re not all just browsing Reddit? In this case I think the failure was one of communication, but also a consequence of the new arena Zoom found itself in. If you work on a corporate machine on a corporate network you’re probably already being spied on by your employer anyway; and if you have people reporting to you, you’ll love the idea of them not being able to ignore you!
However, put this in a public context and let the internet rumour machine get to work, and a feature that tells someone whether you’re watching their screen quickly becomes “the host can see exactly what you’re doing on your phone”, and the feature gets taken out of the app. Public opinion shifts quickly.
I’ve written about encryption before if you want to know what “end to end” is in detail. But the simple answer is, “end to end” means only the people at each end of the call have the ability to listen to it. Nobody carrying the signal (ISP, corporate networks, Zoom themselves) can listen in.
However Zoom is trying to solve connectivity problems here. Some people can’t get online but have a regular phone, and some people want meetings recorded. To win the market Zoom had to cater to these niches, but if you’re not talking between compatible software clients that can carry out cryptographic key exchanges you can’t do end to end encryption. More interestingly, if you want a phone participant to listen in, even the people using the software can’t use end to end encryption, because Zoom have to put some software in the middle that translates to a regular telephone call.
The assumption being made here is a fairly ignorant one - that people won’t question encryption as long as you mention the keywords “end to end”.
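The key exchange point can be sketched in a few lines of Python. This is a toy Diffie-Hellman exchange for illustration only: the prime, the generator and all the names are my own choices, it is not secure, and it says nothing about how Zoom actually implements encryption. It simply shows that two compatible clients can agree on a key that a relay carrying their messages never sees, whereas a phone bridge has to hold a key itself in order to translate the call.

```python
import hashlib
import secrets

# Toy parameters, chosen for readability rather than security.
P = 2**127 - 1  # a Mersenne prime
G = 3


def keypair():
    """Return a (private, public) pair for the toy exchange."""
    private = secrets.randbelow(P - 3) + 2  # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def session_key(my_private, their_public):
    """Derive a shared key from my private value and the other side's public value."""
    shared = pow(their_public, my_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


# End to end: two software clients exchange public values via a relay.
alice_priv, alice_pub = keypair()
bob_priv, bob_pub = keypair()
relay_sees = (alice_pub, bob_pub)  # the relay only ever handles public values

alice_key = session_key(alice_priv, bob_pub)
bob_key = session_key(bob_priv, alice_pub)

# With a phone participant, a bridge must decrypt the audio to re-send it as
# a normal call, so it performs its own exchange and holds a working key.
bridge_priv, bridge_pub = keypair()
alice_bridge_key = session_key(alice_priv, bridge_pub)
bridge_key = session_key(bridge_priv, alice_pub)

if __name__ == "__main__":
    print("Alice and Bob share a key:", alice_key == bob_key)
    print("Bridge can read Alice's audio:", alice_bridge_key == bridge_key)
```

In the first case only Alice and Bob can derive the key, which is what “end to end” means. In the second, the bridge is one of the “ends”, so whoever runs it can listen in.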
The Facebook data collection and the terms of service are pretty simple to explain. One way a company gets to the top is to move quickly, and that often means a lack of care about exactly what the application is doing or what the terms actually say. Most likely somebody added Facebook tracking as a test, the team decided against using it long term and then moved on without properly removing it. The terms were potentially insidious, but more likely just generic wording - a business communication company would be insane to actually sell user communication data for marketing when it has a customer base like Zoom’s.
The idea of startups moving fast and breaking things really should have died a long time ago. What’s interesting is that inherent flaws in a business product used by companies with IT departments, technical staff, compliance and regulations went unnoticed until the software was put in front of the general public. This suggests that general awareness of the importance of privacy and security is growing, and hopefully the next company that sees a shortcut to the top of their market by discarding sensible security practice will think again.