Computational Challenges aka How to Eat An Elephant
How do you eat an elephant? One bite at a time.
Apologies to the vegetarian readers, but it’s a metaphor which serves to illustrate the necessity of decomposing an impossible task into small “bites”.
We wrestled with an elephant-sized challenge of this sort in the early days of FloodMapp: how do you provide street-level detail at national scale, all while keeping processing times under an hour? One bite at a time.
In practice, this meant that rather than trying to run a single model covering 1.8 million square kilometres, we would break that model up into smaller bites. Cloud computing lets you run massive amounts of work in parallel. The trick with parallelisation is that each bite, or task, has to be independent. If one task depends on the next, and that one on the next, then those tasks are serial and not suited to parallelisation.
Unfortunately for us, a river network is a linear system where changes in hydraulic behaviour propagate downstream. Rather than be overwhelmed by a huge geographical region, we approach this by breaking off regions within a network and then synchronising them around a known value.
FloodMapp pulls in data from thousands of different river stage sensors which tell us how high the water is in absolute terms at a specific place and time. It’s these stations that we use to help us break up a large geographical area (the elephant) into smaller bites.
Stations are paired, one upstream and one downstream, to bound a small model domain. These domains can run in parallel, so we can run each of the thousands of models once every hour. By making the largest region thousands of times smaller, we can run physical models at the required scale in parallel. Now the overall speed of the system is limited only by the single largest bite; everything else happens at the same time.
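To make the idea concrete, here is a minimal Python sketch of splitting one river reach at its gauges and running the pieces at once. It is an illustration under stated assumptions, not FloodMapp's actual code: the names `Domain`, `split_into_domains`, `run_domain`, `run_all` and `wall_clock_estimate` are hypothetical, the station list is assumed to be ordered from upstream to downstream along a single reach, and the "model" is a trivial stand-in for a real hydraulic solver. A thread pool stands in for what would more likely be separate cloud workers.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    upstream: str    # gauge ID at the upstream boundary (hypothetical)
    downstream: str  # gauge ID at the downstream boundary (hypothetical)


def split_into_domains(stations):
    """Pair each station with the next one downstream along a single reach."""
    return [Domain(up, down) for up, down in zip(stations, stations[1:])]


def run_domain(domain, stage_m):
    """Stand-in for a model run. It reads only its own two boundary
    observations, which is what makes each domain independent."""
    up = stage_m[domain.upstream]
    down = stage_m[domain.downstream]
    return {"domain": domain, "mean_stage_m": (up + down) / 2}


def run_all(stations, stage_m, max_workers=8):
    """Run every domain concurrently and return results in reach order."""
    domains = split_into_domains(stations)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda d: run_domain(d, stage_m), domains))


def wall_clock_estimate(runtimes_s):
    """Serial time is the sum of all runs; ideal parallel time is the slowest run."""
    return sum(runtimes_s), max(runtimes_s)


if __name__ == "__main__":
    stations = ["A", "B", "C", "D"]  # ordered upstream to downstream
    stage_m = {"A": 12.0, "B": 10.0, "C": 7.0, "D": 5.0}
    for result in run_all(stations, stage_m):
        print(result)
    print(wall_clock_estimate([30.0, 45.0, 20.0]))
```

Because each domain touches only its two boundary readings, no run waits on another. The `wall_clock_estimate` helper shows why the largest bite sets the pace: with enough workers, the total time approaches the slowest single run rather than the sum of all of them.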
That said, the benefits of parallelisation don’t come for free. The trade-off is that you gain speed at the expense of complexity. But arguably, this is a cost that must be borne. A single large model will run more slowly, even when it isn’t competing with anything else for resources. But in the end, you are left with just one result to manage.
Parallelising the problem means you have thousands of models running at once, all of them needing memory, disk space and CPU. Moreover, the models will be different: in size, in shape, in location. In an ideal world these models would have absolutely everything included, but again, there’s no such thing as a free lunch.
Cloud computing is flexible and empowering, but it bills by the millisecond. The more you add in, the more the computational costs add up. Over-provisioning models is akin to pouring water down the drain. And in the world of startups, there are no scraps left on the lunch table. We are lean, and we are hungry to get the most out of our models.
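One way to picture right-sizing is to give each model the smallest memory allocation that still fits it, rather than one generous size for all. The sketch below is a hedged illustration only: the tier values, the `bytes_per_cell` figure, the `headroom` factor and the function name `pick_memory_tier` are all assumptions made up for this example, not figures from our platform or from any particular cloud provider.

```python
# Hypothetical memory tiers in MB; real options depend on the cloud provider.
MEMORY_TIERS_MB = (512, 1024, 2048, 4096, 8192)


def pick_memory_tier(cell_count, bytes_per_cell=64, headroom=1.25):
    """Return the smallest tier that fits the model's estimated footprint.

    cell_count     -- number of grid cells in the model domain
    bytes_per_cell -- assumed memory per cell (illustrative value)
    headroom       -- safety margin on top of the raw estimate
    """
    needed_mb = cell_count * bytes_per_cell * headroom / (1024 * 1024)
    for tier in MEMORY_TIERS_MB:
        if tier >= needed_mb:
            return tier
    raise ValueError(
        f"model needs about {needed_mb:.0f} MB, more than the largest tier"
    )


if __name__ == "__main__":
    print(pick_memory_tier(1_000_000))   # small domain
    print(pick_memory_tier(20_000_000))  # larger domain
```

A domain that is too large for every tier raises an error instead of silently being given too little memory, which is a useful signal that the bite should be cut smaller.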
We are incredibly proud of what our team has built over the past four years. It has been a massive undertaking to build thousands of models that are running continually, but more than that, we have a framework that orchestrates each of them and ensures they have access to the most recent sensor observations. And all of this is run in a cost-effective manner.
Some days it feels like with each step forward, there is a longer road ahead. We are moving fast and making incredible progress, and still there is a long way to go. But we do what we do because we know it makes a difference, and the more areas we can map, the bigger the impact we can make.