If you’re running a software company today, it’s almost a foregone conclusion that most or all of your apps will run in the cloud. Likely Amazon’s or Google’s. It’s hard to imagine that this wasn’t always the case, but there are still some late adopters migrating their own physical data centers into managed ones. And, as with all trends in technology, this too shall pass. Just when you were getting comfortable with containers and auto-scaling, a new architecture emerges, swinging the pendulum back to a truly distributed world.
A typical self-driving car generates up to 100MB of data per second from a combination of cameras, LIDARs, accelerometers and on-board computers. That data needs to be processed nearly instantly to keep the car on the road. With so much data to sift through, the current generation of cellular networks can’t keep up. By the time data arrives in the cloud, it will be too late. Instead, data needs to be processed as close to the sensors as possible, directly at the edge of networks, on the cars themselves.
Most of us aren’t building or riding in self-driving cars (yet), but there’s a good chance we’re already interacting with edge computing every day. Neural networks in smart speakers in almost 40 million American homes are listening for words like “Alexa,” “Siri” or “Google” and, according to Statista, 3 billion Snapchats are scanned for faces each day to add those addictive face filters. By the end of the year, 20 percent of smartphones globally will have hardware-accelerated machine learning capabilities.
All of these apps and devices are made possible by two major trends: advances in deep learning algorithms that help computers see, hear and understand, and the proliferation of specialized processors like GPUs and TPUs that can run these algorithms efficiently, even in mobile environments.
Neural networks and deep learning aren’t new. In fact, the first artificial neural networks were created in the 1950s, and there have been multiple false starts since. This time, though, the abundance of labeled training data and compute power has made it feasible to train these large models. Though AI research is still proceeding at a breakneck pace, fields like computer vision are starting to mature. Developers can choose from a variety of standardized model architectures, publicly available training data sets and tools. You no longer need a PhD just to get started. Technology is being democratized.
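To see just how low that barrier has become, here’s a minimal sketch of running an off-the-shelf image classifier in a few lines of Python. It assumes TensorFlow 2.x is installed; the MobileNetV2 model and the "photo.jpg" file name are illustrative choices, not a prescription.

```python
# A minimal sketch of how accessible pretrained models have become.
# Assumes TensorFlow 2.x; "photo.jpg" stands in for any local image.
import numpy as np
import tensorflow as tf

# Download an off-the-shelf image classifier trained on ImageNet.
model = tf.keras.applications.MobileNetV2(weights="imagenet")

# Load an image, resize it to the network's expected input and preprocess it.
img = tf.keras.preprocessing.image.load_img("photo.jpg", target_size=(224, 224))
x = tf.keras.preprocessing.image.img_to_array(img)
x = tf.keras.applications.mobilenet_v2.preprocess_input(x[np.newaxis, ...])

# Run inference and print the top predicted labels.
preds = model.predict(x)
print(tf.keras.applications.mobilenet_v2.decode_predictions(preds, top=3)[0])
```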
Hardware is catching up, fast. Machine learning algorithms like neural networks are really just long sequences of matrix multiplications. Specialized processors like GPUs and newer neural processing units like those in Apple’s A11 Bionic chip and Google’s Tensor Processing Unit (TPU) are optimized for exactly these mathematical operations, offering 10-100x speedups over traditional CPUs while using less power overall. As major chip manufacturers roll out mobile-ready machine learning accelerators, every device will soon have the power to run the latest AI models.
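To make the “just matrix multiplications” claim concrete, here’s a toy sketch of a single fully connected layer’s forward pass in NumPy. The shapes and random values are made up for demonstration; this is the kind of operation GPUs and NPUs are built to accelerate.

```python
# A toy illustration: a fully connected neural network layer is a matrix
# multiplication plus a bias and a non-linearity. Shapes are arbitrary.
import numpy as np

rng = np.random.default_rng(0)

batch = rng.standard_normal((32, 512))     # 32 input vectors of 512 features
weights = rng.standard_normal((512, 256))  # learned layer weights
bias = rng.standard_normal(256)            # learned layer bias

# One layer's forward pass: a single large matrix multiply, then ReLU.
activations = np.maximum(batch @ weights + bias, 0.0)

print(activations.shape)  # (32, 256)
```

A deep network is dozens or hundreds of these multiplies stacked back to back, which is why dedicated matrix hardware pays off so dramatically.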
Big data, data science, machine learning and now deep learning have been slowly weaving their way into products and companies for the past decade. Most of the time, this happened behind the scenes, up in the cloud. Data warehouses and analytics pipelines process records en masse. Results are made accessible to end users through APIs and database queries. That’s not going away, but the edge presents a new opportunity to put the predictive power of machine learning models to work with far less delay.
Now, the algorithms move to the data. Information is processed in real time, as soon as it’s captured by the sensor, and results are available immediately. In this latency-free world, entirely new user experiences are possible. Your phone’s screen becomes a portal to a world of augmented reality. Products can be personalized for a single user while private data never leaves the device. Applications become ambient and frictionless, anticipating questions and answering them before you ask.
When done right, experiences made with AI and edge computing feel like magic, but building them is incredibly complex. There is a divide between the tech stacks used to train and deploy machine learning models in the cloud and the ones used to build applications for edge devices like smartphones and IoT hardware. Neural networks can replace thousands of lines of procedural code, but they fail in unexpected, silent ways and need to be tested differently. Performance issues that can be solved in the cloud by simply adding more compute or memory call for specialized optimization when they occur out on edge devices we don’t control. Even the programming languages preferred in the cloud are different from those used to build applications on mobile devices.
This is starting to change. Tools and hardware are improving so quickly it’s hard to keep up. Heavyweights like Apple and Google have made mobile machine learning frameworks (Core ML and TensorFlow Lite, respectively) centerpieces of their latest developer offerings. Every week, more export options and better interoperability are added to tools like AWS’s SageMaker, Azure’s ML Studio and IBM’s Watson Studio.
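As a taste of what that cloud-to-edge handoff looks like today, here’s a minimal sketch of exporting a model with TensorFlow Lite. It assumes TensorFlow 2.x, and the tiny network below is a hypothetical stand-in for whatever you actually trained in the cloud.

```python
# A minimal sketch of the cloud-to-edge handoff with TensorFlow Lite.
# Assumes TensorFlow 2.x; the tiny model is a placeholder for a real one.
import tensorflow as tf

# Placeholder model -- in practice this would be your trained network.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Convert to a compact TensorFlow Lite flatbuffer for on-device inference.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # size/latency optimizations
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting .tflite file can be bundled directly into a mobile app, where the on-device interpreter runs it without a round trip to the cloud.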