Software-based video stabilization has evolved rapidly in the last several years, and that evolution has arrived at just the right time. What started largely as an effort to make smartphone video look better for consumers has grown into a necessity: steady video communication among distributed workers of all types, driven in part by new form factors and the rise of remote work.
As the Covid-19 pandemic enters its third year, enterprises continue to adapt and exploit technology to get work done, whether through virtual conferences, telehealth, e-commerce, or something else. Early on, video communication became critical to ongoing operations for millions of knowledge workers: first because they were all working from home, and later to foster collaboration in so-called “hybrid” work situations, where not everyone could be together in person and some had to be patched in via video.
But hybrid work means something different in industries like construction or manufacturing, where the work is distinctly non-digital and most workers don’t have a laptop and webcam to collaborate with. Still, there is a need for video communication in the field or on the shop floor, partly to support social distancing and partly to connect distributed experts who work remote from one another even in non-pandemic times.
For example, analysts have observed a significant rise in the adoption of augmented reality (AR) headsets to facilitate connected work. Last year, ABI Research trumpeted the growth of AR usage in industrial markets such as manufacturing, transportation/automotive, energy and utilities, and logistics, as well as architecture, engineering, and construction (AEC). Using AR and video-enabled mobile devices, along with a burgeoning selection of smart glasses and headsets, industrial companies are empowering a whole new class of worker to collaborate visually: a hands-free video call from a work site, visual documentation of inspections, recorded processes for training or reference, or a remote expert patched in to consult on a situation both parties can see at the same time.
But none of this can work well without video stabilization. A construction worker or utility inspector wearing a video-enabled headset is no steady tripod. They move, shake, and cough, all of which makes the video they capture difficult to view accurately. For remote experts who need to “see” what field workers are seeing, jumpy video is a recipe for nausea.
Evolution of software-based video stabilization
The good news is, we’re entering a new generation of software-based video stabilization that can support whatever mobile video use cases are thrown at us. In fact, we’re reaching a point where stabilization can be so good it’s almost unnatural. At that point, we can start to reintroduce motion as appropriate, where it supports the ultimate use case. Imagine a surgeon with a video headset streaming an operation: we want to stabilize the effects of her head moving, for example, but not the heartbeat of the patient.
Software video stabilization started out image-based, using optical flow methods to compare video frames, detect how pixels move from one frame to the next, and compensate for that motion to produce more stable video. The results were good but required more compute power than early mobile devices could spare.
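To make the image-based approach concrete, here is a minimal sketch in Python using OpenCV: track corner features with pyramidal Lucas-Kanade optical flow, fit a rigid transform per frame pair, smooth the accumulated camera path, and warp each frame toward the smoothed path. The parameter values and the moving-average smoother are illustrative assumptions, not a description of any particular product’s pipeline.

```python
# Minimal sketch of image-based stabilization via optical flow (OpenCV).
import cv2
import numpy as np

def stabilize(in_path, out_path, radius=15):
    cap = cv2.VideoCapture(in_path)
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    transforms = []  # per-frame (dx, dy, d_angle)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Optical flow: follow corner features from the previous frame.
        pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                      qualityLevel=0.01, minDistance=30)
        nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
        good = status.flatten() == 1
        # Fit a rigid (translation + rotation) inter-frame camera motion.
        m, _ = cv2.estimateAffinePartial2D(pts[good], nxt[good])
        if m is None:
            m = np.eye(2, 3)
        transforms.append((m[0, 2], m[1, 2], np.arctan2(m[1, 0], m[0, 0])))
        prev_gray = gray

    # Smooth the cumulative trajectory with a moving average; the gap
    # between the raw and smoothed paths is the jitter to remove.
    traj = np.cumsum(transforms, axis=0)
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    smooth = np.stack([np.convolve(traj[:, i], kernel, "same")
                       for i in range(3)], axis=1)
    corrected = np.array(transforms) + (smooth - traj)

    cap.set(cv2.CAP_PROP_POS_FRAMES, 0)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    out = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"),
                          cap.get(cv2.CAP_PROP_FPS), (w, h))
    for dx, dy, da in corrected:
        ok, frame = cap.read()
        m = np.array([[np.cos(da), -np.sin(da), dx],
                      [np.sin(da),  np.cos(da), dy]])
        out.write(cv2.warpAffine(frame, m, (w, h)))
    cap.release(); out.release()
```

Note that every step reads pixels: feature detection, flow tracking, and transform fitting all run on full frames, which is exactly the compute burden the passage above describes.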
As smartphones began including gyroscopes, stabilization evolved to be gyro-based rather than image-based. Stabilization algorithms no longer needed to read actual images from the camera sensor to understand how motion was affecting the video. The tricky part was correlating the motion detected in gyro data with the actual motion of the camera sensor, which required a calibrated model of the camera system (lens, sensor geometry, and the timing between gyro samples and frame capture) to map measured rotations onto pixel motion. The stabilization software could then tell the device’s rendering pipeline how to adjust and crop frames to stabilize the video. Gyro-based stabilization is very good and power-efficient, making it ideal for mobile devices like smartphones and AR headsets.
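As a hedged sketch of that idea: integrate angular-velocity samples into an orientation path, smooth the path, and re-render each frame through a corrective homography built from the camera intrinsics. The SciPy-based helpers below are illustrative; the intrinsic matrix K, sample rate, and smoothing window are assumed values, and real systems must also handle rolling shutter.

```python
# Sketch of gyro-based stabilization: orientation integration plus a
# corrective homography H = K * R_delta * K^-1 (assumed pure rotation).
import numpy as np
from scipy.spatial.transform import Rotation as R

def integrate_gyro(omega, dt):
    """Integrate angular-velocity samples (rad/s, shape [N, 3]) into orientations."""
    rots = [R.identity()]
    for w in omega:
        rots.append(rots[-1] * R.from_rotvec(np.asarray(w) * dt))
    return rots[1:]

def smooth_orientations(rots, radius=15):
    """Moving average over a window of orientations."""
    out = []
    for i in range(len(rots)):
        lo, hi = max(0, i - radius), min(len(rots), i + radius + 1)
        out.append(R.concatenate(rots[lo:hi]).mean())
    return out

def correction_homography(K, r_actual, r_smooth):
    """Warp that re-renders a frame as if shot from the smoothed orientation."""
    r_delta = (r_smooth * r_actual.inv()).as_matrix()
    return K @ r_delta @ np.linalg.inv(K)
```

The returned homography would be applied per frame (for example with cv2.warpPerspective), with the crop margin absorbing the correction. No pixels are read to estimate motion, which is where the power efficiency comes from.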
And for the most part, this is where we are today. But if we combine the two stabilization paradigms, image-based and gyro-based, we can achieve even better, sub-pixel stabilization that future applications may require and benefit from. As camera resolutions increase, and as industrial visualization applications exploit those higher resolutions, video stabilization will have its work cut out for it, and it will ultimately provide the foundation for application success.
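One way to layer the two, sketched below under stated assumptions: apply the gyro-derived correction first, then run a short image-based pass (here OpenCV’s ECC alignment) to estimate the small sub-pixel residual between consecutive corrected frames and fold it into the final warp. This illustrates the layering concept only; it is not a description of any shipping implementation.

```python
# Hedged sketch: image-based refinement on top of a gyro correction.
import cv2
import numpy as np

def refine_residual(prev_gray, cur_gray):
    """Estimate the sub-pixel translation left over after gyro correction."""
    warp = np.eye(2, 3, dtype=np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 50, 1e-6)
    # ECC maximizes an intensity-correlation measure; it can converge to
    # sub-pixel accuracy precisely because the gyro pass already removed
    # the gross motion, leaving only a tiny residual to estimate.
    _, warp = cv2.findTransformECC(prev_gray, cur_gray, warp,
                                   cv2.MOTION_TRANSLATION, criteria, None, 5)
    return warp[:, 2]  # (dx, dy) residual to fold into the rendering warp
```

The refinement pass is the expensive part, which is why, as noted below, devices often reserve it for lower resolutions.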
Software-based stabilization solutions that overlay image-based stabilization on gyro-based stabilization exist today, but they are compute-intensive. Flagship mobile devices that offer both are often designed to fall back to gyro-based stabilization at very high resolutions, like 8K. But the time is coming quickly when a smartphone or headset, with faster, dedicated processors, will be able to perform the most advanced video stabilization available.
That becomes the canvas on which we paint new, advanced video applications. As more and better video cameras enter the workforce for new and different uses, video stabilization will be critical — to a point. But the future of mobile video isn’t more stabilized; it’s better stabilized. And the way to get there is to achieve super stable video through software—video so steady it may appear unnatural—and then allow or reintroduce motion using artificial intelligence.
In other words, as video and AR devices permeate industries and require advanced stabilization, there will inevitably be a need to discern unwanted motion from wanted motion. Not all camera bumps, shakes, and sweeps are bad, and this new-generation technology can sense the difference. Call it smart video stabilization or intent-based video stabilization; either way, it is possible without a significant power penalty.
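One simple way to picture how such a system might separate the two, assuming deliberate motion (a pan, walking toward a subject) is low-frequency while handshake is high-frequency: low-pass the camera’s motion path and cancel only the residual. The cutoff and filter order below are assumptions for illustration; a production system might use learned models instead, but the separation principle is the same.

```python
# Illustrative split of camera motion into "intent" and "shake" (SciPy).
import numpy as np
from scipy.signal import butter, filtfilt

def split_motion(path, fs=30.0, cutoff_hz=1.0):
    """Split a 1-D motion trace (e.g., yaw per frame) into intent and shake."""
    b, a = butter(2, cutoff_hz / (fs / 2))  # 2nd-order low-pass filter
    intent = filtfilt(b, a, np.asarray(path))  # motion to keep or reintroduce
    shake = np.asarray(path) - intent          # residual to cancel
    return intent, shake
```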
This better stabilization is possible now. Its application to myriad use cases is also coming into better view.
Johan Svensson is chief technology officer of Imint Image Intelligence AB, inventors of Vidhance video enhancement software.