NVIDIA has released a video demonstrating how artificial intelligence can turn an ordinary video into smooth, natural slow motion. The effect resembles the slow-motion shots in films, where the viewer can clearly follow a continuous sequence of movement. So how is this achieved? Let’s look at the technology behind it.
Slow motion: not as simple as you think
We have all seen slow-motion effects in films and TV. In the recently concluded World Cup, for example, goals, shots, and physical contact between players were replayed in slow motion; through VAR’s frame-by-frame playback, viewers could see clearly what happened in an instant (Figure 1).
Figure 1 VAR frame-by-frame playback in the World Cup
Slow motion in films is actually achieved with high-speed photography equipment: footage is shot at 50 fps, 100 fps, or even higher, then played back at the regular 24 fps. One second of captured action thus plays back over roughly two to four seconds, producing the slow-motion effect.
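The relationship between capture rate and playback rate can be expressed in a few lines. This is a simple illustration of the arithmetic above, not anything specific to NVIDIA’s system; the function name is our own.

```python
# Slow-motion factor from capture and playback frame rates:
# 1 second of action captured at capture_fps, played back at
# playback_fps, lasts capture_fps / playback_fps seconds on screen.

def slowdown_factor(capture_fps: float, playback_fps: float = 24.0) -> float:
    return capture_fps / playback_fps

print(slowdown_factor(50))   # ≈ 2.08: 1 s of action plays for just over 2 s
print(slowdown_factor(100))  # ≈ 4.17: 1 s of action plays for just over 4 s
```

This matches the article’s "two to four seconds" figure for 50–100 fps footage played back at 24 fps.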
Of course, ordinary users do not own high-speed cameras, so how can they get slow motion? NVIDIA recently demonstrated a deep learning system, built on the cuDNN-accelerated PyTorch framework and running on the powerful NVIDIA Tesla V100 GPU, that can stretch any video in time, producing slow-motion effects similar to those in films (Figure 2).
Figure 2 NVIDIA demonstrates the moment a tennis racket strikes colored ink
Behind the video stretching – artificial intelligence slow-motion technology
From the above we know that conventional slow motion is achieved by playing high-speed footage at a low speed. So how does NVIDIA achieve the effect with ordinary video, which was already shot at a normal frame rate?
The core of slow motion is stretching the original video in time. But if an ordinary video is simply played back slowly, the result stutters: the motion between frames becomes incoherent. To stretch ordinary video into smooth slow motion, the system must both locate the moving objects in the video and fill in the missing intermediate frames.
Take a video of a drifting car. To show the drift in slow motion, the system must first locate the car precisely in every frame; only with that accurate, moment-by-moment positioning can it reproduce the car’s entire subsequent drifting motion (Figure 3).
Figure 3 Drifting car
Once the object has been located, the video must also be frame-filled, because the goal is a slow-motion effect. The original was shot at a normal rate, so stretching it would otherwise cause dropped frames and stutter; precise frame interpolation is needed so that the stretched video still plays smoothly (Figure 4).
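To see why frame filling is harder than it sounds, consider the most naive way to invent an intermediate frame: a linear cross-fade between two neighboring frames. This sketch (our own illustration, not NVIDIA’s method) shows that blending smooths brightness but produces "ghosts" of moving objects instead of placing them along their path, which is exactly why motion-aware interpolation is required.

```python
import numpy as np

def crossfade(frame_a: np.ndarray, frame_b: np.ndarray, t: float) -> np.ndarray:
    """Blend two frames linearly; t=0 gives frame_a, t=1 gives frame_b."""
    return (1.0 - t) * frame_a + t * frame_b

# A single bright pixel that moves from column 1 to column 3.
a = np.zeros((4, 4)); a[1, 1] = 1.0
b = np.zeros((4, 4)); b[1, 3] = 1.0

mid = crossfade(a, b, 0.5)
# Instead of one pixel at the halfway point (1, 2), the cross-fade
# leaves two half-bright ghosts at the start and end positions:
print(mid[1, 1], mid[1, 2], mid[1, 3])  # 0.5 0.0 0.5
```

A correct intermediate frame would show the pixel at column 2; the cross-fade cannot produce that without knowing the motion, which is what the AI must learn.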
Figure 4 NVIDIA shows slow-motion video of a dancer dancing
In this way, NVIDIA’s artificial intelligence framework turns any video into slow motion through object positioning and frame filling. So how does the system learn to do this?
NVIDIA’s technology combines the Tesla V100 GPU’s powerful video processing capability with a deep learning framework. After building the framework, NVIDIA fed it about 11,000 prepared video clips as training data, from which it learned both positioning and frame filling. In the dancing video above, for example, the system locates the dancer and learns, from each frame of her movement, what state she should appear in at the next frame. Through its algorithms and learning models, the trained network can then apply the same positioning and motion decomposition to other videos, turning ordinary footage into slow motion through accurate positioning and frame completion (Figure 5).
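The idea of motion-aware frame filling can be sketched in miniature. In this toy example (again our own illustration: NVIDIA’s system predicts dense per-pixel motion with a trained neural network, whereas here the "motion estimate" is a single horizontal shift found by brute-force matching), the object is located by estimating its displacement between two frames, and an intermediate frame is rendered by moving it part of the way along that path.

```python
import numpy as np

def estimate_shift(frame_a: np.ndarray, frame_b: np.ndarray, max_shift: int = 3) -> int:
    """Find the horizontal shift that best maps frame_a onto frame_b."""
    best, best_err = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        err = np.sum((np.roll(frame_a, s, axis=1) - frame_b) ** 2)
        if err < best_err:
            best, best_err = s, err
    return best

def interpolate(frame_a: np.ndarray, frame_b: np.ndarray, t: float) -> np.ndarray:
    """Render an intermediate frame by warping frame_a a fraction t of the motion."""
    shift = estimate_shift(frame_a, frame_b)
    return np.roll(frame_a, round(shift * t), axis=1)

# An object at column 1 that moves to column 3 in the next frame.
a = np.zeros((4, 8)); a[2, 1] = 1.0
b = np.zeros((4, 8)); b[2, 3] = 1.0

mid = interpolate(a, b, 0.5)
# Unlike a cross-fade, the object is rendered at the halfway point:
print(int(np.argmax(mid[2])))  # 2
```

The learned version of this idea replaces the brute-force shift search with a network that predicts motion for every pixel, which is what lets it handle dancers, cars, and splashing ink rather than a single sliding dot.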
Figure 5 NVIDIA shows a person jumping from a height to crush a balloon instantly in slow motion
And it is not just slow motion: with new training methods, the AI can generate new images from existing ones, and even new portraits composed from different people, similar to the face-swapping effects seen in films. NVIDIA’s AI framework can seamlessly switch a character from one face to another (Figure 6).
Figure 6 NVIDIA shows face-changing effects
Slow motion, bringing more fun to life
The demonstrations above show the power of NVIDIA’s AI framework in video processing, and the technology can bring plenty of fun to everyday life.
With smartphones everywhere, we all shoot short videos, and for fleeting moments we always want to see the whole process clearly. A mother who enjoys square dancing, for instance, may find the coach’s fast dance rhythm too quick to follow in full. Now she can simply film the routine with her phone and, with the help of NVIDIA’s technology, slow the dancers’ rapid movements down enough to study every step.
Likewise, NVIDIA’s face-changing technology lets us make funnier videos on the phone, such as turning a roommate’s face into a cute cat’s and sharing the result with friends over WeChat. These technologies can also make video work easier: slow down footage of a piano teacher playing to study the fingering, or a player’s shot to enjoy it at leisure.