As technology has advanced, using neural networks for image matting and face replacement is nothing new. Last year, a user abroad used neural networks to restore an old film, “remaking” the 1895 film “Arrival of a Train at La Ciotat” into a “new movie” in 4K/60fps quality (Figure 1). So how did the neural network do this, and what technology is at work behind the scenes?
Figure 1 “Arrival of a Train at La Ciotat”, shot in 1895
Two measures of video clarity and smoothness: resolution and frame rate
Anyone who often watches videos on iQiyi or Youku knows that, if the Internet connection is fast enough, the picture quality switch in the playback interface lets you change the video quality to “HD 540P”, “720P”, “1080P”, and so on (Figure 2).
Figure 2 Youku quality switch
The higher the resolution, the more pixels make up the video picture and the more detail it can show (Figure 3).
Figure 3 Picture quality at different resolutions
The other parameter, the frame rate, determines how smooth the video looks. The higher the frame rate, the smoother the picture appears; moving objects in particular are less likely to leave ghosting or trailing images.
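To put numbers on the difference between these quality levels, the short sketch below counts the pixels in one frame. The frame sizes are the usual 16:9 dimensions for each label and are an assumption here; a given site may encode its videos at slightly different sizes.

```python
# Common 16:9 frame sizes for each quality label (assumed, not taken from any site).
RESOLUTIONS = {
    "540P": (960, 540),
    "720P": (1280, 720),
    "1080P": (1920, 1080),
    "4K": (3840, 2160),
}

def pixel_count(label):
    """Return the number of pixels in one frame at the given quality label."""
    width, height = RESOLUTIONS[label]
    return width * height

for label in RESOLUTIONS:
    print(f"{label}: {pixel_count(label):,} pixels per frame")
```

Under these assumed sizes, a 4K frame holds four times as many pixels as a 1080P frame, which is why enlarging a low-resolution source to 4K requires filling in most of the picture.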
Transforming the old movie: neural-network resolution enhancement and frame interpolation
In this “remake” of the old movie, the user mainly used neural networks to raise the resolution and the frame rate of the original video.
First, resolution enhancement. Traditionally, enlarging a low-resolution image or video to a higher resolution relies on an interpolation algorithm. For each pixel in the target resolution, the algorithm finds the corresponding position in the source image according to the scaling ratio and computes a value from the neighboring source pixels, trying to strike a balance between blurring and jaggedness at the image edges. Even so, images and videos enlarged this way are prone to blurring and jaggies (Figure 4).
Figure 4 Blurring and jaggies after zooming by interpolation algorithm
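The traditional approach can be sketched in a few lines. The example below uses bilinear interpolation, one common interpolation method chosen here purely for illustration; the function name and the grayscale list-of-rows image format are assumptions, not part of any particular tool.

```python
def bilinear_resize(src, new_h, new_w):
    """Resize a 2D grayscale image (a list of rows) with bilinear interpolation."""
    old_h, old_w = len(src), len(src[0])
    out = []
    for y in range(new_h):
        # Map the target row back to a fractional row in the source image.
        sy = y * (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, old_h - 1)
        fy = sy - y0
        row = []
        for x in range(new_w):
            # Map the target column back to a fractional source column.
            sx = x * (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, old_w - 1)
            fx = sx - x0
            # Blend the four surrounding source pixels by distance.
            top = src[y0][x0] * (1 - fx) + src[y0][x1] * fx
            bottom = src[y1][x0] * (1 - fx) + src[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out

# A hard black-to-white edge, enlarged from 4 to 7 pixels wide.
edge = [[0, 0, 255, 255]]
print(bilinear_resize(edge, 1, 7))
```

After enlargement the sharp edge contains an in-between gray value, which is the blurring described above: the algorithm can only average existing pixels, not invent new detail.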
Interpolation algorithms that incorporate neural networks, such as the image enlargement service offered by https://bigjpg.com/, can now largely avoid this problem. The approach first builds a machine learning model and trains it on a large number of low-resolution images paired with their corresponding high-resolution versions. During training, the network repeatedly adjusts and optimizes its enlargement parameters for the lines, colors, dots, and other features of the enlarged images, eventually arriving at a well-tuned set of parameters. The resulting algorithm keeps the colors of the enlarged image well preserved and its edges free of visible blurring or jaggies, achieving a “lossless” enlargement of a low-resolution image into a clear high-resolution one (Figure 5).
Figure 5 Image enlarged with the help of neural network technology
This time, the user applied a technology derived from Gigapixel AI to “renovate” the video. It works much like https://bigjpg.com/, except that it can “losslessly” enlarge every frame of a video, dramatically increasing the resolution of the movie without producing significant blurring or jaggies (Figure 6).
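The idea of learning an enlargement rule from low/high-resolution pairs can be illustrated on a very small scale. The sketch below is not how https://bigjpg.com/ or any real product works: it assumes one-dimensional signals (think of a single row of pixels), synthetic sine-wave training data, a crude downscale that keeps every other sample, and a single linear layer with four weights trained by gradient descent. All names are hypothetical.

```python
import math

def make_pairs(num_signals=8, length=64):
    """Build (low-res neighborhood, true high-res sample) training pairs."""
    pairs = []
    for s in range(num_signals):
        freq = 0.1 + 0.05 * s               # each signal has its own frequency
        high = [math.sin(freq * t) for t in range(length)]
        low = high[::2]                     # "low resolution": keep every other sample
        for k in range(1, len(low) - 2):
            inputs = [low[k - 1], low[k], low[k + 1], low[k + 2]]
            target = high[2 * k + 1]        # the sample that was thrown away
            pairs.append((inputs, target))
    return pairs

def predict(weights, inputs):
    """One linear layer: a weighted sum of the four neighboring samples."""
    return sum(w * x for w, x in zip(weights, inputs))

def mean_squared_error(weights, pairs):
    return sum((predict(weights, x) - y) ** 2 for x, y in pairs) / len(pairs)

def train(pairs, steps=200, lr=0.1):
    """Full-batch gradient descent, starting from plain linear interpolation."""
    weights = [0.0, 0.5, 0.5, 0.0]
    for _ in range(steps):
        grad = [0.0] * 4
        for x, y in pairs:
            err = predict(weights, x) - y
            for i in range(4):
                grad[i] += 2 * err * x[i] / len(pairs)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights

pairs = make_pairs()
baseline = mean_squared_error([0.0, 0.5, 0.5, 0.0], pairs)  # linear interpolation
learned = train(pairs)
trained = mean_squared_error(learned, pairs)
print("linear interpolation error:", baseline)
print("learned filter error:      ", trained)
```

Because training starts from the weights of ordinary linear interpolation and only moves in the direction that reduces the error, the learned filter ends up fitting the training pairs better than the fixed rule. Real super-resolution networks follow the same principle with far more parameters and real photographs.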
Figure 6: The effect of enlarging the resolution of the movie
The next technique is frame interpolation with neural networks. It likewise builds a network model, in this case one that can sense how objects in the video move and accelerate. Trained on a large amount of data, the model estimates the trajectory of each object and generates intermediate frames along it, raising the frame rate and making playback smoother. Take a rugby ball flying through the air: if the video was shot at a low frame rate, the ball’s parabolic path is not clearly visible during playback. A neural network can now estimate the ball’s actual path and add new frames between the original frames along that parabola, so that the video plays more smoothly at the higher frame count, without any sense of discontinuity (Figure 7).
Figure 7: Illustration of frame insertion technique
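The ball example can be made concrete with a deliberately simplified sketch. Instead of a trained network, it fits a parabola through the ball’s position in three consecutive original frames and samples that curve at the higher frame rate. This classical curve fit only stands in for the learned motion model; real neural interpolators estimate motion for every pixel and synthesize complete frames. The function names and the ball positions are hypothetical.

```python
def parabola_through(samples):
    """Return a function of time for the parabola through three (t, value) samples."""
    (t0, v0), (t1, v1), (t2, v2) = samples

    def value_at(t):
        # Lagrange form of the quadratic passing through the three samples.
        return (v0 * (t - t1) * (t - t2) / ((t0 - t1) * (t0 - t2))
                + v1 * (t - t0) * (t - t2) / ((t1 - t0) * (t1 - t2))
                + v2 * (t - t0) * (t - t1) / ((t2 - t0) * (t2 - t1)))

    return value_at

def interpolate_track(positions, source_fps, target_fps):
    """Insert intermediate ball positions between three original frames.

    positions: three (x, y) ball centers from consecutive original frames.
    Returns a list of (time_in_seconds, x, y) at the target frame rate.
    Assumes target_fps is a whole multiple of source_fps.
    """
    factor = target_fps // source_fps
    times = [i / source_fps for i in range(3)]
    x_at = parabola_through(list(zip(times, (p[0] for p in positions))))
    y_at = parabola_through(list(zip(times, (p[1] for p in positions))))
    track = []
    for i in range(2 * factor + 1):
        t = i / target_fps
        track.append((t, x_at(t), y_at(t)))
    return track

# Hypothetical ball centers (in pixels) seen in three frames of a 20fps clip.
original = [(0.0, 0.0), (30.0, 50.0), (60.0, 0.0)]
for t, x, y in interpolate_track(original, source_fps=20, target_fps=60):
    print(f"t={t:.4f}s  x={x:6.2f}  y={y:6.2f}")
```

The three original positions become seven, and the inserted ones lie on the curved path rather than on straight lines between the originals, which is what keeps the motion looking natural.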
This time, the user estimated the motion of the train in the original film with a neural network, raised the frame rate of the old movie from 20fps to 60fps through frame interpolation, and combined this with the resolution enhancement described above to complete the transformation. The smoothness and clarity of the result are almost comparable to video shot on today’s mainstream smartphones; the iPhone 11, for example, records at 4K/60fps at most.
More applications of neural networks in picture/video processing
As described above, neural networks can raise the resolution and frame rate of pictures and videos. These techniques are useful in many areas of everyday life. For example, they can sharpen old photos: pictures in a drawer that were taken with a point-and-shoot camera, or blurry black-and-white photos that have been kept for many years, can be scanned into a computer and processed into clearer digital memories.
Figure 8 Screenshot of slow-motion video shown by Huawei Mate 30 Pro
Neural networks can of course also be used to process video. For example, Huawei’s Mate 30 Pro, released in 2019, can interpolate 960fps footage to generate 7680fps slow-motion video; the official showcase video captures the whole process of a drop of water falling from a height into a cup and splashing out (Figure 8). This is done with the help of the phone’s built-in neural network technology.
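The frame rates quoted in this article imply how much of the final video has to be synthesized. The small sketch below does that arithmetic; the function names are hypothetical, and it assumes the target frame rate is a whole multiple of the source frame rate.

```python
def frames_between(source_fps, target_fps):
    """Number of new frames needed between each pair of original frames."""
    if target_fps % source_fps != 0:
        raise ValueError("target_fps must be a whole multiple of source_fps")
    return target_fps // source_fps - 1

def synthesized_share(source_fps, target_fps):
    """Fraction of frames in the output that were generated rather than shot."""
    return 1 - source_fps / target_fps

for src, dst in [(20, 60), (960, 7680)]:
    print(f"{src}fps -> {dst}fps: {frames_between(src, dst)} new frames per gap, "
          f"{synthesized_share(src, dst):.1%} of the output is synthesized")
```

For the old movie, two out of every three frames in the 60fps version are generated; for the slow-motion example, seven out of every eight are.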