The Hero Behind the AI Self-Portrait – A Detailed Explanation of GAN

In the article “AI Self-Portrait Reveals the Tip of the Intelligent Creative Iceberg”, CFan explained that AI can use GAN technology to generate pictures realistic enough to pass for the real thing. So how does GAN do this? Let’s uncover its secrets today.

Lifting the Fog – Understanding GAN

GAN stands for Generative Adversarial Network, a type of deep learning model. For a machine to become artificially intelligent, a great deal of learning is essential. Most of today’s AI follows the “big data + deep learning” approach, and for a machine to learn deeply, a model is an essential ingredient.


Figure 1 GAN illustration

N stands for Networks, that is, deep neural networks. Through them, AI can learn from data and on its own, and so master many advanced skills.

A stands for Adversarial. Through deep learning the AI becomes a “master”, and this “master” then plays against itself inside the model system, improving its skills through the contest.

G stands for Generative (model). Through neural network learning and self-play, a generative model is produced, and by continually improving the model and the algorithm, the AI eventually reaches a remarkable level of intelligence.

GAN’s Working Principle Revealed

The description above gives us a rough idea of what GAN is made of. But when AI is actually running, how does GAN carry out deep learning and give AI such a high level of intelligence?

As mentioned above, GAN is made up of three parts: neural networks, the adversarial process, and the generative model. The core is the adversarial part. The adversarial model, also known as the GAN framework, consists of a generative model G and a discriminative model D. At the start of training, the system feeds a randomly drawn batch of real sample data (X) to the discriminative model D. The goal of D is to recognize real samples as accurately as possible, outputting “true”, or 1, when it judges a sample to be real. At the same time, a sample of random noise (Z) is fed to the generative model G, which processes it into a generated sample and passes that to D for judgment as well. D therefore has two jobs: to recognize real data as real, and to recognize the samples generated from noise as fake, outputting “false”, or 0, for them. Both sides keep optimizing themselves during training until they reach a balance, at which point the fake samples are completely indistinguishable from the real ones (Figure 2).
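To make the labels 1 and 0 concrete, here is a minimal Python sketch of how D and G could be scored in a single round. Everything in it is an assumption made for illustration and is not taken from the original article: the one-dimensional data, the linear generator, the one-layer discriminator, the parameter values, and the use of binary cross-entropy as the score. Real GANs use deep neural networks on both sides.

```python
import math
import random

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def generator(z, a, b):
    """G: turns a noise value z into a generated (fake) sample."""
    return a * z + b

def discriminator(x, w, c):
    """D: estimated probability that sample x is real (near 1 = real, near 0 = fake)."""
    return sigmoid(w * x + c)

def bce(prob, label):
    """Binary cross-entropy of one prediction; label is 1 (real) or 0 (fake)."""
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(prob + eps) + (1 - label) * math.log(1.0 - prob + eps))

def mean(values):
    return sum(values) / len(values)

rng = random.Random(0)
# Hypothetical toy data: real samples X cluster around 4, noise Z around 0.
real_batch = [rng.gauss(4.0, 0.5) for _ in range(8)]
noise_batch = [rng.gauss(0.0, 1.0) for _ in range(8)]
fake_batch = [generator(z, a=1.0, b=0.0) for z in noise_batch]

w, c = 1.0, -2.0  # hypothetical discriminator parameters

# D is scored on two jobs: call real samples 1 and generated samples 0.
d_loss = (mean([bce(discriminator(x, w, c), 1) for x in real_batch])
          + mean([bce(discriminator(x, w, c), 0) for x in fake_batch]))

# G is scored on how well it fools D: it wants D to call its samples 1.
g_loss = mean([bce(discriminator(x, w, c), 1) for x in fake_batch])

# At the balance point D can only guess, so it outputs 0.5 for every sample.
balanced_d_loss = bce(0.5, 1) + bce(0.5, 0)

print("D loss:", d_loss)
print("G loss:", g_loss)
print("D loss at balance:", balanced_d_loss)
```

A lower `d_loss` means D is judging better, and a lower `g_loss` means G is fooling D more often. At the balance point, D's loss in this formulation equals 2·ln 2, which is about 1.386.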


Figure 2 Illustration of GAN framework

Of course, technical terms are always obscure, so let’s use soccer as an analogy. In the GAN framework, the generative model G is like a player who dives (a “fake fall”): his goal is to fool the referee by every means available, so that his attacking or defending moves appear to be within the rules. A fake fall in front of the goal, for example, is made to look like a real foul. The discriminative model D is like the referee, whose goal is to pick out the violations hidden among the legal moves in the player’s game. In this constant contest, players get better and better at “faking” and referees get better and better at spotting “fakes”. As both improve, eventually no one can tell which move is a “fake”, and at that point the AI has reached the level we wanted (Figure 3).
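The back-and-forth between player and referee can be sketched as an alternating training loop. The toy below is only an illustration under stated assumptions: the one-dimensional "real" data, the linear generator, the one-layer discriminator, the hand-derived gradients, and all parameter values are hypothetical and are not the article's own method. It makes no claim about convergence; simple GANs like this one can keep oscillating instead of settling.

```python
import math
import random

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def train_toy_gan(steps=200, batch=16, lr=0.05, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator ("player"): G(z) = a*z + b
    w, c = 0.1, 0.0  # discriminator ("referee"): D(x) = sigmoid(w*x + c)
    history = []
    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]   # hypothetical real data
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Referee's turn: gradient step pushing D(real) toward 1, D(fake) toward 0.
        gw = gc = 0.0
        for x in real:
            d = sigmoid(w * x + c)
            gw += (d - 1.0) * x  # gradient of -log D(x)
            gc += (d - 1.0)
        for x in fake:
            d = sigmoid(w * x + c)
            gw += d * x          # gradient of -log(1 - D(x))
            gc += d
        w -= lr * gw / batch
        c -= lr * gc / batch

        # Player's turn: gradient step pushing D(G(z)) toward 1.
        ga = gb = 0.0
        for z in noise:
            d = sigmoid(w * (a * z + b) + c)
            ga += (d - 1.0) * w * z  # gradient of -log D(G(z)) w.r.t. a
            gb += (d - 1.0) * w      # ... and w.r.t. b
        a -= lr * ga / batch
        b -= lr * gb / batch

        # Record the referee's average verdict on real and on generated samples.
        d_real = sum(sigmoid(w * x + c) for x in real) / batch
        d_fake = sum(sigmoid(w * (a * z + b) + c) for z in noise) / batch
        history.append((d_real, d_fake))
    return (a, b, w, c), history

params, history = train_toy_gan()
print("final (a, b, w, c):", params)
print("last D(real), D(fake):", history[-1])
```

Each pass through the loop is one round of the contest: the referee trains first on the current real and fake samples, then the player adjusts to the referee's updated judgment. Watching how `D(real)` and `D(fake)` move over the rounds shows the tug-of-war the analogy describes.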


Figure 3 Metaphorical illustration

In practical applications, AI is of course not meant for cheating like this; the point of the analogy is its powerful capacity for self-correction and learning. With the GAN framework, AI can gain abilities that are hard for humans to imagine. One example is the self-portrait introduced in the last issue, where AI painted a picture that humans would find hard to express. There are many interesting applications in similar scenarios. In advertising design, for instance, artists invest a lot of time in designing glyphs that are visually compatible with the shapes and textures of the other elements. But such hand-designed glyphs are tightly bound to the scene at hand. Even for the same image, artists have to repeat the work to produce glyphs with the same effect, because with current technology it is difficult to migrate them to other similar projects.

Now, by learning with GAN, AI can readily grasp the artist’s intention. It first learns the picture’s environment, lighting, scene, and so on, then studies how the font appears in the picture, breaking the font down into individual elements such as form, color, and technique and learning each one precisely. It can then accurately reproduce glyphs that fit the poster scene very well, so that glyphs which were originally difficult to migrate can be transferred to any poster (Figure 4).


Figure 4 Migration of glyphs between different posters

Breaking Through the Limits – GAN Brings Us More

As the introduction above shows, by learning within the GAN framework AI can master, and even surpass, many skills that humans have mastered. These AI techniques can bring a lot of convenience to our lives.

Take GAN’s exceptional ability to learn fonts: it can be used to generate a wide variety of fonts, and also to identify personal fonts and handwriting. In the future, if some old scoundrel signs a document and then denies that the signature is his, we may be able to verify it easily without the help of a professional appraisal agency (Figure 5).


Figure 5 GAN recognizes and generates various fonts

GAN has many other applications, of course. One is OLDIFY, an application based on Age-cGAN that can synthesize how you would look at a later or earlier age. From a photo of your younger self, you can see in advance what you will look like decades from now, or look back at yourself a decade or so ago. Isn’t that interesting (Figure 6)?


Figure 6 Age synthesis of OLDIFY

The GAN framework also has very practical applications in many fields. In medicine, for example, by learning how complex diseases are diagnosed, GAN can help doctors make a comprehensive and scientific diagnosis quickly when similar cases come up in the future. The police can use GAN to restore segmentation maps from surveillance video into realistic photos and to turn black-and-white images into color ones, helping them identify suspects accurately so that suspects have nowhere to hide. We look forward to GAN bringing us even more applications!
