GANs, the easiest concept in the world.
There may be formatting errors from transcribing this onto the blog
(writing certain formulas on the blog is still a bit beyond me). In any case, the content itself is correct, so just read on and take it in.
Generative Adversarial Networks (GANs) are an advanced framework for learning data distributions through adversarial training, introduced by Ian Goodfellow et al. in 2014. The GAN architecture consists of two neural networks, the Generator (G) and the Discriminator (D), which are trained together in a minimax optimization setting. This essay elucidates the mathematical underpinnings of GANs and culminates in the derivation of the Nash equilibrium, which represents the optimal state of the GAN game.
The two networks play the following minimax game over a value function $V(D, G)$:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$$
The Discriminator’s goal is to correctly classify real and fake samples by maximizing $V(D, G)$, while the Generator’s goal is to minimize $V(D, G)$ by generating data indistinguishable from real data.
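To make the adversarial loop concrete, here is a minimal PyTorch sketch of this two-player training scheme. Everything in it is an illustrative assumption rather than part of the essay: the target distribution $p_{\text{data}}$ is a 1-D Gaussian, $G$ and $D$ are tiny MLPs, and the layer sizes and learning rates are arbitrary choices.

```python
# Minimal GAN training sketch (toy, assumed setup): learn a 1-D Gaussian p_data.
import torch
import torch.nn as nn

# Generator G: noise z ~ p_z -> fake sample; Discriminator D: sample -> P(real).
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0   # samples from p_data = N(3, 0.5^2)
    z = torch.randn(64, 8)                  # noise from p_z
    fake = G(z)

    # Discriminator step: maximize log D(x) + log(1 - D(G(z)))
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # Generator step (non-saturating form): maximize log D(G(z))
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()
```

The Generator step here already uses the non-saturating objective of maximizing $\log D(G(z))$, a reformulation the essay returns to below.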
To understand the dynamics of GANs, let us first analyze the behavior of the Discriminator when the Generator G is fixed. The Discriminator seeks to maximize the objective:
$$V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$$
The optimal Discriminator $D^*(x)$ can be derived by taking the functional derivative of $V(D, G)$ with respect to $D$ and solving for $D$, which gives:
$$D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}$$
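To see where this expression comes from (a standard step, spelled out here for completeness), rewrite the second expectation over the distribution $p_g$ induced by $G$:

$$V(D, G) = \int_x p_{\text{data}}(x)\log D(x)\,dx + \int_z p_z(z)\log\bigl(1 - D(G(z))\bigr)\,dz = \int_x \Bigl[p_{\text{data}}(x)\log D(x) + p_g(x)\log\bigl(1 - D(x)\bigr)\Bigr]\,dx$$

For each fixed $x$, writing $a = p_{\text{data}}(x)$ and $b = p_g(x)$, the function $y \mapsto a\log y + b\log(1 - y)$ attains its maximum on $(0, 1)$ at $y = \frac{a}{a + b}$, which is exactly $D^*(x)$ above.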
With the Discriminator’s optimal behavior defined, the Generator’s objective becomes minimizing the following function:
$$\mathbb{E}_{z \sim p_z}[\log(1 - D^*(G(z)))]$$
Substituting $D^*(x)$ and writing $x = G(z)$, so that $x \sim p_g$:

$$\mathbb{E}_{x \sim p_g}\!\left[\log\!\left(1 - \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}\right)\right]$$

Since $1 - \frac{a}{a+b} = \frac{b}{a+b}$, this simplifies to:

$$\mathbb{E}_{x \sim p_g}\!\left[\log \frac{p_g(x)}{p_{\text{data}}(x) + p_g(x)}\right]$$
Instead of directly minimizing this function, a reformulation is often used to stabilize training: the Generator instead maximizes $\mathbb{E}_{z \sim p_z}[\log D(G(z))]$. This alternative objective encourages the Generator to increase the probability of the Discriminator being fooled.
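A short calculation, added here for illustration and assuming the usual parameterization $D(G(z)) = \sigma(a)$ with a sigmoid over a logit $a$, shows why the reformulation helps:

$$\frac{\partial}{\partial a}\log\bigl(1 - \sigma(a)\bigr) = -\sigma(a), \qquad \frac{\partial}{\partial a}\log \sigma(a) = 1 - \sigma(a)$$

Early in training the Discriminator rejects fakes easily, so $\sigma(a) \approx 0$: the original Generator loss then yields gradients close to zero (it saturates), while the reformulated loss yields gradients close to one, giving the Generator a usable learning signal.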
With an optimal Discriminator, GAN training implicitly minimizes the Jensen-Shannon Divergence (JSD) between the real data distribution $p_{\text{data}}$ and the generated data distribution $p_g$:
$$\mathrm{JS}(p_{\text{data}} \,\|\, p_g) = \tfrac{1}{2}\,\mathrm{KL}(p_{\text{data}} \,\|\, M) + \tfrac{1}{2}\,\mathrm{KL}(p_g \,\|\, M),$$
where $M = \tfrac{1}{2}(p_{\text{data}} + p_g)$.
The JSD measures the similarity between the two distributions, ensuring that as G improves, pg→pdata, and the divergence approaches zero. The connection to JSD emerges from the structure of the GAN’s loss function, where the Discriminator learns to distinguish between real and fake samples, effectively estimating the divergence between the distributions.
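Concretely, substituting the optimal Discriminator $D^*$ back into the value function gives the Generator's effective cost $C(G) = \max_D V(D, G)$; the following identity (the standard derivation from the original GAN paper, reproduced here to fill in the step) makes the JSD connection explicit:

$$C(G) = \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}\right] + \mathbb{E}_{x \sim p_g}\!\left[\log \frac{p_g(x)}{p_{\text{data}}(x) + p_g(x)}\right] = -\log 4 + 2\,\mathrm{JS}(p_{\text{data}} \,\|\, p_g)$$

Since $\mathrm{JS} \ge 0$ with equality exactly when $p_g = p_{\text{data}}$, minimizing $C(G)$ is equivalent to driving the JSD to zero, and the minimum value of $C(G)$ is $-\log 4$.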
In the context of GANs, the Nash equilibrium is the point at which neither the Generator nor the Discriminator can improve its objective by unilaterally changing its strategy. Mathematically:
When $p_g(x) = p_{\text{data}}(x)$, the Discriminator cannot distinguish between real and fake samples.
At this point, $D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)} = \frac{1}{2}$ for all $x$, indicating that the Discriminator assigns equal probability to real and generated data.
The GAN objective at Nash equilibrium becomes:
$$V(D^*, G^*) = \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log \tfrac{1}{2}\right] + \mathbb{E}_{z \sim p_z}\!\left[\log \tfrac{1}{2}\right] = -2\log 2 = -\log 4$$
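As a quick numerical sanity check of this value (a small, self-contained script added here; the discrete toy distribution is purely illustrative), one can evaluate the objective when $p_g = p_{\text{data}}$:

```python
import numpy as np

# Toy discrete distribution; at equilibrium the Generator matches it exactly.
p_data = np.array([0.1, 0.2, 0.3, 0.4])
p_g = p_data.copy()

# Optimal Discriminator output on every point of the support.
d_star = p_data / (p_data + p_g)     # = 0.5 everywhere

# V(D*, G*) = E_pdata[log D*] + E_pg[log(1 - D*)]
value = np.sum(p_data * np.log(d_star)) + np.sum(p_g * np.log(1.0 - d_star))

print(d_star)                        # [0.5 0.5 0.5 0.5]
print(value, -np.log(4))             # both approximately -1.3863
```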
This equilibrium signifies that the Generator has successfully learned the true data distribution, and the Discriminator’s predictions are maximally uncertain.
Conclusion
The Nash equilibrium in GANs arises when the Generator produces a data distribution pg that matches the real data distribution pdata. At this point, the Discriminator’s ability to distinguish real from fake samples vanishes, resulting in equal probabilities for both. This state encapsulates the essence of GANs: adversarial training drives the Generator to perfectly model the data distribution, while the Discriminator provides feedback until its task becomes redundant. The interplay between G and D forms a dynamic system that, under ideal conditions, converges to this elegant equilibrium.