**FengHZ‘s Blog首发原创**

Based on the latent variable assumption, the essential purpose of GAN is to map the distribution of latent variable $z\sim p(z)$ into the data distribution $x\sim p_{data}$. GAN utilizes the equilibrium theory 2-players game to find the optimal map. There have already been many articles introducing the structure of GAN with comparing GAN to the confrontation between counterfeiters and discriminators. However, understanding the basic mathemathical form of GAN is very important which can tell us why the equilibrium point can achieve the complex real data distribution $x\sim p_{data}$ without making any assumptions on the analytical form of $p_{data}$.

Given the discriminator $D$ and the generator $G$, the nash equilibrium point of the 2-players game

\[\min_{G}\max_{D} E_{x\sim p_{data}(x)} \log D(x) + E_{z\sim p(z)} \log (1-D(G(z))) \tag{1}\]-
**proposition**The global optimal $(D,G)$of $(1)$ satisfy \(D(x) = \frac{p_{data}(x)}{p_{data}(x)+p_{g}(x)} \tag{2}\)\(z\sim p(z), G(z) \sim p_{data}(x)\tag{3}\) Here we use $p_{g}$ to indicate the distribution of $G(z),z\sim p(z)$

\[\max_{D} E_{x\sim p_{data}(x)} \log D(x) + E_{x\sim p_{g}(x)} \log (1-D(x)) \\ =\max_{D} \int_{x} p_{data}(x) \log D(x)+p_{g}(x)\log (1-D(x)) dx\\ =^{discretization} \sum_{i=1}^{N} p_{data}(x_i) \log D(x_i)+p_{g}(x_i)\log (1-D(x_i)) \Delta_{x} \\\]*proof for $(2)$*In the discretization form, we make $D$ achieve the maximum for each point $x_i$ and then get the global optimal, then we can get

\[D(x) = \frac{p_{data}(x)}{p_{data}(x)+p_{g}(x)}\]We substitude $(2)$ into the equation $(1)$ and can get

\[\min_{p_{g}} E_{x\sim p_{data}(x)} \log D(x) + E_{x\sim p_{g}(x)} \log (1-D(x))\\ =\min_{p_{g}}\int_{x} p_{data}(x)\log \frac{p_{data}(x)}{p_{data}(x)+p_{g}(x)} + p_{g}(x)\log \frac{p_{g}(x)}{p_{data}(x)+p_{g}(x)}dx\\ =\min _{p_{g}} J[p_{g}] \tag{4}\]We use the variational method to solve $(4)$ with the constraint $\int_{x} p_{g}(x) dx =1$, and following are the

\[L[p_{g}] = J[p_{g}] + \lambda_{1}[\int_{x} p_{g}(x) dx -1] \tag{5}\]*Lagrange*formUsing the

\[\log \frac{p_{g}}{p_{data}+p_{g}} + \lambda_{1}=0\]*basic formulation of variational method*, we havewhich indicates

\[p_{g}=\frac{p}{e^{\lambda_1}-1}\]combined with $\int_{x} p_{g}(x) dx =1$, we have

\[p_g =p_{data}\]which is the essential part of GAN.