1.2 Moment Generating and Characteristic Functions
Definition 1.16 Let \(X\) be a RV. 1. The moment generating function (MGF), or Laplace transform, of \(X\) is \(\varphi_X: \mathbb{R}\to \mathbb{R}\) defined by \[ \varphi_X (t) = \mathbb{E}\left( e^{t X} \right), \] where \(t\) varies over the real numbers.
2. The characteristic function, or Fourier transform, of \(X\) is \(\phi_X: \mathbb{R}\to \mathbb{C}\) defined by \[\phi_X(\theta) = \mathbb{E}\left(e^{i\theta X}\right) .\]
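Both transforms are straightforward to approximate numerically. Below is a minimal Monte Carlo sketch (assuming NumPy is available; the exponential sample is an arbitrary illustration, not part of the definition) that estimates \(\varphi_X(t)\) and \(\phi_X(\theta)\) by averaging \(e^{tX}\) and \(e^{i\theta X}\) over simulated draws.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)   # draws of X; any RV would do

def mgf_estimate(t):
    # Monte Carlo estimate of E[e^{tX}]
    return np.mean(np.exp(t * x))

def cf_estimate(theta):
    # Monte Carlo estimate of E[e^{i*theta*X}] (a complex number)
    return np.mean(np.exp(1j * theta * x))

print(mgf_estimate(0.5))   # estimate of E[e^{tX}]; for exp(1) the true value is finite only for t < 1
print(cf_estimate(2.0))    # always well defined, with modulus <= 1
```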
Lemma 1.1 Let \(X\) be a RV.
1. If \(Y = aX + b\), then \[ \varphi_Y(t) = e^{bt} \varphi_{X}(at). \]
2. Whenever the derivatives exist, \[ \varphi_X^{(k)}(0) = \mathbb{E}(X^k). \]
3. If \(X_i\), \(i= 1, \dots, n\), are independent RVs and \(Y = \sum_i X_i\), then \[ \varphi_Y (t) = \prod_{i=1}^n \varphi_{X_i}(t). \]
4. \[| \phi_X (\theta) | \leq 1 .\]
5. Let \(\overline{z}\) denote the complex conjugate of \(z\). Then \[ \phi_{-X} (\theta) = \overline{\phi_X (\theta)} .\]
6. If \(Y = aX + b\), then \[\phi_Y (\theta) = e^{i b \theta} \phi_X(a\theta). \]
Exercise 1.12 Prove the above lemma.
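Before proving the lemma, here is a quick numerical sanity check of the shift/scale property (item 1) and the independence property (item 3). This is only a Monte Carlo sketch with NumPy (the normal and exponential samples are arbitrary choices), not a proof.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)          # X ~ N(0,1), chosen just for illustration
y = 2 * x + 3                   # Y = aX + b with a = 2, b = 3
t = 0.4

mgf = lambda z, s: np.mean(np.exp(s * z))   # Monte Carlo estimate of E[e^{sZ}]

# Item 1: phi_Y(t) should match e^{bt} * phi_X(at)
print(mgf(y, t), np.exp(3 * t) * mgf(x, 2 * t))

# Item 3: for independent X1, X2, phi_{X1 + X2}(t) = phi_{X1}(t) * phi_{X2}(t)
x1, x2 = rng.normal(size=n), rng.exponential(size=n)
print(mgf(x1 + x2, t), mgf(x1, t) * mgf(x2, t))
```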
Exercise 1.13 Let \(X \sim \exp(1)\), i.e., \[ f_X(x) = \begin{cases} e^{-x} \,, & x \geq 0 \,, \\ 0 \,, & x < 0 \,. \end{cases}\] Compute \(\varphi_X\).
Recall that \(X \stackrel{d}{=}Y\) for two RVs means that \(F_X (x) = F_Y(x)\) for all \(x\). Two common ways to characterize equality in distribution are via moment generating functions and via characteristic functions.
These ideas did not originate in probability but in engineering and mechanics, where Laplace and Fourier transforms have been well understood since the 18th and 19th centuries.
Exercise 1.14 In general, differentiation does not commute with integration, that is, \[ \frac{d}{dt} \int \not= \int \frac{d}{dt}. \] However, assume that this interchange is valid for the moment generating function \(\varphi_X\). Show that \[ \varphi_X^{(n)}(0) = \mathbb{E}( X^n),\] where \(f^{(n)}\) denotes the \(n\)-th derivative of \(f\). The quantity \(\mathbb{E}(X^n)\) is called the \(n\)-th moment of \(X\), and it carries information about the tail behavior of \(f_X\).
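The identity can be checked symbolically for a small finite RV. The sketch below (assuming SymPy is available; the pmf is a hypothetical example) builds \(\varphi_X(t) = \sum_i e^{ti}\,\mathbb{P}(X=i)\) explicitly, differentiates, and evaluates at \(t = 0\).

```python
import sympy as sp

t = sp.symbols('t')
# A toy finite RV: P(X=1) = 1/2, P(X=2) = 1/3, P(X=3) = 1/6
pmf = {1: sp.Rational(1, 2), 2: sp.Rational(1, 3), 3: sp.Rational(1, 6)}

phi = sum(p * sp.exp(t * i) for i, p in pmf.items())   # moment generating function

for n in range(1, 4):
    derivative_at_0 = sp.diff(phi, t, n).subs(t, 0)
    moment = sum(p * i**n for i, p in pmf.items())      # E[X^n] computed directly
    print(n, sp.simplify(derivative_at_0 - moment))     # prints 0 for each n
```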
1.2.1 Moment Generating Functions
Theorem 1.5 Let \(X\) and \(Y\) be RVs. If \(\varphi_X(t) = \varphi_Y(t)\) for all \(t\) in an interval around 0, then \[ X \stackrel{d}{=}Y \,.\]
The full proof of this is beyond this class (and could be a great topic for a project). However, we will prove it for finite RVs.
Proposition 1.1 (Finite RV case) Let \(X,Y: \Omega \to \{1,2, \dots, N\}\) be RVs. If \(\varphi_X(t) = \varphi_Y(t)\) for all \(t\) in an interval \((-\epsilon , \epsilon)\), then \[ X \stackrel{d}{=}Y \,.\]
Proof. We have that \[ \varphi_X (t) = \mathbb{E}( e^{tX}) = \sum_{i= 1}^N e^{ti}\,\mathbb{P}( X = i) \] and \[ \varphi_Y (t) = \mathbb{E}( e^{tY}) = \sum_{i= 1}^N e^{ti}\,\mathbb{P}( Y = i). \] Therefore, \[ 0 = \varphi_X(t) - \varphi_Y(t) = \sum_{i=1}^N (e^t)^i \left( \mathbb{P}(X = i) - \mathbb{P}(Y = i) \right) \] for every \(t \in (-\epsilon , \epsilon)\). The right-hand side is a polynomial in the variable \(s = e^t\) that vanishes for every \(s\) in the interval \((e^{-\epsilon}, e^{\epsilon})\); a polynomial with infinitely many roots is identically zero, so all of its coefficients vanish. Therefore, \[ \mathbb{P}( X = i) = \mathbb{P}(Y = i) \] for \(i = 1, \dots, N\).
Note that if the above summation is infinite, then we cannot conclude that \(X\) and \(Y\) have the same distribution as easily as we just did. More work has to be done to show this.
A note of caution: the assumption that \(\varphi_X = \varphi_Y\) in an interval around \(0\) is crucial in general.
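Returning to Proposition 1.1: for a finite RV, the values of \(\varphi_X\) near \(0\) pin down the pmf. The sketch below (NumPy, with a hypothetical pmf chosen for illustration) makes this concrete by sampling \(\varphi_X\) at a few points \(t_j\) and solving the linear system \(\sum_i (e^{t_j})^i p_i = \varphi_X(t_j)\) for the probabilities.

```python
import numpy as np

# Hypothetical pmf on {1, ..., 4}
p_true = np.array([0.1, 0.4, 0.3, 0.2])
values = np.arange(1, 5)

def mgf(t):
    # phi_X(t) = sum_i e^{t i} P(X = i)
    return np.sum(np.exp(t * values) * p_true)

# Sample the MGF at a few points near 0 and solve for the pmf
ts = np.linspace(-0.2, 0.2, 4)
A = np.exp(np.outer(ts, values))          # A[j, i-1] = (e^{t_j})^i
b = np.array([mgf(t) for t in ts])
p_recovered = np.linalg.solve(A, b)

print(np.round(p_recovered, 6))           # matches p_true
```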
An interesting observation arises: for analytic functions we have the Taylor series \[ f(x) = \sum_{n=0}^\infty \frac{f^{(n)}(0)}{n!} x^n. \] Exercise 1.14 tells you that the \(n\)-th derivative of a moment generating function at \(0\) is the \(n\)-th moment of the RV.
Question: Is knowing the moments of \(X\) enough to determine its probability distribution?
The answer is NO; a classical counterexample involves the lognormal distribution. One can take a look at the discussion about this problem here: https://mathoverflow.net/questions/3525/when-are-probability-distributions-completely-determined-by-their-moments.
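As a numerical illustration of this failure (using the classical lognormal construction, often attributed to Heyde, and SciPy's `quad` for the integrals): the density \(g(x) = f(x)\bigl(1 + \sin(2\pi \ln x)\bigr)\), where \(f\) is the standard lognormal density, is a genuine density different from \(f\), yet the sketch below suggests that its first few moments agree with those of \(f\).

```python
import numpy as np
from scipy.integrate import quad

# Work on the log scale: with y = log x, the n-th moment of the standard
# lognormal is  E[X^n] = int e^{n y} phi(y) dy,  phi = standard normal density.
phi = lambda y: np.exp(-y**2 / 2) / np.sqrt(2 * np.pi)

for n in range(4):
    m_f, _ = quad(lambda y: np.exp(n * y) * phi(y), -np.inf, np.inf)
    # same moment for the perturbed density f(x) * (1 + sin(2*pi*log x))
    m_g, _ = quad(lambda y: np.exp(n * y) * phi(y) * (1 + np.sin(2 * np.pi * y)),
                  -np.inf, np.inf)
    print(n, round(m_f, 6), round(m_g, 6))   # the two columns agree
```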
However, things are nice for finite RVs.
Proposition 1.2 Let \(X,Y: \Omega \to \{1,2, \dots, N\}\) be RVs. Suppose that \[ \mathbb{E}(X^n) = \mathbb{E}(Y^n) < \infty\] for every \(n\in \mathbb{N}\). Then \[ X \stackrel{d}{=}Y .\]
Proof. Consider \[ \varphi_X(t) = \mathbb{E}(e^{tX}) = \sum_{i = 1}^N e^{ti}\, \mathbb{P}( X = i). \] This is a finite sum of analytic functions and is, therefore, analytic. Thus, \(\varphi_X\) can be expanded into a Taylor series, i.e., \[ \varphi_X(t) = \sum_{n=0}^\infty \frac{\varphi_X^{(n)}(0)}{n!} t^n = \sum_{n=0}^\infty \frac{\mathbb{E}(X^n)}{n!} t^n.\] This means that the moments of \(X\) determine its moment generating function (which may not be true in general).
A similar argument can be made for \(\varphi_Y\), and as the coefficients of the two Taylor series are the same (being the moments of \(X\) and \(Y\)), we conclude that \[\varphi_X = \varphi_Y.\] Therefore, by Theorem 1.5, \[ X \stackrel{d}{=}Y, \] as desired.
1.2.2 Characteristic Functions
The idea is similar to that of moment generating functions, but characteristic functions are easier to work with: since \(|e^{i\theta X}| = 1\), the characteristic function always exists, so we do not have to restrict ourselves to the special case of finite RVs.
Theorem 1.6 Let \(X\) and \(Y\) be RVs. If \(\phi_X(\theta) = \phi_Y(\theta)\) for all \(\theta \in \mathbb{R}\), then \[ X \stackrel{d}{=}Y \,.\]
In order to prove this theorem, we need the following important result, called the inversion formula for characteristic functions.
Theorem 1.7 (Inversion Formula) Let \(X:\Omega \to S\) be a RV (either continuous or discrete) and \(\phi_X\) be its characteristic function. Then, for \(a < b\), \[ \lim_{T \to \infty} \frac{1}{2\pi}\int_{-T}^T \frac{e^{-i\theta a} - e^{-i\theta b}}{i\theta} \phi_X(\theta) \, d\theta = \mathbb{P}( a < X < b) + \frac{1}{2} \left( \mathbb{P}(X = a) + \mathbb{P}(X = b) \right). \]
Proof. We give the proof in the continuous case; the discrete case is similar. Using Fubini's theorem to interchange the order of integration, \[\begin{aligned} \frac{1}{2\pi} \int_{-T}^T \frac{ e^{ - i \theta a} - e^{- i \theta b}}{i\theta} \phi_X(\theta) \, d\theta & = \frac{1}{2\pi} \int_{-T}^T \frac{ e^{ - i \theta a} - e^{- i \theta b}}{i\theta} \int_{\mathbb{R}} e^{i\theta x} f_X(x) \, dx\, d\theta \\ & = \int_{\mathbb{R}}\frac{1}{\pi} \int_{-T}^T \frac{ e^{ i \theta (x - a)} - e^{i \theta (x- b)}}{ 2 i\theta} f_X(x) \, d\theta\, dx. \end{aligned}\]
Note that since \(\cos(\theta c)/\theta\) is odd in \(\theta\) and \(\sin(\theta c)/\theta\) is even in \(\theta\), and that \(e^{i\theta c} = \cos(\theta c) + i \sin(\theta c)\), we have \[ \frac{1}{2}\int_{-T}^T \frac{e^{i\theta c}}{i \theta} \, d\theta = \int_0^T \frac{\sin(\theta c)}{\theta} \, d\theta. \]
Therefore, \[ \begin{aligned} \frac{1}{2\pi} \int_{-T}^T \frac{ e^{ - i \theta a} - e^{- i \theta b}}{i\theta} \phi_X(\theta) \, d\theta &= \frac{1}{\pi} \int_{\mathbb{R}} \int_0^T \left( \frac{\sin((x - a)\theta)}{\theta} - \frac{\sin((x - b)\theta)}{\theta} \right) f_X(x) \, d\theta\, dx. \end{aligned} \] We now take the limit \(T\to\infty\) and use the fact that \[ \lim_{T \to \infty}\int_0^T \frac{\sin ((x-a)\theta)}{\theta}\, d\theta = \begin{cases} -\frac{\pi}{2} \,, & x < a \,,\\ \frac{\pi}{2} \,, & x > a \,, \\ 0 \,, & x = a \,; \end{cases}\] the interchange of the limit and the integral over \(x\) is justified by the bounded convergence theorem, since \(\int_0^T \frac{\sin(c\theta)}{\theta}\, d\theta\) is bounded uniformly in \(T\) and \(c\). Therefore, \[ \begin{aligned} \lim_{T\to \infty}\frac{1}{2\pi} \int_{-T}^T \frac{ e^{ - i \theta a} - e^{- i \theta b}}{i\theta} \phi_X(\theta) \, d\theta &= \frac{1}{2}\left(\int_{(a,\infty)} f_X(x) \, dx - \int_{ (-\infty, a)} f_X(x) \, dx \right)\\ & \quad - \frac{1}{2}\left(\int_{(b,\infty)} f_X(x) \, dx - \int_{ (-\infty, b)} f_X(x) \, dx \right) \\ &= \frac{1}{2}\bigl(\mathbb{P}(X>a) - \mathbb{P}(X< a)\bigr) - \frac{1}{2}\bigl(\mathbb{P}(X>b) - \mathbb{P}(X< b)\bigr) \\ &= \mathbb{P}( a < X < b) + \frac{1}{2} \left( \mathbb{P}(X = a) + \mathbb{P}(X = b) \right), \end{aligned} \] as desired.
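As a sanity check, the sketch below (NumPy/SciPy; the standard normal is used only because its characteristic function \(\phi_X(\theta) = e^{-\theta^2/2}\) is known in closed form) evaluates the truncated inversion integral for a large \(T\) and compares it with \(\mathbb{P}(a < X < b)\).

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

a, b, T = -1.0, 0.5, 50.0
phi = lambda theta: np.exp(-theta**2 / 2)      # characteristic function of N(0,1)

# Real part of (e^{-i theta a} - e^{-i theta b}) / (i theta) * phi(theta);
# the imaginary part is odd in theta and integrates to zero.
# np.sinc(z) = sin(pi z)/(pi z), so b * np.sinc(theta*b/pi) = sin(theta*b)/theta.
integrand = lambda th: (b * np.sinc(th * b / np.pi)
                        - a * np.sinc(th * a / np.pi)) * phi(th)

value, _ = quad(integrand, -T, T, limit=200)
print(value / (2 * np.pi))        # approximately P(a < X < b)
print(norm.cdf(b) - norm.cdf(a))  # exact value for comparison
```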
Exercise 1.15 Verify that \[\lim_{T \to \infty} \int_0^T \frac{\sin(x)}{x} \, dx = \frac{\pi}{2}.\] If you can’t, watch this: https://www.youtube.com/watch?v=Bq5TB6cZNng.
Another way is to use contour integration from complex analysis.
Exercise 1.16 Let \(X_1, \dots, X_n \sim \mathrm{Uniform}(0,1)\) be independent and \(Y_n = \max\{ X_1, \dots, X_n \}\). Find \(\mathbb{E}(Y_n)\).
Exercise 1.17 Let \(X:\Omega \to (0,\infty)\) be a continuous, positive RV. Suppose \(\mathbb{E}(X)\) exists. Show that \(\mathbb{E}(X) = \int_0^\infty \mathbb{P}(X > x) \, dx\). (Hint: Fubini. This is called the layer cake theorem.)
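Here is a quick numerical illustration of the layer-cake identity (a Monte Carlo sketch with NumPy; the Gamma distribution is just an arbitrary positive example), comparing the sample mean with a numerical integral of the empirical tail probability \(\mathbb{P}(X > x)\).

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.gamma(shape=3.0, scale=1.0, size=50_000)    # a positive RV; E[X] = 3

# Left-hand side: the sample mean
print(x.mean())

# Right-hand side: integrate the empirical tail P(X > t) over a grid (Riemann sum)
grid = np.linspace(0, x.max(), 2000)
dt = grid[1] - grid[0]
tail = np.array([(x > t).mean() for t in grid])
print(tail.sum() * dt)
```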
Exercise 1.18 The exponential distribution with parameter \(\lambda\) (denoted by \(\exp(\lambda)\)) is used to model waiting time (see https://en.wikipedia.org/wiki/Exponential_distribution). The probability density function of the exponential distribution is given by \[f(x) = \begin{cases} \lambda e^{-\lambda x} & x\geq 0 \\ 0 & x< 0 \end{cases}.\]
Find the moment-generating function of \(X \sim \exp(\lambda)\).
Use the moment-generating function to show that if \(X\) is exponentially distributed, then so is \(cX\) for any \(c > 0\).
Exercise 1.19 Let \(X \sim N(\mu_1, \sigma_1^2)\) and \(Y \sim N(\mu_2, \sigma_2^2)\) be independent. Use the moment generating function to show that \(Z = c_1 X + c_2 Y\) is again normally distributed. What are \(\mathbb{E}(Z)\) and \(\mathbb{V}(Z)\)?
Exercise 1.20 Find the moment-generating function of a Bernoulli RV, and use it to find the mean, variance, and third moment.
Exercise 1.21 Let \(X: \Omega \to S\) be a RV with \(S = \mathbb{N} = \{0, 1, 2, \dots\}\). The probability generating function (PGF) of \(X\) is defined to be \[ G(s) = \mathbb{E}(s^X) = \sum_{k=0}^\infty s^k \mathbb{P}(X = k). \]
Show that \[ \mathbb{P}( X = k) = \frac{1}{k!} \frac{d^k}{ds^k} G(s) \vert_{s=0} \]
Show that \[ \frac{dG}{ds} \vert_{s=1} = \mathbb{E}(X) \] and \[ \frac{d^2G}{ds^2} \vert_{s=1} = \mathbb{E}[X(X-1)]. \]
Express the probability-generating function in terms of the moment-generating function.
Find the probability-generating function of the Poisson distribution.
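The identities in this exercise are easy to check symbolically for a concrete pmf. The sketch below (SymPy, with a hypothetical finite pmf; for a finite-support RV the PGF is just a polynomial) recovers the probabilities and the mean from \(G\).

```python
import sympy as sp

s = sp.symbols('s')
# Hypothetical pmf on {0, 1, 2, 3}
pmf = {0: sp.Rational(1, 8), 1: sp.Rational(3, 8),
       2: sp.Rational(3, 8), 3: sp.Rational(1, 8)}

G = sum(p * s**k for k, p in pmf.items())          # probability generating function

# P(X = k) = G^{(k)}(0) / k!
for k in pmf:
    print(k, sp.diff(G, s, k).subs(s, 0) / sp.factorial(k))

# G'(1) = E[X] and G''(1) = E[X(X-1)]
print(sp.diff(G, s).subs(s, 1))        # the mean, 3/2 for this pmf
print(sp.diff(G, s, 2).subs(s, 1))     # E[X(X-1)]
```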