More moment conditions than parameters.
We have a function of the data and a parameter:
\(g(y, \theta )\)
A moment condition is that the expectation of such a function is \(0\).
\(m(\theta )=E[g(y, \theta )]=0\)
To do GMM, we estimate this expectation with the sample average:
\(\hat m(\theta )=\dfrac{1}{n}\sum_ig(y_i, \theta )\)
We define:
\(\Omega = E[g(y, \theta )g(y, \theta)^T]\)
\(G=E[\nabla_\theta g(y, \theta)]\)
And then minimise the norm:
\(||\hat m(\theta )||^2_W=\hat m(\theta )^TW\hat m(\theta )\)
Where \(W\) is a positive definite weighting matrix that defines the norm.
Choosing \(W=\Omega ^{-1}\) is most efficient. But we don’t know \(\Omega\), because it depends on \(\theta\).
We can estimate it if the observations are IID:
\(\hat W(\hat \theta )= (\dfrac{1}{n}\sum_i g(y_i, \hat \theta)g(y_i, \hat \theta)^T)^{-1}\)
To get a first \(\hat \theta\), estimate using \(\mathbf W=\mathbf I\).
This first-step estimate is consistent, but not efficient.
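A minimal sketch of this two-step procedure, assuming a linear model with one regressor and two instruments so that the minimiser has a closed form. The simulated data, the variable names and the helper `gmm_estimate` are illustrative assumptions, not part of these notes.

```python
import random

random.seed(1)

# Simulated linear model with one regressor and two instruments, so there are
# two moment conditions E[z (y - x*theta)] = 0 for one parameter (assumed setup).
n = 4000
theta_true = 1.5
data = []
for _ in range(n):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    u = random.gauss(0, 1)
    x = z1 + 0.5 * z2 + u + random.gauss(0, 1)
    # Error spread depends on z1, so Omega is not proportional to E[z z^T].
    e = u + (0.5 + abs(z1)) * random.gauss(0, 1)
    data.append((z1, z2, x, theta_true * x + e))

# The sample moment is linear in theta: m_hat(theta) = a - b * theta.
a = [sum(d[k] * d[3] for d in data) / n for k in (0, 1)]  # (1/n) sum z_i y_i
b = [sum(d[k] * d[2] for d in data) / n for k in (0, 1)]  # (1/n) sum z_i x_i

def gmm_estimate(W):
    """Minimiser of (a - b*theta)^T W (a - b*theta) for a symmetric 2x2 W."""
    Wa = [W[0][0] * a[0] + W[0][1] * a[1], W[1][0] * a[0] + W[1][1] * a[1]]
    Wb = [W[0][0] * b[0] + W[0][1] * b[1], W[1][0] * b[0] + W[1][1] * b[1]]
    return (b[0] * Wa[0] + b[1] * Wa[1]) / (b[0] * Wb[0] + b[1] * Wb[1])

# Step 1: identity weighting gives a consistent first estimate.
theta_1 = gmm_estimate([[1.0, 0.0], [0.0, 1.0]])

# Step 2: estimate Omega at theta_1, invert it, and re-estimate.
omega = [[0.0, 0.0], [0.0, 0.0]]
for z1, z2, x, y in data:
    g = ((y - x * theta_1) * z1, (y - x * theta_1) * z2)
    for r in range(2):
        for c in range(2):
            omega[r][c] += g[r] * g[c] / n
det = omega[0][0] * omega[1][1] - omega[0][1] * omega[1][0]
W_hat = [[omega[1][1] / det, -omega[0][1] / det],
         [-omega[1][0] / det, omega[0][0] / det]]
theta_2 = gmm_estimate(W_hat)

print(theta_1, theta_2)
```

Both estimates should land near `theta_true`; the second uses the estimated \(\hat W\) in place of the identity.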
OLS:
\(E[x(y-x\theta)]=0\)
WLS:
\(E[x(y-x\theta)/\sigma^2(x)]=0\)
IV:
\(E[z(y-x\theta)]=0\)
MLE:
\(E[\nabla_\theta \ln f(x, \theta)]=0\)
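To make the OLS and IV conditions concrete, here is a small sketch with a scalar regressor and a scalar instrument. Setting the sample moment to zero and solving for \(\theta\) gives the familiar closed forms. The simulated data and the helper `sample_moment` are assumptions for illustration only.

```python
import random

random.seed(0)

# Simulated data: x is correlated with the error through u, z is a valid instrument.
n = 5000
theta_true = 2.0
z = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
x = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [theta_true * xi + ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

def sample_moment(w, theta):
    """(1/n) * sum_i w_i * (y_i - x_i * theta), with w = x for OLS and w = z for IV."""
    return sum(wi * (yi - xi * theta) for wi, xi, yi in zip(w, x, y)) / n

# Solving sample_moment(x, theta) = 0 gives OLS; solving sample_moment(z, theta) = 0 gives IV.
theta_ols = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
theta_iv = sum(zi * yi for zi, yi in zip(z, y)) / sum(zi * xi for zi, xi in zip(z, x))

print(theta_ols, sample_moment(x, theta_ols))
print(theta_iv, sample_moment(z, theta_iv))
```

In this simulated setup the OLS moment condition is false in the population (x is correlated with the error), so only the IV estimate should be close to `theta_true`, even though both sample moments are zero at their own estimates.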
\(m(\theta_0)=E[g(\mathbf x_i, \theta_0)]=0\)
We replace this with the sample moment:
\(\hat m(\theta)=\frac{1}{n}\sum_ig(\mathbf x_i, \theta)\)
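We have the Jacobian of the moment function (when \(g\) is the MLE score, this is the Hessian of the log-likelihood)
\(\nabla_\theta g(\mathbf x_i, \theta_0)\)
Its expectation (in the MLE case, minus the information matrix)
\(G=E[\nabla_\theta g(\mathbf x_i, \theta_0)]\)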
Variance-covariance matrix of the moments
\(\Omega =E[g(\mathbf x_i, \theta_0)g(\mathbf x_i, \theta_0)^T]\)
We want to minimise moment loss
\(||\hat m(\theta)||^2_W=\hat m(\theta )^TW\hat m(\theta)\)
\(\hat \theta = \arg\min_\theta \left(\frac{1}{n}\sum_ig(\mathbf x_i, \theta)\right)^T\hat W\left(\frac{1}{n}\sum_ig(\mathbf x_i, \theta)\right)\)
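When the moments are nonlinear in \(\theta\) the minimisation is done numerically. A rough sketch, assuming exponential data with mean \(\theta\), the two moments \(E[x-\theta]=0\) and \(E[x^2-2\theta^2]=0\), identity weighting, and an objective that is unimodal on the search interval. The helper names (`m_hat`, `objective`, `golden_section_min`) are hypothetical.

```python
import math
import random

random.seed(3)

# Simulated exponential data with mean theta_true (illustrative only).
theta_true = 2.0
n = 5000
xs = [random.expovariate(1.0 / theta_true) for _ in range(n)]

# For an exponential with mean theta: E[x - theta] = 0 and E[x^2 - 2 theta^2] = 0.
mean_x = sum(xs) / n
mean_x2 = sum(x * x for x in xs) / n

def m_hat(theta):
    """Sample moment vector (1/n) sum_i g(x_i, theta)."""
    return (mean_x - theta, mean_x2 - 2.0 * theta ** 2)

def objective(theta, W):
    """Weighted norm m_hat^T W m_hat for a 2x2 matrix W."""
    m = m_hat(theta)
    return sum(m[r] * W[r][c] * m[c] for r in range(2) for c in range(2))

def golden_section_min(f, lo, hi, tol=1e-8):
    """Golden-section search; assumes f is unimodal on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2

W = [[1.0, 0.0], [0.0, 1.0]]  # identity weighting
theta_hat = golden_section_min(lambda t: objective(t, W), 0.0, 3.0 * mean_x)
print(theta_hat)
```

With more than one parameter a general-purpose optimiser would replace the one-dimensional search; the objective itself stays the same.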
By the CLT, the estimator is asymptotically normal.
It is consistent if the moment condition is true.
There is an explicit formula for the asymptotic variance.
\(\sqrt n (\hat \theta -\theta_0)\rightarrow^d N[0, (G^TWG)^{-1}G^TW\Omega W^TG(G^TW^TG)^{-1}]\)
If we choose \(W\propto \Omega^{-1}\) then:
\(\sqrt n (\hat \theta -\theta_0)\rightarrow^d N[0, (G^T\Omega^{-1} G)^{-1}]\)
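As a check on this simplification, take \(W=c\,\Omega^{-1}\) for any constant \(c>0\) and use the symmetry of \(\Omega\) (so \(W^T=W\)). The general variance becomes:
\(c^{-1}(G^T\Omega^{-1}G)^{-1}\,c^{2}\,G^T\Omega^{-1}\Omega\Omega^{-1}G\,c^{-1}(G^T\Omega^{-1}G)^{-1}\)
\(=(G^T\Omega^{-1}G)^{-1}(G^T\Omega^{-1}G)(G^T\Omega^{-1}G)^{-1}\)
\(=(G^T\Omega^{-1}G)^{-1}\)
The constant \(c\) cancels, which is why proportionality is enough.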
Problem: we need to estimate \(\Omega\) and \(G\).
\(\Omega\): estimate from the sample. This lets us choose the weighting matrix, but the variance of the estimator also needs an estimate of \(G\).
Applying the above to the OLS moment condition is where robust (sandwich) standard errors come from.
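A sketch of that calculation for the OLS moment \(g=x(y-x\theta)\) with a scalar regressor, where \(G=-E[x^2]\) and \(\Omega=E[x^2e^2]\). The simulated heteroskedastic data and all variable names are assumptions for illustration.

```python
import math
import random

random.seed(2)

# Simulated heteroskedastic data: the error spread grows with |x| (illustrative only).
n = 5000
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [2.0 * x + (0.5 + abs(x)) * random.gauss(0, 1) for x in xs]

# OLS solves the sample moment condition (1/n) sum x_i (y_i - x_i theta) = 0.
theta_hat = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
resid = [y - x * theta_hat for x, y in zip(xs, ys)]

# Sample analogues of G = E[dg/dtheta] = -E[x^2] and Omega = E[g^2] = E[x^2 e^2].
G_hat = -sum(x * x for x in xs) / n
omega_hat = sum((x * e) ** 2 for x, e in zip(xs, resid)) / n

# Exactly identified case: the sandwich reduces to G^{-1} Omega G^{-1}.
avar_robust = omega_hat / G_hat ** 2
se_robust = math.sqrt(avar_robust / n)

# Classical formula, valid only under homoskedasticity: sigma^2 / E[x^2].
sigma2_hat = sum(e * e for e in resid) / n
se_classical = math.sqrt(sigma2_hat / (-G_hat) / n)

print(theta_hat, se_robust, se_classical)
```

Because the error variance here rises with \(|x|\), the robust standard error should come out larger than the classical one.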
If the model is exactly identified, that is, the number of moment conditions equals the number of parameters, then \(W\) doesn’t matter. This is the normal Method of Moments.
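To see this in the variance formula: with as many moment conditions as parameters, \(G\) is square. Assuming it is invertible and \(W\) is symmetric, \((G^TWG)^{-1}=G^{-1}W^{-1}(G^T)^{-1}\), so:
\((G^TWG)^{-1}G^TW\Omega WG(G^TWG)^{-1}\)
\(=G^{-1}W^{-1}(G^T)^{-1}G^TW\Omega WGG^{-1}W^{-1}(G^T)^{-1}\)
\(=G^{-1}\Omega (G^T)^{-1}\)
\(W\) drops out. Equivalently, the sample moments can be set exactly to zero, which minimises the norm for any positive definite \(W\).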
Estimating the weighting matrix
To do: page on bias and variance of the GMM estimator. Should the cluster assumption be part of the moment condition, or part of the later calculation of the weighting matrix?
Robust, HAC and clustered variance estimates can be done as part of GMM too.