For multilevel data with fixed coefficients, we have:
\(y_{ij}=\mathbf x_{ij}\theta +m_j + \epsilon_{ij}\)
We can estimate \(m_j\) using fixed effects or similar methods.
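A minimal sketch of the fixed effects (within) approach, assuming a single scalar regressor and noise-free toy data. The function name `within_estimator` and the data are illustrative assumptions, not part of the model above. Demeaning \(y\) and \(x\) within each group removes \(m_j\); \(\theta\) is estimated from the demeaned data, and each \(m_j\) is then recovered from the group means.

```python
from collections import defaultdict


def within_estimator(y, x, group):
    """Within (fixed effects) estimate of theta and of each group effect m_j.

    Assumes a single scalar regressor; all names here are illustrative.
    """
    # Accumulate per-group sums of y and x, and the group size.
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for yi, xi, g in zip(y, x, group):
        totals[g][0] += yi
        totals[g][1] += xi
        totals[g][2] += 1
    means = {g: (sy / n, sx / n) for g, (sy, sx, n) in totals.items()}

    # Demeaning within groups removes m_j, leaving only theta to estimate.
    num = den = 0.0
    for yi, xi, g in zip(y, x, group):
        y_bar, x_bar = means[g]
        num += (xi - x_bar) * (yi - y_bar)
        den += (xi - x_bar) ** 2
    theta_hat = num / den

    # Each group effect is the part of the group mean not explained by x.
    m_hat = {g: y_bar - theta_hat * x_bar for g, (y_bar, x_bar) in means.items()}
    return theta_hat, m_hat


# Toy data (an assumption for illustration): y = 2 x + m_j, m_a = 1, m_b = 5.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
group = ["a", "a", "a", "b", "b", "b"]
true_m = {"a": 1.0, "b": 5.0}
y = [2.0 * xi + true_m[g] for xi, g in zip(x, group)]

theta_hat, m_hat = within_estimator(y, x, group)
print(theta_hat, m_hat)
```

Note that \(m_j\) estimated this way also absorbs any overall intercept.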
If the data is grouped by whether an entity was treated, then we will have:
\(y_{i0}\) - the outcome if the entity was not treated
\(y_{i1}\) - the outcome if the entity was treated
However, we only observe \(y_i\) and the treatment indicator \(D_i\).
\(y_i=y_{i0}+D_i(y_{i1}-y_{i0})\)
\(ATE=E[y_{i1}-y_{i0}]\)
\(ATT=E[y_{i1}-y_{i0}|D_i=1]\)
\(ATT=E[y_{i1}|D_i=1]-E[y_{i0}|D_i=1]\)
\(CATE(\mathbf x_i)=E[y_{i1}-y_{i0}|\mathbf x_i]\)
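A small worked example may help separate these quantities. The six entities and their potential outcomes below are made up for illustration; in real data only one of \(y_{i0}\), \(y_{i1}\) is observed per entity. The sketch computes the average effect over everyone, the average effect among the treated \(E[y_{i1}-y_{i0}|D_i=1]\), and the naive difference in observed group means.

```python
def mean(values):
    return sum(values) / len(values)


# Hypothetical potential outcomes for six entities. In real data only one of
# y0[i], y1[i] is ever observed; both are listed here purely for illustration.
y0 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y1 = [2.0, 3.0, 4.0, 7.0, 8.0, 9.0]
d = [0, 0, 0, 1, 1, 1]  # treatment went to the entities with higher y0

# Observed outcome: y_i = y_i0 + D_i (y_i1 - y_i0)
y = [a + di * (b - a) for a, b, di in zip(y0, y1, d)]

# Average effect over all entities, and over the treated only.
ate = mean([b - a for a, b in zip(y0, y1)])
att = mean([b - a for a, b, di in zip(y0, y1, d) if di == 1])

# What can actually be computed from (y, d): a difference in group means.
naive = mean([yi for yi, di in zip(y, d) if di == 1]) - mean(
    [yi for yi, di in zip(y, d) if di == 0]
)
# Selection bias: the treated would have differed even without treatment.
selection_bias = mean([a for a, di in zip(y0, d) if di == 1]) - mean(
    [a for a, di in zip(y0, d) if di == 0]
)
print(ate, att, naive, selection_bias)  # 2.0 3.0 6.0 3.0
```

Here the naive difference (6) equals the effect on the treated (3) plus the selection bias (3), which is why the observed difference in means alone does not identify either effect.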
If the model is:
\(y_i=D_i\theta +g(X) +\epsilon_i\)
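and \(D\) is randomly assigned, then we can instead estimate
\(y_i=D_i\theta +\epsilon_i\)
to get an estimate for \(\theta\) without collecting data on \(X\), since random assignment makes \(D_i\) independent of \(g(X)\), which is absorbed into the error term.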
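A minimal simulation sketch of this point, under assumed values: \(\theta=2\), \(g(x)=x^2\), standard normal noise, and coin-flip assignment. None of these choices come from the text. The covariate is generated but never used in estimation; the difference in group means (equivalent to regressing \(y\) on \(D\) with an intercept) should still land close to \(\theta\).

```python
import random

rng = random.Random(0)  # fixed seed so the run is reproducible
theta = 2.0             # assumed true treatment effect
n = 20_000

treated, control = [], []
for _ in range(n):
    x = rng.uniform(-1.0, 1.0)           # covariate we pretend not to record
    d = 1 if rng.random() < 0.5 else 0   # random assignment, independent of x
    y = theta * d + x ** 2 + rng.gauss(0.0, 1.0)
    (treated if d else control).append(y)

# Difference in means: uses only y and d, never x.
theta_hat = sum(treated) / len(treated) - sum(control) / len(control)
print(theta_hat)
```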
We can simply regress outcomes on the covariates and the treatment indicator.
This assumes treatment effects are constant.
This also assumes that the potential outcomes \(y_{i1}\) and \(y_{i0}\) are independent of \(D_i\), conditional on \(X\).
If we are missing variables in \(X\) then we will have biased estimates.
This also assumes the effects of \(X\) are linear.
We assume: \(E[y_{i0}|\mathbf x_{i}, D_i]=\mathbf x_i \theta\).
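A sketch of regression adjustment when treatment is not randomly assigned, assuming a single scalar covariate and noise-free toy data (both are illustrative assumptions). The coefficient on treatment is obtained by partialling the covariate out of both the outcome and the treatment indicator, which gives the same number as the treatment coefficient in a regression of \(y\) on an intercept, \(D\) and \(x\).

```python
def mean(values):
    return sum(values) / len(values)


def residuals(v, x):
    """Residuals from an OLS regression of v on x with an intercept."""
    v_bar, x_bar = mean(v), mean(x)
    slope = sum((xi - x_bar) * (vi - v_bar) for xi, vi in zip(x, v)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    intercept = v_bar - slope * x_bar
    return [vi - intercept - slope * xi for vi, xi in zip(v, x)]


def adjusted_effect(y, d, x):
    """Coefficient on d in a regression of y on (1, d, x), via partialling out x."""
    ry, rd = residuals(y, x), residuals(d, x)
    return sum(a * b for a, b in zip(rd, ry)) / sum(a * a for a in rd)


# Toy data (assumed): treatment is more likely at high x, and y = 1 + 3 x + 2 d.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
d = [0, 0, 0, 1, 0, 1, 1, 1]
y = [1.0 + 3.0 * xi + 2.0 * di for xi, di in zip(x, d)]

naive = mean([yi for yi, di in zip(y, d) if di == 1]) - mean(
    [yi for yi, di in zip(y, d) if di == 0]
)
theta_hat = adjusted_effect(y, d, x)
print(naive, theta_hat)
```

The naive difference in means (12.5) mixes the treatment effect with the effect of \(x\); the adjusted estimate recovers 2 here only because the toy outcome really is linear in \(x\).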
Matching is similar to regression. We assume that treatment effects are constant, and that the potential outcomes \(y_{i0}\) and \(y_{i1}\) are independent of treatment once we control for \(X\).
Again, the estimate is biased if this does not hold.
However, we do not have to assume a linear form for the effect of \(X\).
We assume: \(E[y_{ij}|\mathbf x_{i}, D_i]=E[y_{ij}|\mathbf x_{i}]\) for \(j \in \{0,1\}\).
For each entity, find a nearby entity (in terms of covariates) that received the opposite treatment, and compare their outcomes.
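A minimal sketch of this nearest-neighbour idea, assuming one scalar covariate, matching with replacement, and noise-free toy data with a nonlinear baseline \(y_{i0}=x_i^2\). The function name and data are illustrative assumptions.

```python
def nearest_match_effect(x, d, y):
    """Average of (treated - untreated) outcome differences over matched pairs.

    Each entity is matched, with replacement, to the entity with the opposite
    treatment status whose covariate value is closest.
    """
    effects = []
    for i in range(len(x)):
        # Candidates are all entities with the opposite treatment status.
        others = [j for j in range(len(x)) if d[j] != d[i]]
        j = min(others, key=lambda k: abs(x[k] - x[i]))
        # The imputed effect is always treated outcome minus untreated outcome.
        effects.append(y[i] - y[j] if d[i] == 1 else y[j] - y[i])
    return sum(effects) / len(effects)


# Toy data (assumed): baseline outcome x**2, constant effect 2, and treatment
# more common at high x, so the raw comparison of groups is confounded.
x = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
d = [1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0]
y = [float(xi ** 2 + 2 * di) for xi, di in zip(x, d)]

treated_y = [yi for yi, di in zip(y, d) if di == 1]
control_y = [yi for yi, di in zip(y, d) if di == 0]
naive = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)
matched = nearest_match_effect(x, d, y)
print(naive, matched)
```

Because every entity has an exact match here, the matched estimate is 2 despite the nonlinear baseline, while the naive difference is about 4.67. With poor overlap between groups the nearest match can be far away and the estimate degrades.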
Match on the propensity score: the probability of receiving treatment, given covariates, \(P(D_i=1|\mathbf x_i)\).
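A sketch of using the chance of treatment given covariates (the propensity score), assuming a single discrete covariate so that the score can be estimated as the share treated in each covariate cell. In practice the score is usually fitted with a model such as logistic regression; the cell-share estimator, function names and data below are simplifying assumptions. The same scores are used twice: for nearest-neighbour matching and for inverse-probability weighting.

```python
from collections import defaultdict


def propensity_by_cell(x, d):
    """Estimated P(D=1 | x): share treated within each covariate cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_treated, n_total]
    for xi, di in zip(x, d):
        counts[xi][0] += di
        counts[xi][1] += 1
    return [counts[xi][0] / counts[xi][1] for xi in x]


def match_on_score(score, d, y):
    """Match each entity to the opposite-treatment entity with the closest score."""
    effects = []
    for i in range(len(y)):
        others = [j for j in range(len(y)) if d[j] != d[i]]
        j = min(others, key=lambda k: abs(score[k] - score[i]))
        effects.append(y[i] - y[j] if d[i] == 1 else y[j] - y[i])
    return sum(effects) / len(effects)


def ipw_effect(score, d, y):
    """Inverse-probability-weighted estimate; needs scores strictly in (0, 1)."""
    terms = [
        di * yi / e - (1 - di) * yi / (1 - e) for e, di, yi in zip(score, d, y)
    ]
    return sum(terms) / len(terms)


# Toy data (assumed): baseline x**2, constant effect 2, treatment likelier at high x.
x = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
d = [1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0]
y = [float(xi ** 2 + 2 * di) for xi, di in zip(x, d)]

score = propensity_by_cell(x, d)  # 0.25, 0.5 and 0.75 for x = 1, 2, 3
matched = match_on_score(score, d, y)
weighted = ipw_effect(score, d, y)
print(matched, weighted)
```

The appeal of the score is that it is a single number even when there are many covariates, so matching does not have to be done in a high-dimensional covariate space.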
\(CATE(\mathbf x_i)=E[y_{i1}-y_{i0}|\mathbf x_i]\)
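We have instrumental variables (IVs) for treatment: variables that affect treatment but affect the outcome only through treatment.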
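A minimal sketch of the simplest IV estimator (the ratio of covariances, also called the Wald estimator when the instrument is binary), assuming one binary instrument `z` that shifts treatment and is unrelated to an unobserved confounder `u`. The data are constructed by hand so that these assumptions hold exactly; all names and values are illustrative.

```python
def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / len(a)


def iv_estimate(z, d, y):
    """Effect of d on y using z as an instrument: cov(z, y) / cov(z, d)."""
    return cov(z, y) / cov(z, d)


# Toy data (assumed). u is an unobserved confounder: it raises both the chance
# of treatment and the outcome. z is balanced with respect to u.
z = [0, 0, 0, 0, 1, 1, 1, 1]
u = [0, 0, 1, 1, 0, 0, 1, 1]
d = [0, 0, 0, 1, 0, 1, 1, 1]  # treatment responds to both z and u
y = [2.0 * di + 3.0 * ui for di, ui in zip(d, u)]  # true effect of d is 2

naive = mean([yi for yi, di in zip(y, d) if di == 1]) - mean(
    [yi for yi, di in zip(y, d) if di == 0]
)
theta_iv = iv_estimate(z, d, y)
print(naive, theta_iv)  # 3.5 2.0
```

The naive comparison is biased upwards by `u`, while the IV ratio uses only the variation in treatment that is driven by `z`.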
+ propensity score weighting
+ regression adjustment
+ matching
+ IV
+ regression discontinuity
big page in advanced analytics? Random effects meta analysis?
meta analysis: fixed effect v random effects model
types of study:
+ RCT
+ cohort studies
+ case-control studies
+ cross-sectional studies