For parametric regression we have:
\(y=f(X)\)
Where the form of \(f(X)\) is fixed in advance, as in linear regression.
For non-parametric regression we have:
\(y=m(X)\)
Where the form of \(m(X)\) is not fixed in advance.
We can estimate \(m(X)\) using kernel regression (the Nadaraya–Watson estimator):
\(\hat m(x)=\dfrac{\sum_{i=1}^nK_h(x-x_i)y_i}{\sum_{i=1}^nK_h(x-x_i)}\)
Where \(K_h\) is a kernel with bandwidth \(h\).
We know this because we have:
\(E(y|X=x)=\int yf(y|x)dy=\int y\dfrac{f(x,y)}{f(x)}dy\)
Substituting kernel density estimates for both \(f(x,y)\) and \(f(x)\), then integrating \(y\) out of the numerator, yields the estimator above.
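A minimal NumPy sketch of this estimator, assuming a Gaussian kernel and a hand-picked bandwidth \(h\) (in practice \(h\) would be chosen by cross-validation):

```python
import numpy as np

def gaussian_kernel(u):
    """Standard Gaussian kernel K(u)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def nadaraya_watson(x_query, x_train, y_train, h):
    """Nadaraya-Watson estimate of m(x) at each query point."""
    # K_h(x - x_i) = K((x - x_i) / h) / h; the 1/h factor cancels in the ratio
    w = gaussian_kernel((x_query[:, None] - x_train[None, :]) / h)
    return (w @ y_train) / w.sum(axis=1)

# Noisy sine data for illustration
rng = np.random.default_rng(0)
x = rng.uniform(0, 2 * np.pi, 200)
y = np.sin(x) + rng.normal(0, 0.3, 200)
x_grid = np.linspace(0, 2 * np.pi, 100)
y_hat = nadaraya_watson(x_grid, x, y, h=0.3)
```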
A linear model looks like:
\(\hat y =c+\sum_i x_i\theta_i\)
MARS (multivariate adaptive regression splines) instead fits a piecewise-linear model built from hinge basis functions:
\(\hat y =c+\sum_j B_j(x_i,a_j)\theta_j\)
Where:
\(B_j=\max(0, x_i-a_j)\); or
\(B_j=\max(0, a_j-x_i)\)
The sign of each term is carried by its coefficient \(\theta_j\).
This is trained using a forward pass, which greedily adds mirrored hinge pairs, and a backward pass, which prunes terms (typically by generalized cross-validation). A sketch of the model form follows.
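The following is a minimal sketch of fitting the model form above, not the full MARS algorithm: the knot \(a\) is fixed by hand rather than found by the forward pass, and nothing is pruned.

```python
import numpy as np

def hinge_pair(x, a):
    """The mirrored hinge basis functions max(0, x - a) and max(0, a - x)."""
    return np.maximum(0, x - a), np.maximum(0, a - x)

# Hypothetical 1-D data with a kink at x = 2
rng = np.random.default_rng(0)
x = rng.uniform(0, 5, 300)
y = np.where(x < 2, 1.0 * x, 2 + 3.0 * (x - 2)) + rng.normal(0, 0.1, 300)

# Knot fixed at a = 2 for illustration; real MARS searches over knots
b1, b2 = hinge_pair(x, a=2.0)
B = np.column_stack([np.ones_like(x), b1, b2])   # [1, B_1, B_2]
theta, *_ = np.linalg.lstsq(B, y, rcond=None)    # OLS on the hinge basis
y_hat = B @ theta
```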
Normally we return a central estimate, commonly the mean.
Quantile regression returns an estimate of the \(\tau\)th quantile instead.
The goal is to estimate the \(\tau\)th conditional quantile of \(y\) given \(X\), by minimizing the pinball (quantile) loss rather than squared error.
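A minimal sketch fitting a linear model for the 0.9 conditional quantile by minimizing the pinball loss directly with SciPy (libraries such as statsmodels package this as QuantReg); the data here are hypothetical:

```python
import numpy as np
from scipy.optimize import minimize

def pinball_loss(beta, X, y, tau):
    """Quantile loss: tau * r if r >= 0, else (tau - 1) * r, for residual r."""
    r = y - X @ beta
    return np.mean(np.maximum(tau * r, (tau - 1) * r))

# Heteroscedastic data: the spread of y grows with x
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 500)
y = 2 * x + rng.normal(0, 0.5 + 0.3 * x, 500)
X = np.column_stack([np.ones_like(x), x])        # intercept + slope

# Fit the 0.9 conditional quantile (Nelder-Mead copes with the nonsmooth loss)
res = minimize(pinball_loss, x0=np.zeros(2), args=(X, y, 0.9),
               method="Nelder-Mead")
print(res.x)  # intercept and slope of the 90th-percentile line
```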
Principal component regression (PCR):
Do PCA on \(X\).
Do OLS of \(y\) on the leading principal components.
Transform the fitted coefficients back to the original feature space by reversing the PCA projection.
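A minimal NumPy sketch of these three steps, assuming centred data and a hand-picked number of components \(k\):

```python
import numpy as np

def pcr_fit(X, y, k):
    """Principal component regression with k components (X, y centred)."""
    # PCA on X via SVD: rows of Vt are the principal directions
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T                                 # top-k loadings, shape (p, k)
    Z = X @ V_k                                    # scores in component space
    gamma, *_ = np.linalg.lstsq(Z, y, rcond=None)  # OLS on the components
    return V_k @ gamma                             # map back: beta = V_k @ gamma

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X -= X.mean(axis=0)
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 0.0]) + rng.normal(0, 0.1, 200)
y -= y.mean()
beta = pcr_fit(X, y, k=3)   # coefficients in the original feature space
```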
Partial least squares (PLS) expands on principal component regression.
Both \(X\) and \(Y\) are mapped to new latent spaces, chosen so that their latent scores covary as much as possible.
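Scikit-learn implements this as PLSRegression; a minimal sketch on hypothetical data:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Y = X[:, :2] @ rng.normal(size=(2, 3)) + rng.normal(0, 0.1, (200, 3))

# Both X and Y are projected to 2-component latent spaces during fitting
pls = PLSRegression(n_components=2)
pls.fit(X, Y)
Y_hat = pls.predict(X)          # predictions go through the latent space
X_scores = pls.transform(X)     # X mapped to its latent space
```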