We have a sample space, \(\Omega\). A random variable \(X\) is a mapping from the sample space to the real numbers:
\(X: \Omega \rightarrow \mathbb{R}\)
We can describe \(\Omega\) by listing its elements. As an example, take a coin toss and a die roll. The sample space is:
\(\{H1,H2,H3,H4,H5,H6,T1,T2,T3,T4,T5,T6\}\)
A random variable could give us just the die value, such that:
\(X(H1)=X(T1)=1\)
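As a rough sketch in Python (the names `omega` and `X` are my own), a random variable is literally a function from outcomes to real numbers:

```python
from itertools import product

# Sample space for one coin toss and one die roll: "H1" ... "T6".
omega = [coin + str(die) for coin, die in product("HT", range(1, 7))]

def X(outcome):
    """Random variable: map an outcome to the die value it contains."""
    return int(outcome[1])

print(X("H1"), X("T1"))  # 1 1
```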
We can define this more precisely using set-builder notation, by saying the following is defined for all \(c\in \mathbb{R}\):
\(\{\omega |X(\omega )\le c\}\)
That is, for any real number \(c\) and random variable \(X\), there is a corresponding subset of \(\Omega\) containing the \(\omega\)s in \(\Omega\) which map to a value of at most \(c\).
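A minimal sketch of that pre-image set, reusing `omega` and `X` from the sketch above:

```python
def event_le(c):
    """The event {omega : X(omega) <= c}, as a subset of the sample space."""
    return {w for w in omega if X(w) <= c}

print(sorted(event_le(2)))  # ['H1', 'H2', 'T1', 'T2']
```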
Multiple variables can be defined on the same sample space. If we rolled a die we could define variables for:
- whether the value is odd or even
- the number on the die
- whether the value is less than 3
With more dice we could define even more variables.
If we define a variable \(X\), we can also define another variable \(Y=X^2\).
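A sketch of those variables, all defined on the same sample space (function names are mine), plus \(Y=X^2\):

```python
def is_even(outcome):
    """1 if the die value is even, else 0."""
    return int(X(outcome) % 2 == 0)

def less_than_3(outcome):
    """1 if the die value is less than 3, else 0."""
    return int(X(outcome) < 3)

def Y(outcome):
    """A variable defined from another variable: Y = X^2."""
    return X(outcome) ** 2

print(is_even("T4"), less_than_3("T4"), Y("T4"))  # 1 0 16
```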
\(P(X=x)=P(\{\omega \mid X(\omega)=x\})\)
For a discrete random variable this is a useful quantity; for a fair die, \(P(X=x)=\tfrac{1}{6}\) for each face \(x\).
This is not helpful for continuous probability, where the chance of any specific outcome is \(0\).
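A sketch of the counting definition, assuming all twelve outcomes are equally likely (that assumption and the helper name `prob_X_equals` are mine):

```python
from fractions import Fraction

def prob_X_equals(x):
    """P(X = x): the fraction of equally likely outcomes that map to x."""
    matching = [w for w in omega if X(w) == x]
    return Fraction(len(matching), len(omega))

print(prob_X_equals(3))  # 1/6  (two of the twelve outcomes: H3 and T3)
```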
Random variables are all real-valued, so we can write:
\(P(X\le x)=P(\{\omega \mid X(\omega)\le x\})\)
This is the cumulative distribution function, \(F_X(x)=P(X\le x)\). For a continuous variable:
\(F_X(x)=\int_{-\infty}^x f_X(u)\,du\)
For a discrete variable:
\(F_X(x)=\sum_{x_i\le x}P(X=x_i)\)
\(P(X\le x)+P(X\ge x)-P(X=x)=1\)
\(P(a< X\le b)=F_X(b)-F_X(a)\)
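A quick check of the discrete form and of \(P(a<X\le b)=F_X(b)-F_X(a)\), reusing `prob_X_equals` from the earlier sketch:

```python
def F_X(x):
    """Cumulative distribution F_X(x) = P(X <= x), summing the mass function."""
    return sum(prob_X_equals(k) for k in range(1, 7) if k <= x)

a, b = 2, 5
p_interval = sum(prob_X_equals(k) for k in range(1, 7) if a < k <= b)
print(p_interval == F_X(b) - F_X(a))  # True
```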
If \(X\) is continuous, the probability of any single point is \(0\). We instead look at probability density.
The density is defined through the cumulative distribution function:
\(F_X(x)=\int_{-\infty}^x f_X(u)\,du\)
The density function is \(f_X(x)=\dfrac{d}{dx}F_X(x)\) wherever \(F_X\) is differentiable.
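A numerical sketch with an exponential distribution (my choice of example): a finite-difference derivative of \(F_X\) recovers \(f_X\), and a crude Riemann sum of \(f_X\) recovers \(F_X\).

```python
import math

# Exponential(1): F_X(x) = 1 - exp(-x) for x >= 0, with density f_X(x) = exp(-x).
F = lambda x: 1 - math.exp(-x)
f = lambda x: math.exp(-x)

x, h, n = 1.5, 1e-6, 100_000
print(abs((F(x + h) - F(x)) / h - f(x)) < 1e-4)            # dF/dx recovers f
riemann = sum(f(i * x / n) * (x / n) for i in range(n))    # integral of f from 0 to x
print(abs(riemann - F(x)) < 1e-3)                          # integrating f recovers F
```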
For probability mass functions:
\(P(Y=y|X=x)=\dfrac{P(Y=y\land X=x)}{P(X=x)}\)
For probability density functions:
\(f_Y(y|X=x)=\dfrac{f_{X,Y}(x,y)}{f_X(x)}\)
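A sketch of the discrete conditional, using the die value and its parity from the earlier sketches (the parity variable makes the conditioning non-trivial; helper names are mine):

```python
def joint(x, y):
    """P(X = x and Even = y), counting outcomes on the same sample space."""
    matching = [w for w in omega if X(w) == x and is_even(w) == y]
    return Fraction(len(matching), len(omega))

def even_given_X(y, x):
    """P(Even = y | X = x) = joint mass divided by the marginal of X."""
    return joint(x, y) / prob_X_equals(x)

print(even_given_X(1, 4), even_given_X(1, 3))  # 1 0 (parity is determined by the die value)
```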
The joint probability of two discrete variables is:
\(P(X=x\land Y=y)\)
Marginalising over \(Y\) recovers the distribution of \(X\):
\(P(X=x)=\sum_{y}P(X=x\land Y=y)\)
Equivalently, by the law of total probability:
\(P(X=x)=\sum_{y}P(X=x|Y=y)P(Y=y)\)
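A check of both identities with the same joint mass as above (helper names are mine):

```python
def p_even(y):
    """Marginal P(Even = y)."""
    return Fraction(len([w for w in omega if is_even(w) == y]), len(omega))

def X_given_even(x, y):
    """P(X = x | Even = y)."""
    return joint(x, y) / p_even(y)

x = 5
print(sum(joint(x, y) for y in (0, 1)) == prob_X_equals(x))                     # marginalising the joint
print(sum(X_given_even(x, y) * p_even(y) for y in (0, 1)) == prob_X_equals(x))  # law of total probability
```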
\(X\) is independent of \(Y\) if, for every value \(x_i\) of \(X\) and every value \(y_j\) of \(Y\):
\(P(x_i|y_j)=P(x_i)\)
If \(P(x_i|y_j)=P(x_i)\) then:
\(P(x_i\land y_j)=P(x_i)\cdot P(y_j)\)
This logic extends beyond just two events. If the events are mutually independent then:
\(P(x_i\land y_j \land z_k)=P(x_i)\cdot P(y_j \land z_k)=P(x_i)\cdot P(y_j)\cdot P(z_k)\)
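A sketch checking the product rule for the coin face and the die value, which are independent on this sample space (the helpers `C` and `p` are mine):

```python
def C(outcome):
    """The coin face of an outcome: 'H' or 'T'."""
    return outcome[0]

def p(event):
    """Probability of {omega : event(omega)} under equally likely outcomes."""
    return Fraction(len([w for w in omega if event(w)]), len(omega))

factorises = all(
    p(lambda w: C(w) == c and X(w) == x) == p(lambda w: C(w) == c) * p(lambda w: X(w) == x)
    for c in "HT" for x in range(1, 7)
)
print(factorises)  # True: the coin face and the die value are independent
```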
Note that because:
\(P(x_i|y_j)=\dfrac{P(x_i\land y_j)}{P(y_j)}\)
it follows that, if the two variables are independent:
\(P(x_i|y_j)=\dfrac{P(x_i)P(y_j)}{P(y_j)}\)
\(P(x_i|y_j)=P(x_i)\)
Events \(A\) and \(B\) are conditionally independent given \(X\) if:
\(P(A\land B|X)=P(A|X)P(B|X)\)
This is the same as:
\(P(A|B\land X)=P(A|X)\)
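A sketch of conditional independence on the same sample space; the conditioning event is written `Z` here to avoid clashing with the die-value map `X`, and the particular events are my choice of illustration:

```python
A = lambda w: C(w) == "H"          # A: the coin shows heads
B = lambda w: is_even(w) == 1      # B: the die value is even
Z = lambda w: X(w) <= 3            # conditioning event

def p_given(event, given):
    """P(event | given)."""
    return p(lambda w: event(w) and given(w)) / p(given)

print(p_given(lambda w: A(w) and B(w), Z) == p_given(A, Z) * p_given(B, Z))  # True
print(p_given(A, lambda w: B(w) and Z(w)) == p_given(A, Z))                  # True
```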