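The moment generating function is given by
$$M_X(u) = E\left[e^{uX}\right]$$
The cumulant characteristic function is given by
$$\zeta_X(u) = \log E\left[e^{iuX}\right]$$
The $l$-th central moment $m_l$ is given by
$$m_l = E\left[(X - \mu)^l\right], \quad \mu = E[X]$$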
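Skewness and kurtosis are the normalised third and fourth central moments of a distribution, respectively:
$$S(X) = E\left[\frac{(X - \mu)^3}{\sigma^3}\right], \quad K(X) = E\left[\frac{(X - \mu)^4}{\sigma^4}\right]$$
The normalization factors are $\sigma^3$ and $\sigma^4$ respectively, where $\mu$ is the mean and $\sigma$ is the standard deviation of $X$.
The quantity $K(X) - 3$ is called the excess kurtosis, since $K(X) = 3$ is the kurtosis of a normal distribution.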
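Let $\{x_1, \dots, x_T\}$ be a random sample of $X$ with $T$ observations.
The sample mean is given by
$$\hat{\mu} = \frac{1}{T} \sum_{t=1}^{T} x_t$$
The sample variance is given by
$$\hat{\sigma}^2 = \frac{1}{T-1} \sum_{t=1}^{T} (x_t - \hat{\mu})^2$$
The sample skewness is given by
$$\hat{S} = \frac{1}{(T-1)\,\hat{\sigma}^3} \sum_{t=1}^{T} (x_t - \hat{\mu})^3$$
The sample kurtosis is given by
$$\hat{K} = \frac{1}{(T-1)\,\hat{\sigma}^4} \sum_{t=1}^{T} (x_t - \hat{\mu})^4$$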
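The sample statistics above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the $1/(T-1)$ normalisation for the variance, skewness and kurtosis; the function name `sample_moments` and the simulated data are hypothetical and not part of the original notebook.

```python
import numpy as np


def sample_moments(x):
    """Return sample mean, variance, skewness and kurtosis of a 1-D sample.

    Assumption: variance, skewness and kurtosis all use the 1/(T-1) factor.
    """
    x = np.asarray(x, dtype=float)
    T = x.size
    mean = x.sum() / T
    dev = x - mean                      # deviations from the sample mean
    var = (dev**2).sum() / (T - 1)
    sigma = np.sqrt(var)
    skew = (dev**3).sum() / ((T - 1) * sigma**3)
    kurt = (dev**4).sum() / ((T - 1) * sigma**4)
    return mean, var, skew, kurt


# Simulated normal sample: skewness and excess kurtosis should be near 0
rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=1.0, size=100_000)
mean, var, skew, kurt = sample_moments(sample)
print("mean:", mean)
print("variance:", var)
print("skewness:", skew)
print("excess kurtosis:", kurt - 3.0)
```

Library routines such as `scipy.stats.skew` and `scipy.stats.kurtosis` use a different normalisation by default, so their values can differ slightly from this sketch on small samples.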
Univariate Distributions
Normal Distribution
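A random variable $X$ is said to be normally distributed if it has the following probability density function (PDF):
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
It is a continuous probability distribution.
$\mu$ and $\sigma^2$ are the mean and variance of the distribution, respectively.
The case where $\mu = 0$ and $\sigma = 1$ is called the standard normal distribution, and its PDF is given by
$$f(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right)$$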
    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.stats as st
    from mpl_toolkits import mplot3d  # registers the '3d' projection on older matplotlib


    def plotNormalPDF_CDF_CHF(mu, sigma):
        i = complex(0, 1)
        # Characteristic function, PDF and CDF of N(mu, sigma^2)
        chf = lambda u: np.exp(i * mu * u - (sigma**2) * u * u / 2)
        pdf = lambda x: st.norm.pdf(x, mu, sigma)
        cdf = lambda x: st.norm.cdf(x, mu, sigma)

        x = np.linspace(5, 15, 100)
        u = np.linspace(0, 5, 250)
        print(type(pdf))

        # figure 1, PDF
        plt.figure(1)
        plt.plot(x, pdf(x))
        plt.grid()
        plt.xlabel('x')
        plt.ylabel('PDF')

        # figure 2, CDF
        plt.figure(2)
        plt.plot(x, cdf(x))
        plt.grid()
        plt.xlabel('x')
        plt.ylabel('CDF')

        # figure 3, CHF: real and imaginary parts plotted against u
        plt.figure(3)
        ax = plt.axes(projection='3d')
        chfV = chf(u)
        ax.plot3D(u, np.real(chfV), np.imag(chfV), 'red')
        ax.view_init(30, -120)

        # Needed when running as a script rather than in a notebook
        plt.show()


    plotNormalPDF_CDF_CHF(10, 1)
<class 'function'>
Log-Normal Distribution
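A random variable $X$ is said to have a log-normal distribution if $Y = \log X$ and $Y$ is normally distributed.
The PDF of the log-normal distribution is given by
$$f(x) = \frac{1}{x\,\sigma\sqrt{2\pi}} \exp\left(-\frac{(\log x - \mu)^2}{2\sigma^2}\right), \quad x > 0$$
where $\mu$ and $\sigma^2$ are the mean and variance of $\log X$, respectively.
Hence the mean and variance of $X$ are as follows:
$$E[X] = e^{\mu + \sigma^2/2}, \quad \mathrm{Var}[X] = \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2}$$
An important point to note is that $X$ can take values in $(0, \infty)$ only.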
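As a rough check of the log-normal mean and variance, one can exponentiate simulated normal draws and compare the sample moments with the standard closed-form expressions $e^{\mu + \sigma^2/2}$ and $(e^{\sigma^2} - 1)e^{2\mu + \sigma^2}$. This is only a sketch: the parameter values `mu = 0.0` and `sigma = 0.25` are hypothetical choices for illustration.

```python
import numpy as np

# Hypothetical parameters: mean and standard deviation of log X
mu, sigma = 0.0, 0.25

# Closed-form mean and variance of X = exp(Y), with Y ~ N(mu, sigma^2)
mean_formula = np.exp(mu + 0.5 * sigma**2)
var_formula = (np.exp(sigma**2) - 1.0) * np.exp(2.0 * mu + sigma**2)

# Monte Carlo estimate: draw Y, then exponentiate to obtain X
rng = np.random.default_rng(42)
y = rng.normal(mu, sigma, size=1_000_000)
x = np.exp(y)

print("mean     formula vs simulated:", mean_formula, x.mean())
print("variance formula vs simulated:", var_formula, x.var(ddof=1))
print("all samples positive:", bool(x.min() > 0.0))
```

The simulated values should agree with the formulas up to Monte Carlo error, and every draw is strictly positive, as expected for a log-normal variable.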
Multivariate Distributions
Correlation
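The correlation coefficient between two random variables $X$ and $Y$ is defined as
$$\rho_{x,y} = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{E\left[(X - \mu_x)(Y - \mu_y)\right]}{\sqrt{E\left[(X - \mu_x)^2\right] E\left[(Y - \mu_y)^2\right]}}$$
where $\mu_x$ and $\mu_y$ are the means of $X$ and $Y$.
For a sample $\{(x_t, y_t)\}_{t=1}^{T}$ with sample means $\bar{x}$ and $\bar{y}$, the sample correlation is given by
$$\hat{\rho}_{x,y} = \frac{\sum_{t=1}^{T} (x_t - \bar{x})(y_t - \bar{y})}{\sqrt{\sum_{t=1}^{T} (x_t - \bar{x})^2 \sum_{t=1}^{T} (y_t - \bar{y})^2}}$$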
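The sample correlation can be sketched directly from its definition as the sum of cross-deviations divided by the square root of the product of the summed squared deviations. The helper name `sample_correlation` and the simulated data are assumptions made for illustration; `np.corrcoef` is used only as a cross-check.

```python
import numpy as np


def sample_correlation(x, y):
    """Sample correlation of two equally long 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()   # deviations from the sample mean of x
    dy = y - y.mean()   # deviations from the sample mean of y
    return (dx * dy).sum() / np.sqrt((dx**2).sum() * (dy**2).sum())


# Simulated pair with population correlation 0.8 by construction:
# Var(y) = 0.64 + 0.36 = 1 and Cov(x, y) = 0.8
rng = np.random.default_rng(1)
x = rng.normal(size=5000)
y = 0.8 * x + 0.6 * rng.normal(size=5000)

rho_hat = sample_correlation(x, y)
print("from definition:", rho_hat)
print("np.corrcoef    :", np.corrcoef(x, y)[0, 1])
```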
Two-dimensional densities.
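The joint CDF of two random variables, $X$ and $Y$, is the function $F_{X,Y}(\cdot,\cdot): \mathbb{R}^2 \to [0, 1]$, which is defined by:
$$F_{X,Y}(x, y) = P[X \le x, Y \le y]$$
If $X$ and $Y$ are continuous variables, then the joint PDF of $X$ and $Y$ is a function $f_{X,Y}(\cdot,\cdot): \mathbb{R}^2 \to [0, \infty)$ such that
$$f_{X,Y}(x, y) = \frac{\partial^2 F_{X,Y}(x, y)}{\partial x\, \partial y}$$
Bivariate Normal density functions
$$f_{X,Y}(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1 - \rho^2}} \exp\left(-\frac{z}{2(1 - \rho^2)}\right)$$
where
$$z = \frac{(x - \mu_x)^2}{\sigma_x^2} + \frac{(y - \mu_y)^2}{\sigma_y^2} - \frac{2\rho\,(x - \mu_x)(y - \mu_y)}{\sigma_x\sigma_y}$$
and
$$\rho = \frac{\sigma_{xy}}{\sigma_x\sigma_y}$$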
    import numpy as np
    import matplotlib.pyplot as plt


    # from matplotlib.mlab import bivariate_normal
    # bivariate_normal seems to be deprecated, so it is defined here.
    def bivariate_normal(X, Y, sigmax=1.0, sigmay=1.0, mux=0.0, muy=0.0, sigmaxy=0.0):
        """
        Bivariate Gaussian distribution for equal shape *X*, *Y*.
        See `bivariate normal <http://mathworld.wolfram.com/BivariateNormalDistribution.html>`_ at mathworld.
        """
        Xmu = X - mux
        Ymu = Y - muy
        rho = sigmaxy / (sigmax * sigmay)
        z = Xmu**2 / sigmax**2 + Ymu**2 / sigmay**2 - 2 * rho * Xmu * Ymu / (sigmax * sigmay)
        denom = 2 * np.pi * sigmax * sigmay * np.sqrt(1 - rho**2)
        return np.exp(-z / (2 * (1 - rho**2))) / denom


    def BivariateNormalPDFPlot():
        # Number of points in each direction
        n = 40
        # Parameters
        mu_1 = 0
        mu_2 = 0
        sigma_1 = 1
        sigma_2 = 0.5
        rho1 = 0.0
        rho2 = -0.8
        rho3 = 0.8

        x = np.linspace(-3.0, 3.0, n)
        y = np.linspace(-3.0, 3.0, n)
        X, Y = np.meshgrid(x, y)
        # Density on the grid as a function of the correlation rho
        Z = lambda rho: bivariate_normal(X, Y, sigma_1, sigma_2, mu_1, mu_2, rho * sigma_1 * sigma_2)

        # One surface plot per correlation value (figures 1, 2 and 3)
        for k, rho in enumerate((rho1, rho2, rho3), start=1):
            fig = plt.figure(k)
            ax = fig.add_subplot(projection='3d')
            ax.plot_surface(X, Y, Z(rho), cmap='viridis', linewidth=0)
            ax.set_xlabel('X axis')
            ax.set_ylabel('Y axis')
            ax.set_zlabel('Z axis')
            plt.show()


    BivariateNormalPDFPlot()
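A quick sanity check on the closed-form density used in `bivariate_normal` is that it integrates to one for each correlation value. The sketch below re-implements the same formula in a standalone helper (the name `bivariate_normal_pdf`, the grid range and the grid spacing are assumptions for illustration) and approximates the double integral by a Riemann sum, using the same parameters as the plots.

```python
import numpy as np


def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Closed-form bivariate normal density with correlation rho."""
    dx = (x - mu_x) / sigma_x
    dy = (y - mu_y) / sigma_y
    z = dx**2 + dy**2 - 2.0 * rho * dx * dy
    denom = 2.0 * np.pi * sigma_x * sigma_y * np.sqrt(1.0 - rho**2)
    return np.exp(-z / (2.0 * (1.0 - rho**2))) / denom


# Wide, fine grid so that truncation and discretisation errors are small
grid = np.linspace(-8.0, 8.0, 801)
h = grid[1] - grid[0]
X, Y = np.meshgrid(grid, grid)

totals = {}
for rho in (0.0, -0.8, 0.8):
    Z = bivariate_normal_pdf(X, Y, 0.0, 0.0, 1.0, 0.5, rho)
    totals[rho] = Z.sum() * h * h   # Riemann-sum approximation of the integral
    print("rho =", rho, "integral ~", totals[rho])
```

Each approximate integral should be very close to 1, which confirms that changing $\rho$ reshapes the surface without changing the total probability mass.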
Hypothesis Testing
The t-statistic is the ratio of the departure of the estimated value of a parameter from its hypothesized value to its standard error.
It is used when the sample size is small or the population standard deviation is unknown.
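Let $\hat{\beta}$ be an estimator of the parameter $\beta$ in some statistical model. Then the t-statistic is given by
$$t_{\hat{\beta}} = \frac{\hat{\beta} - \beta_0}{\mathrm{s.e.}(\hat{\beta})}$$
where $\mathrm{s.e.}(\hat{\beta})$ is the standard error of the estimator $\hat{\beta}$ for $\beta$, and $\beta_0$ is a non-random, known constant, which may or may not match the actual unknown parameter value $\beta$.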
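As a concrete case, take the parameter to be a population mean, the estimator to be the sample mean, and the standard error to be the sample standard deviation divided by $\sqrt{n}$. The sketch below computes the t-statistic as (estimate minus hypothesized value) divided by the standard error under those assumptions; the function name `t_statistic`, the data values and the hypothesized value `5.0` are hypothetical.

```python
import numpy as np


def t_statistic(sample, beta_0):
    """t-statistic of the sample mean against the hypothesised value beta_0."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    beta_hat = sample.mean()                    # estimator of the mean
    se = sample.std(ddof=1) / np.sqrt(n)        # standard error of the mean
    return (beta_hat - beta_0) / se


# Hypothetical small sample, tested against a hypothesised mean of 5.0
data = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.2]
t_value = t_statistic(data, 5.0)
print("t-statistic:", t_value)
```

A large absolute value indicates that the estimate lies many standard errors away from the hypothesized value. For the one-sample mean case, `scipy.stats.ttest_1samp` computes the same statistic together with a p-value.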