In statistics, the projection matrix, sometimes also called the influence matrix or hat matrix, maps the vector of response values (dependent variable values) to the vector of fitted values (or predicted values). It describes the influence each response value has on each fitted value. The diagonal elements of the projection matrix are the leverages, which describe the influence each response value has on the fitted value for that same observation.
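If the vector of response values is denoted by $\mathbf{y}$ and the vector of fitted values by $\hat{\mathbf{y}}$, then
$$\hat{\mathbf{y}} = \mathbf{P}\mathbf{y}.$$
As $\hat{\mathbf{y}}$ is usually pronounced "y-hat", the projection matrix $\mathbf{P}$ is also named the hat matrix, as it "puts a hat on $\mathbf{y}$". The formula for the vector of residuals $\mathbf{r}$ can also be expressed compactly using the projection matrix:
$$\mathbf{r} = \mathbf{y} - \hat{\mathbf{y}} = \mathbf{y} - \mathbf{P}\mathbf{y} = (\mathbf{I} - \mathbf{P})\mathbf{y},$$
where $\mathbf{I}$ is the identity matrix. The matrix $\mathbf{M} := \mathbf{I} - \mathbf{P}$ is sometimes referred to as the residual maker matrix. Moreover, the element in the ith row and jth column of $\mathbf{P}$ is equal to the covariance between the jth response value and the ith fitted value, divided by the variance of the former:
$$p_{ij} = \frac{\operatorname{Cov}[\hat{y}_i, y_j]}{\operatorname{Var}[y_j]}.$$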
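The identities above can be checked numerically. The following is a minimal sketch in Python with NumPy, not a reference implementation: the data and the names `X`, `y`, `P`, `M` are made up for illustration, and the snippet assumes the standard least-squares form $\mathbf{P} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T$ for a design matrix $\mathbf{X}$ with full column rank.

```python
import numpy as np

# Hypothetical data: 5 observations, an intercept column plus one predictor.
X = np.column_stack([np.ones(5), np.array([1.0, 2.0, 3.0, 4.0, 5.0])])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

# Least-squares projection (hat) matrix; solve() avoids forming an explicit inverse.
P = X @ np.linalg.solve(X.T @ X, X.T)
M = np.eye(len(y)) - P           # residual maker matrix

y_hat = P @ y                    # fitted values: P "puts a hat on" y
r = M @ y                        # residuals

# Cross-check the fitted values against NumPy's own least-squares solver.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted_matches = np.allclose(y_hat, X @ beta)
residual_matches = np.allclose(r, y - y_hat)

leverages = np.diag(P)           # diagonal of P: one leverage per observation
```

For small problems, forming the full n-by-n matrix `P` is convenient for inspection; for large n one would normally compute fitted values and leverages without materialising it.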
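A matrix $\mathbf{A}$ has its column space depicted as the green line. The projection of some vector $\mathbf{b}$ onto the column space of $\mathbf{A}$ is the vector $\mathbf{A}\mathbf{x}$.
From the figure, it is clear that the point in the column space of $\mathbf{A}$ closest to the vector $\mathbf{b}$ is $\mathbf{A}\mathbf{x}$, and that it is the point from which the line drawn to $\mathbf{b}$ is orthogonal to the column space of $\mathbf{A}$. A vector that is orthogonal to the column space of a matrix is in the nullspace of the matrix transpose, so
$$\mathbf{A}^T(\mathbf{b} - \mathbf{A}\mathbf{x}) = 0.$$
From there, one rearranges, so
$$\mathbf{A}^T\mathbf{b} - \mathbf{A}^T\mathbf{A}\mathbf{x} = 0 \;\Rightarrow\; \mathbf{A}^T\mathbf{b} = \mathbf{A}^T\mathbf{A}\mathbf{x} \;\Rightarrow\; \mathbf{x} = (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T\mathbf{b},$$
assuming the columns of $\mathbf{A}$ are linearly independent so that $\mathbf{A}^T\mathbf{A}$ is invertible.
Therefore, since the projection $\mathbf{A}\mathbf{x}$ lies in the column space of $\mathbf{A}$, the projection matrix, which maps $\mathbf{b}$ onto $\mathbf{A}\mathbf{x}$, is just $\mathbf{A}$ applied to the expression for $\mathbf{x}$, or
$$\mathbf{P} = \mathbf{A}(\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T.$$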
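The derivation can be traced on a small numerical case. The sketch below uses Python with NumPy and a made-up one-column matrix `A` and vector `b` (both are illustrative assumptions, chosen so the arithmetic is easy to follow by hand); it solves the normal equations, forms the projection, and checks the orthogonality condition.

```python
import numpy as np

# Hypothetical example: A has a single column, so its column space is a line.
A = np.array([[1.0], [2.0], [2.0]])
b = np.array([3.0, 0.0, 3.0])

# Solve the normal equations A^T A x = A^T b for the coefficient vector x.
x = np.linalg.solve(A.T @ A, A.T @ b)
projection = A @ x               # closest point to b in the column space of A

# The error b - Ax should be orthogonal to every column of A.
orthogonality = A.T @ (b - projection)

# The same projection, obtained by applying the projection matrix to b.
P = A @ np.linalg.solve(A.T @ A, A.T)
```

Here $\mathbf{A}^T\mathbf{A} = 9$ and $\mathbf{A}^T\mathbf{b} = 9$, so `x` is 1 and the projection is the column of `A` itself; the error vector (2, -2, 1) has zero inner product with that column.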
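Suppose that we wish to estimate a linear model using linear least squares. The model can be written as
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},$$
where $\mathbf{X}$ is a matrix of explanatory variables (the design matrix), $\boldsymbol{\beta}$ is a vector of unknown parameters to be estimated, and $\boldsymbol{\varepsilon}$ is the error vector. Projecting $\mathbf{y}$ onto the column space of $\mathbf{X}$ gives the ordinary least squares projection matrix $\mathbf{P} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T$, provided $\mathbf{X}$ has full column rank.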
For linear models, the trace of the projection matrix is equal to the rank of $\mathbf{X}$, which is the number of independent parameters of the linear model. For other models such as LOESS that are still linear in the observations $\mathbf{y}$, the projection matrix can be used to define the effective degrees of freedom of the model.
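A quick numerical illustration of the trace property, as a sketch in Python with NumPy. The random design matrix is an assumption for the example. LOESS itself is not implemented here; a ridge smoother is swapped in as a simpler stand-in for "a model that is still linear in $\mathbf{y}$", with an arbitrary penalty `lam`, to show the trace being used as an effective degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design matrix: 20 observations, 4 linearly independent columns.
X = rng.normal(size=(20, 4))

# Ordinary least-squares hat matrix: its trace equals rank(X).
P = X @ np.linalg.solve(X.T @ X, X.T)
trace_P = np.trace(P)
rank_X = np.linalg.matrix_rank(X)

# Stand-in linear smoother (ridge, NOT LOESS): y_hat = S y with
# S = X (X^T X + lam I)^{-1} X^T. Its trace serves as effective degrees of freedom.
lam = 5.0                        # arbitrary illustrative penalty
S = X @ np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T)
edf_ridge = np.trace(S)
```

With a positive penalty the smoother's trace falls strictly between 0 and the rank of `X`, reflecting that shrinkage uses fewer effective parameters than the unpenalised fit.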
Practical applications of the projection matrix in regression analysis include leverage and Cook's distance, which are concerned with identifying influential observations, i.e. observations which have a large effect on the results of a regression.
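As a sketch of how these diagnostics follow from the projection matrix, the Python/NumPy snippet below computes leverages and Cook's distances for a small made-up data set in which the last point lies far from the others. The data are hypothetical, and the Cook's distance expression used is the common textbook form $D_i = \frac{e_i^2}{p\,s^2}\,\frac{h_{ii}}{(1-h_{ii})^2}$, with $e_i$ the residual, $h_{ii}$ the leverage, $p$ the number of parameters and $s^2$ the residual variance estimate.

```python
import numpy as np

# Hypothetical data: the last point is far from the others in x and off-trend.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
y = np.array([1.0, 2.1, 2.9, 4.2, 5.0, 10.0])
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

P = X @ np.linalg.solve(X.T @ X, X.T)
h = np.diag(P)                   # leverages: diagonal of the hat matrix
e = y - P @ y                    # residuals
s2 = e @ e / (n - p)             # residual variance estimate

# Cook's distance for each observation (textbook form, see lead-in).
cooks_d = (e**2 / (p * s2)) * h / (1.0 - h) ** 2

highest_leverage = int(np.argmax(h))
most_influential = int(np.argmax(cooks_d))
```

In this example the isolated point has a small residual, because it pulls the fitted line toward itself, yet it has by far the largest leverage and Cook's distance. This is why residuals alone are not a reliable guide to influence.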
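Suppose the design matrix $\mathbf{X}$ can be decomposed by columns as $\mathbf{X} = [\mathbf{A} \;\; \mathbf{B}]$.
Define the hat or projection operator as $P\{\mathbf{X}\} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T$. Similarly, define the residual operator as $M\{\mathbf{X}\} = \mathbf{I} - P\{\mathbf{X}\}$.
Then the projection matrix can be decomposed as follows:
$$P\{\mathbf{X}\} = P\{\mathbf{A}\} + P\{M\{\mathbf{A}\}\mathbf{B}\},$$
where, e.g., $P\{\mathbf{A}\} = \mathbf{A}(\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T$ and $M\{\mathbf{A}\} = \mathbf{I} - P\{\mathbf{A}\}$.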
There are a number of applications of such a decomposition. In the classical application $\mathbf{A}$ is a column of all ones, which allows one to analyze the effects of adding an intercept term to a regression. Another use is in the fixed effects model, where $\mathbf{A}$ is a large sparse matrix of the dummy variables for the fixed effect terms. One can use this partition to compute the hat matrix of $\mathbf{X}$ without explicitly forming the matrix $\mathbf{X}$, which might be too large to fit into computer memory.
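The blockwise decomposition can be verified numerically for the intercept case. The following Python/NumPy sketch is illustrative only: the helper names `proj` and `resid` are hypothetical stand-ins for the hat and residual operators, the regressors `B` are random, and both helpers assume their argument has full column rank.

```python
import numpy as np

def proj(X):
    """Hat operator P{X} = X (X^T X)^{-1} X^T (assumes full column rank)."""
    return X @ np.linalg.solve(X.T @ X, X.T)

def resid(X):
    """Residual operator M{X} = I - P{X}."""
    return np.eye(X.shape[0]) - proj(X)

rng = np.random.default_rng(1)
n = 12
A = np.ones((n, 1))              # intercept: a single column of ones
B = rng.normal(size=(n, 3))      # hypothetical remaining regressors
X = np.hstack([A, B])

# Check P{X} = P{A} + P{M{A} B}.
lhs = proj(X)
rhs = proj(A) + proj(resid(A) @ B)
decomposition_holds = np.allclose(lhs, rhs)

# With A a column of ones, M{A} B is simply B with its column means removed.
centered_matches = np.allclose(resid(A) @ B, B - B.mean(axis=0))
```

The second check shows why the intercept case is the classical one: applying the residual operator of a column of ones is the same as centering each remaining regressor, so the right-hand side never needs the combined matrix. The same idea carries over to fixed effects, where the residual operator of the dummy matrix amounts to subtracting group means.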