pam {pamfe} | R Documentation |
Fits additive panel data models with fixed effects based on the gam function from package mgcv and the plm function from package plm.
Nonparametric model components are represented by penalized B-splines with smoothing parameters selected by ML or REML. For more details see gam from package mgcv.
pam(formula, data = list(), weights = NULL, method = "REML", knots = NULL, optimizer = c("outer", "newton"), control = list(), sp = NULL, gls = TRUE, corMatrix = list(), ...)
formula |
A pam formula which is similar to the formula for a |
data |
A data frame of class |
weights |
An optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, the overall magnitude of the log likelihood is not changed, i.e. the weights are normalized ( |
method |
The smoothing parameter estimation method. |
knots |
This is an optional list containing user specified knot values to be used for basis construction. The user simply supplies the knots to be used, which must match up with the |
optimizer |
An array specifying the numerical optimization method to use to optimize the smoothing parameter estimation criterion (given by |
control |
A list of fit control parameters to replace defaults returned by |
sp |
A vector of smoothing parameters can be provided here. Smoothing parameters must be supplied in the order that the smooth terms appear in the model formula. Negative elements indicate that the parameter should be estimated, and hence a mixture of fixed and estimated parameters is possible. If smooths share smoothing parameters then |
gls |
If this argument is TRUE (the default value), then serial error correlation inherent to the first-difference transformation to remove fixed effects is accounted for via a generalized least squares approach. |
corMatrix |
This is an optional list containing matrices describing the within-individual variance and correlation structure for the errors of each individual. Such matrices can easily be generated with the help of |
... |
Further arguments for passing on e.g. to |
An additive panel data model with fixed effects can jointly include individual-specific time-constant effects, nonparametric effects and strictly parametric effects. The fixed effects are removed by taking first differences over time. The resulting dependence structure in the errors can be accounted for via a generalized least squares approach. Nonparametric effects are represented by penalized B-splines. The tradeoff between penalizing wiggliness and penalizing badness of fit is governed by associated smoothing parameters, which are estimated by (restricted) maximum likelihood. For further information, see Puetz and Kneib (2016).
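The first-difference step and the serial correlation it induces can be illustrated with a small base R sketch. This is not the pamfe implementation: the sample sizes, the slope of 2, the helper first_diff and the strictly linear covariate effect are assumptions chosen for illustration. If the original errors are independent with constant variance, the differenced errors of one individual have a tridiagonal covariance matrix (2 on the diagonal, -1 next to it, up to the error variance), which is the structure a GLS step can exploit.

```r
# Sketch only: base R, not the pamfe implementation.
set.seed(42)
n_id <- 200   # number of individuals (assumed)
n_t  <- 5     # time periods per individual (assumed)
id <- rep(seq_len(n_id), each = n_t)
fe <- rep(rnorm(n_id, sd = 10), each = n_t)   # time-constant individual effects
x  <- runif(n_id * n_t)
y  <- fe + 2 * x + rnorm(n_id * n_t, sd = 0.5)

# First differences within each individual (hypothetical helper)
first_diff <- function(v) unlist(tapply(v, id, diff), use.names = FALSE)
dy <- first_diff(y)
dx <- first_diff(x)
fe_left <- max(abs(first_diff(fe)))   # fixed effects difference out completely

# Covariance structure of differenced i.i.d. errors for one individual
D     <- diff(diag(n_t))   # (T - 1) x T differencing matrix
Omega <- D %*% t(D)        # tridiagonal: 2 on the diagonal, -1 beside it

# Slope estimates on the differenced data: plain least squares and GLS
b_ols <- sum(dx * dy) / sum(dx^2)
DX    <- matrix(dx, nrow = n_t - 1)   # one column per individual
DY    <- matrix(dy, nrow = n_t - 1)
Oinv  <- solve(Omega)
b_gls <- sum(DX * (Oinv %*% DY)) / sum(DX * (Oinv %*% DX))
```

Both estimates recover the slope without ever estimating the individual effects. The GLS version weights the differenced observations by the inverse of Omega, which is the role the gls argument is described as playing above.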
Note that gam from package mgcv is more comprehensive (e.g. it allows for generalized additive models) and offers more options to specify. The major difference is that the mgcv package is designed for cross-sectional data and panel data models with random effects.
Details of the default underlying fitting methods are given in Wood (2011 and 2004). A concise introduction to generalized additive models and their implementation in R is given by Wood (2006).
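The penalized B-spline representation mentioned in the details can be sketched in a few lines of base R. The following is a generic P-spline fit with a fixed smoothing parameter, not the code used by pam or gam: the knot grid, the second-order difference penalty and the value lambda = 10 are assumptions for illustration, whereas the package estimates the smoothing parameter by ML or REML.

```r
# Sketch only: generic P-spline with a fixed smoothing parameter.
library(splines)
set.seed(1)
n <- 200
x <- runif(n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

# Cubic B-spline basis on equally spaced knots covering [0, 1]
ndx   <- 20                    # number of knot intervals (assumed)
h     <- 1 / ndx
knots <- h * (-3:(ndx + 3))    # three extra knots on each side
B     <- splineDesign(knots, x, ord = 4)

# Second-order difference penalty on adjacent coefficients
D2     <- diff(diag(ncol(B)), differences = 2)
lambda <- 10                   # fixed here; estimated by (RE)ML in the package

beta_pen <- solve(crossprod(B) + lambda * crossprod(D2), crossprod(B, y))
beta_ols <- solve(crossprod(B), crossprod(B, y))

# Roughness of each coefficient vector under the penalty
rough <- function(b) sum((D2 %*% b)^2)
c(unpenalized = rough(beta_ols), penalized = rough(beta_pen))
```

A larger lambda pulls the fit towards a straight line, and a value near zero reproduces the unpenalized regression spline.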
An object of class pam, similar to a gam object from package mgcv. A pam object has the following elements:
aic |
AIC of the fitted model: bear in mind that the degrees of freedom used to calculate this are the effective degrees of freedom of the model, and the likelihood is evaluated at the maximum of the penalized likelihood in most cases, not at the MLE. |
assign |
Array whose elements indicate which model term (listed in |
boundary |
Did parameters end up at boundary of parameter space? |
coefficients |
The coefficients of the fitted model. Parametric coefficients are first, followed by coefficients for each spline term in turn. |
control |
The |
converged |
Indicates whether or not the iterative fitting method converged. |
db.drho |
Matrix of first derivatives of model coefficients w.r.t. log smoothing parameters. |
df.null |
Null degrees of freedom. |
df.residual |
Effective residual degrees of freedom of the model. |
edf |
Estimated degrees of freedom for each model parameter. Penalization means that many of these are less than 1. |
edf1 |
Similar, but using alternative estimate of EDF. Useful for testing. |
edf2 |
This edf accounts for smoothing parameter uncertainty. |
family |
Family object specifying distribution (always gaussian) and link (always identity link) used. |
fitted.values |
The fitted values for the model. Note that the model is fitted on data transformed by first differences. |
formula |
The model formula. |
gcv.ubre |
The minimized smoothing parameter selection score: negative log marginal likelihood or negative log restricted likelihood. |
gls |
TRUE if serial error correlation inherent to the first-difference transformation to remove fixed effects was accounted for via a generalized least squares approach. |
hat |
Array of elements from the leading diagonal of the ‘hat’ (or ‘influence’) matrix. Same length as response data vector. |
index_data |
The individual dimension and the time dimension of the original panel data set. |
index_diffdata |
The individual dimension (the ids) of the first-differenced data set. |
iter |
How many iterations were required to find the smoothing parameters? |
method |
One of |
model |
Model frame containing all variables needed in original model fit. |
n |
Number of observations used for the fitting process, i.e. after the first-difference transformation. |
nsdf |
Number of parametric, non-smooth, model terms. |
optimizer |
|
outer.info |
If ‘outer’ iteration has been used to fit the model (see |
prior.weights |
Prior weights on observations. |
pterms |
|
R |
Factor R from QR decomposition of weighted model matrix, unpivoted to be in same column order as model matrix (so need not be upper triangular). |
rank |
Apparent rank of fitted model. |
reml.scale |
The (RE)ML estimate of the scale parameter. |
residuals |
The residuals for the fitted model. Note that the model is fitted on data transformed by first differences. |
rV |
If present, |
scale |
When present, the scale (as |
scale.estimated |
|
sig2 |
Estimated or supplied variance/scale parameter. |
smooth |
List of smooth objects, containing the basis information for each term in the model formula in the order in which they appear. |
sp |
Estimated smoothing parameters for the model. These are the underlying smoothing parameters, subject to optimization. |
terms |
|
Vc |
Under ML or REML smoothing parameter estimation it is possible to correct the covariance matrix |
Ve |
Frequentist estimated covariance matrix for the parameter estimators. Particularly useful for testing whether terms are zero. Less useful for confidence intervals, as smooths are usually biased. |
Vp |
Estimated covariance matrix for the parameters. This is a Bayesian posterior covariance matrix that results from adopting a particular Bayesian model of the smoothing process. |
weights |
Final weights used in IRLS iteration. |
y |
Response data used in the fitting process, i.e. after the first-difference transformation. |
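The remark under edf that penalization pushes many per-coefficient degrees of freedom below 1 can be illustrated with a small base R sketch. It assumes the usual penalized least squares definition, in which the effective degrees of freedom are the diagonal of (X'X + lambda S)^{-1} X'X. The basis, the penalty and the two lambda values are illustrative assumptions, and the sketch ignores the first-difference and GLS steps of pam.

```r
# Sketch only: effective degrees of freedom of a penalized spline fit.
library(splines)
set.seed(2)
x <- runif(150)

h     <- 1 / 10
knots <- h * (-3:13)                     # equally spaced knots covering [0, 1]
X     <- splineDesign(knots, x, ord = 4) # 13 cubic B-spline columns
D2    <- diff(diag(ncol(X)), differences = 2)
S     <- crossprod(D2)                   # second-order difference penalty

# Per-coefficient edf for a given smoothing parameter (hypothetical helper)
edf_for <- function(lambda) {
  Fmat <- solve(crossprod(X) + lambda * S, crossprod(X))
  diag(Fmat)
}

edf_small <- edf_for(1e-8)   # almost no penalty
edf_large <- edf_for(100)    # heavy penalty

c(total_small = sum(edf_small), total_large = sum(edf_large))
```

With almost no penalty the total is close to the number of basis functions. Under a heavy penalty it shrinks towards 2, the dimension of the straight-line space that a second-order difference penalty leaves unpenalized.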
Peter Puetz ppuetz@uni-goettingen.de
Puetz, P., Kneib, T. (2016). A Penalized Spline Estimator For Fixed Effects Panel Data Models. https://www.uni-goettingen.de/de/Puetz_03_2016/534166.html
Wood, S.N. (2011). Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society (B) 73(1):3-36.
Wood, S.N. (2004). Stable and efficient multiple smoothing parameter estimation for generalized additive models. J. Amer. Statist. Ass. 99:673-686.
Wood, S.N. (2006). Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC.
# data generation: additive model with time-constant individual fixed effects
library(pamfe)
library(plm)  # provides pdata.frame()
id <- rep(1:50, each = 10)
years <- rep(1:10, 50)
x1 <- runif(500)
x2 <- runif(500)
f1 <- sin(2 * pi * (x1 - 0.5)) ^ 2
f2 <- x2 * (1 - x2)
f1_s <- f1 / sd(f1)
f2_s <- f2 / sd(f2)
fe <- rep(sample(1:100, 50), each = 10)
y <- fe + f1_s + f2_s + rnorm(500, sd = 0.5)
data <- as.data.frame(cbind(id, years, y, x1, x2))

# transform data set to a panel data set of type "pdata.frame" from package "plm"
pdata <- pdata.frame(data, index = c("id", "years"), row.names = TRUE)

# run first-difference penalized spline panel data model with a generous number of knots
mod <- pam(y ~ sfe(x1, k = 40) + sfe(x2, k = 40), data = pdata)
summary(mod)