\name{stepExpect}
\alias{stepExpect}
\alias{stepExpect.expectreg}
%\alias{stepExpect_separately}
%\alias{stepExpect_jointly}
\title{
Stepwise model selection for expectreg models
}
\description{
Determines the best models based on stepwise AIC or cross-validation selection, either for each asymmetry parameter separately or jointly with a grid approach.
}
\usage{
\method{stepExpect}{expectreg}(object, scope = NULL, split = c("no","complete","restricted"), 
    type = c("separately","jointly"), criterion = c("AIC","BIC","OCV","GCV","CV"), 
    k = 2, N_CV = 5, grid_alpha = 50, weight = 1, interval = c(-1, 2),
    direction = c("forward","backward","both"), delta = 1e-05, trace = FALSE, 
    lambda = 1, output_type = c("one_model","list_models"), ...)
}
\arguments{
  \item{object}{
 an object of class \code{expectreg}. Required to get the basic information: \code{data}, \code{estimate}, \code{smooth}, \code{expectiles}, ...
}
  \item{scope}{
    defines the range of models evaluated in the model selection. This could be either \code{NULL},  a single formula, or a list of two formulas specifying the upper and lower bound. See the details for how to specify the formulas and how they are used.
}
  \item{split}{
how to deal with P-splines. See details.
}
  \item{type}{
how to deal with the different asymmetry parameters. Either estimate the best model for each asymmetry parameter \code{separately} or \code{jointly}.
}
  \item{criterion}{
selection criterion. Default is \code{"AIC"}, alternatives are \code{"BIC"}, \code{"OCV"}, \code{"GCV"} and \code{"CV"}. For \code{"CV"} an \code{N_CV}-fold cross-validation is run.
}
  \item{k}{
numeric, the penalty per parameter to be used, if \code{criterion} is \code{NULL}; the default \code{k = 2} is the AIC, \code{k = log(n)} is the BIC.
}
  \item{N_CV}{
number of folds used in the cross-validation. Default is 5.
}
  \item{grid_alpha}{
number of asymmetry parameters used for grid approach. Grid is defined as \code{seq(1/grid_alpha,1-1/grid_alpha,1/grid_alpha)}
}
  \item{weight}{
    weight of specific parts of the distribution. The model selection criteria corresponding to asymmetry parameters inside \code{interval} receive a \code{weight} times higher (lower) weight when building the total score. Values between 0 and Inf are possible.
}
  \item{interval}{
interval of asymmetry parameters whose model selection criteria are weighted by \code{weight}. The borders of the \code{interval} are not weighted.
}
  \item{direction}{
direction of the stepwise model selection. Possible values are \code{forward} (default), \code{backward} and \code{both}.
}
   \item{delta}{
minimal difference between old model and best new model to stop selection.
}
 \item{trace}{
 if \code{TRUE}, information on all evaluated models is printed at the end of each step.
}
  \item{lambda}{
initial value of smoothing parameter for updating \code{expectreg.ls()}. Default is 1.
}
\item{output_type}{
specifies which output should be created. Default is \code{"list_models"}. See value for details.
}
  \item{...}{
}
}
\details{
Stepwise model selection for each asymmetry parameter separately or jointly. Either a criterion-based procedure is applied (AIC, BIC, OCV or GCV), or an \code{N_CV}-fold cross-validation is run (CV). For the joint approach, CV is equivalent to the scoring approach.


If \code{scope} is \code{NULL}, then the lower bound is the intercept model and the upper bound is the model given in \code{object}. If \code{scope} is a single formula, then this is the upper bound and the lower bound is the intercept model. If \code{scope} is a list of two formulae, then the first is the upper bound and the second is the lower bound. \cr

For P-splines several approaches for model selection are possible:
\itemize{
    \item{\code{split = "no"} { No special treatment is applied. The input is only checked for inconsistency. Each covariate may occur only once, except as the combination of "parametric" and "penalizedpart_pspline". In that case both parts are treated as independent covariates.}}
    \item{\code{split = "restricted"} { For every P-spline two possibilities are evaluated in each step and the better one is used. The two possibilities are the classical linear covariate and the classical P-spline. If one is selected, the other one is no longer available. This option is therefore only possible for forward selection.}}
    \item{\code{split = "complete"} { Each P-spline is split into its linear trend (\code{rb(..., type = 'parametric')}) and the wiggly deviation from the linear trend (\code{rb(..., type = 'penalizedpart_pspline')}). Both parts are used as independent covariates in the model selection. }}
    }
}
\value{
An object of class \code{expectreg_selected}. \cr
A LIST with elements specified in \code{output_type} and anova:
  \item{anova }{ A LIST of model selection information.
                \itemize{
                \item{A LIST for each asymmetry, containing a LIST for each iteration with:
                \describe{
                \item{\code{StartModel}}{Formula of the initial model of this step}
                \item{\code{StartCriterion}}{Model selection criterion corresponding to StartModel}
                \item{\code{names_covariates}}{Names of covariates under evaluation (forward/backward)}
                \item{\code{names_covariates_excl}}{Names of covariates possible to exclude (both)}
                \item{\code{names_covariates_incl}}{Names of covariates possible to include again (both)}
                \item{\code{criterion_covariates}}{Model selection criterion corresponding to the models with/without the covariates specified in names_covariates. (forward/backward)}
                \item{\code{criterion_covariates_excl}}{Model selection criterion corresponding to the models without the covariates specified in names_covariates_excl. (both)}
                \item{\code{criterion_covariates_incl}}{Model selection criterion corresponding to the models with the covariates specified in names_covariates_incl. (both)}
                \item{\code{best_model}}{Formula of selected model of this step.}
                }
                }
                \item{
                For all asymmetries and all steps together the following elements are added to the first level of the anova:
                \describe{
                \item{\code{names_selection}}{Names of covariates possible to be excluded.}
                \item{\code{names_fixed}}{Names of covariates always in the best model (lower bound of scope).}
                \item{\code{table_selected}}{Matrix (rows = covariates, columns = asymmetries) of selected covariates. 1 means selected, 0 means excluded. }
                }
                }
                }
                }
    \item{one_model }{Joint best model for all asymmetry parameters. (For separate approach: covariates which are always excluded are not shown, covariates which are included only in some asymmetries have coefficient 0 in the others.)}
    \item{list_models }{ LIST of best model for each asymmetry parameter separately.}
  }
\references{
To be published
}
\author{
Elmar Spiegel
}
%%\section{Warning }{\code{output_type = IndexBestModel} needs a lot saving space and is only recommended for small simulations }

\seealso{
 \code{\link{garroteExpect}}, \code{\link{AIC}}, \code{\link{OCV}}, \code{\link{GCV}}, \code{\link{CV_Score}}
}
\examples{
### Separately
set.seed(1)
x1 <- runif(1000)
x2 <- runif(1000)
x3 <- runif(1000)
eps <- rnorm(1000)

y <- 3*x1 + 0.75*x2 + eps

data1 <- data.frame(y, x1, x2, x3)

# standard model
model1 <- expectreg.ls(y ~ x1 + x2 + x3, data = data1, expectiles = c(0.1,0.5,0.9))

# Different ways to define criterion and different directions
selected1 <- stepExpect(model1, trace = TRUE, criterion = "AIC", type="separately")
selected2 <- stepExpect(model1, trace = TRUE, k = 2, type="separately")
selected3 <- stepExpect(model1, trace = TRUE, criterion = "AIC",
                                direction = "backward", type="separately")
selected4 <- stepExpect(model1, trace = TRUE, criterion = "CV", N_CV = 10, type="separately")

# Use different scope
model2 <- expectreg.ls(y ~ 1, data = data1, expectiles = c(0.1,0.5,0.9))
scopeA <- y ~ x1 + x2 + x3
scopeB <- list(y ~ x1 + x3, y ~ x3)
scopeC <- list(y ~ x1 + x2, y ~ x3)

selectedA <- stepExpect(model2, scope = scopeA, trace = TRUE, type="separately")
selectedB <- stepExpect(model2, scope = scopeB, trace = TRUE, type="separately")
selectedC <- stepExpect(model2, scope = scopeC, trace = TRUE, type="separately")
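
# Illustration in base R only: a sketch of the scope rules described in the
# details, not the package internals. 'scope_demo', 'upper_terms' and
# 'lower_terms' are hypothetical names used for this illustration.
# For a list of two formulas, the first is the upper and the second the lower bound.
scope_demo <- list(y ~ x1 + x3, y ~ x3)
upper_terms <- attr(terms(scope_demo[[1]]), "term.labels")
lower_terms <- attr(terms(scope_demo[[2]]), "term.labels")
setdiff(upper_terms, lower_terms)  # covariates open to selection
lower_terms                        # covariates kept in every candidate model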

# Use different selection types for p-splines
scope_rb1 <- y ~ rb(x1, type = 'pspline') + rb(x2, type = 'pspline') + 
                 rb(x3, type = 'pspline')

selected_rb1 <- stepExpect(model2, scope = scope_rb1, trace = TRUE, type="separately", 
                        split = 'no')
selected_rb2 <- stepExpect(model2, scope = scope_rb1, trace = TRUE, type="separately",
                        split = 'complete')
selected_rb3 <- stepExpect(model2, scope = scope_rb1, trace = TRUE, type="separately",     
                        split = 'restricted')

### Jointly
# AIC area
model1 <- expectreg.ls(y ~ x1 + x2 + x3, data = data1, expectiles = c(0.1,0.5,0.9))

selected1 <- stepExpect(model1, trace = TRUE, output_type = c("list_models","one_model"), 
                      direction = "forward", grid_alpha = 26, criterion = "AIC", 
                      type = "jointly")

# Scoring                      
selected2 <- stepExpect(model1, trace = TRUE, output_type = c("list_models","one_model"), 
                      direction = "forward", grid_alpha = 26, criterion = "CV", 
                      type = "jointly")
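
# Illustration in base R only: the grid of asymmetry parameters implied by
# grid_alpha = 26, following the definition given for 'grid_alpha'.
grid_alpha <- 26
alpha_grid <- seq(1/grid_alpha, 1 - 1/grid_alpha, 1/grid_alpha)
length(alpha_grid)
range(alpha_grid)

# Sketch of the weighting described for 'weight' and 'interval'. This is an
# assumption about how the weights are laid out, not the package internals.
# 'interval_demo', 'weight_demo' and 'w' are hypothetical names and values.
# Grid points strictly inside the interval get the weight, all others get 1.
interval_demo <- c(0.1, 0.4)
weight_demo <- 2
w <- ifelse(alpha_grid > interval_demo[1] & alpha_grid < interval_demo[2],
            weight_demo, 1)
data.frame(alpha = round(alpha_grid, 3), weight = w)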

}

