Index: lm.Rd
===================================================================
--- lm.Rd	(revision 81416)
+++ lm.Rd	(working copy)
@@ -33,7 +33,9 @@
     typically the environment from which \code{lm} is called.}
 
   \item{subset}{an optional vector specifying a subset of observations
-    to be used in the fitting process.}
+    to be used in the fitting process. (See additional details about how
+    this argument interacts with data-dependent bases in the
+    \sQuote{Details} section of the \code{\link{model.frame}} documentation.)}
 
   \item{weights}{an optional vector of weights to be used in the fitting
     process. Should be \code{NULL} or a numeric vector.
Index: model.frame.Rd
===================================================================
--- model.frame.Rd	(revision 81416)
+++ model.frame.Rd	(working copy)
@@ -38,7 +38,9 @@
   \item{subset}{a specification of the rows to be used: defaults to all
     rows. This can be any valid indexing vector (see
     \code{\link{[.data.frame}}) for the rows of \code{data} or if that is not
-    supplied, a data frame made up of the variables used in \code{formula}.}
+    supplied, a data frame made up of the variables used in
+    \code{formula}. (See additional details about how this argument
+    interacts with data-dependent bases under \sQuote{Details} below.)}
 
   \item{na.action}{how \code{NA}s are treated. The default is first,
     any \code{na.action} attribute of \code{data}, second
@@ -103,6 +105,12 @@
   character variable is found, it is converted to a factor (as from \R
   2.10.0).
 
+  Because variables in the formula are evaluated before rows are dropped
+  based on \code{subset}, the characteristics of data-dependent bases
+  such as orthogonal polynomials (i.e. from terms using
+  \code{\link{poly}}) or splines will be computed based on the full data
+  set rather than the subsetted data set.
+
   Unless \code{na.action = NULL}, time-series attributes will be removed
   from the variables found (since they will be wrong if \code{NA}s are
   removed).
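
The behaviour described in the new \sQuote{Details} paragraph can be checked with a short R session. This is only a sketch on made-up data: the names `d`, `fit_arg` and `fit_pre` are hypothetical and are not part of the patch. It fits the same quadratic twice, once passing `subset` to `lm` and once subsetting the data frame beforehand.

```r
# Made-up data for illustration only (not part of the patch).
set.seed(1)
d <- data.frame(x = 1:20)
d$y <- d$x^2 + rnorm(20)

# subset= is applied after poly(x, 2) has been evaluated on all 20 rows,
# so the orthogonal basis is built from the full data set.
fit_arg <- lm(y ~ poly(x, 2), data = d, subset = x > 10)

# Subsetting first: poly(x, 2) only ever sees the 10 retained rows.
fit_pre <- lm(y ~ poly(x, 2), data = d[d$x > 10, ])

# The coefficients are expressed in different bases, so they differ.
same_coef <- isTRUE(all.equal(unname(coef(fit_arg)), unname(coef(fit_pre))))

# Both bases span the quadratics in x on the retained rows, so the
# fitted values should agree up to numerical tolerance.
same_fit <- isTRUE(all.equal(unname(fitted(fit_arg)), unname(fitted(fit_pre))))

print(c(same_coef = same_coef, same_fit = same_fit))
```

For `poly` the two fits should differ only in parameterisation. With spline terms whose knots are chosen from the data, the two calls may well produce different fitted values, which is the case the documentation note is meant to flag.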