[R] replacement has *** rows, data has ***

William Dunlap wdunlap at tibco.com
Tue Jun 13 00:05:57 CEST 2017


This can happen if the data used to fit the model contain rows with
missing values (NAs).  By default (na.action=na.omit) those rows are
dropped before fitting, so the prediction vector is shorter than the input
data.  Fit the model with na.action=na.exclude instead: the predictions
then line up with the original rows, with NA in the positions of the rows
that were dropped.

E.g.,

> d <- data.frame(y=1:10, x=log2(c(1,2,NA,4:10)))
> fitDefault <- lm(y ~ x, data=d)
> fitNaExclude <- lm(y ~ x, data=d, na.action=na.exclude)
> length(predict(fitDefault))
[1] 9
> length(predict(fitNaExclude))
[1] 10
> predict(fitNaExclude)
         1          2          3          4          5
-0.2041631  2.4602537         NA  5.1246704  5.9824210
         6          7          8          9         10
 6.6832543  7.2758004  7.7890872  8.2418382  8.6468378
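
To tie this back to the original question (data$pr <- pr), here is a minimal sketch of two ways to get a prediction column with one entry per row. It reuses the toy data from the example; the column names pr and pr2 and the variable nDropped are only illustrative, and the second option relies on predict() for lm fits passing NA rows through when newdata is given.

```r
# Same toy data as in the example above; row 3 has a missing x.
d <- data.frame(y = 1:10, x = log2(c(1, 2, NA, 4:10)))

# Default fit: rows containing NA are dropped before fitting (na.omit).
fitDefault <- lm(y ~ x, data = d)

# How many rows were dropped? This should equal the difference between
# the two row counts in the error message.
nDropped <- length(na.action(fitDefault))

# Option 1: refit with na.exclude so predictions are padded with NA
# and have one entry per row of d.
fitNaExclude <- lm(y ~ x, data = d, na.action = na.exclude)
d$pr <- predict(fitNaExclude)

# Option 2: keep the default fit and predict on the full data frame;
# rows with a missing predictor get an NA prediction.
d$pr2 <- predict(fitDefault, newdata = d)
```

Either way the new column has nrow(d) entries, so the assignment no longer fails; which option is preferable depends on whether you also want residuals() and fitted() padded with NA, which only the na.exclude fit provides.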


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Jun 12, 2017 at 1:32 PM, Manqing Liu <hopkins0727 at gmail.com> wrote:

> Hi all,
>
> I created a predicted variable based on a model, but somehow not all
> observations have a predicted value. When I tried to add the predicted
> value to the main data set (data$pr <- pr), it said:
> replacement has 34333 rows, data has 34347.
> Do you know how to solve that?
>
> Thanks,
> Manqing
