[R] subscripting in data frames with NA
Peter Dalgaard
P.Dalgaard at biostat.ku.dk
Tue Jun 24 13:11:37 CEST 2008
Agustin Lobo wrote:
> Dear list:
>
> Given
> > str(b3)
> 'data.frame': 159 obs. of 6 variables:
> $ index_pollution : num 8.228 10.513 0.549 0.915 10.416 ...
> $ position_descrip: chr "2" "2" "2" NA ...
> $ position_geo : chr "3" "0" "3" "3" ...
> $ institution : Factor w/ 3 levels "digesa","mem",..: 3 3 3 3 3 3 3 3 3 3 ...
> $ p_desc_no3 : chr "2" "2" "2" NA ...
> $ p_geo_no3 : chr "3" "0" "3" "3" ...
>
> I try to subscript but get:
>
> > b3[b3[,3]=="3",5] <-NA
> Error in `[<-.data.frame`(`*tmp*`, b3[, 3] == "3", 5, value = NA) :
> missing values are not allowed in subscripted assignments of data frames
Notice that it is not the NA on the right that is the problem, but those
in the subscript, so try
b3[b3[,3]=="3" | is.na(b3[,3]), 5] <- NA
(or use & !is.na(b3[,3]) in place of | is.na(b3[,3]) if you want rows with a
missing position_geo left untouched)
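
To make the difference between the two variants concrete, here is a minimal sketch on a toy data frame. The data frame d and its columns geo and desc are made up for illustration and are not the poster's b3; the which() form at the end is a further alternative, not something proposed in the thread.

```r
# Toy data frame (hypothetical values) with an NA in the column being tested
d <- data.frame(geo  = c("3", "0", NA, "3"),
                desc = c("2", "1", "2", "0"),
                stringsAsFactors = FALSE)

sel <- d[, "geo"] == "3"   # TRUE FALSE NA TRUE: the NA here triggers the error

# Variant 1: count a missing geo as a match, so row 3 is overwritten as well
d1 <- d
d1[sel | is.na(sel), "desc"] <- NA

# Variant 2: count a missing geo as a non-match, so row 3 keeps its value
d2 <- d
d2[sel & !is.na(sel), "desc"] <- NA

# which() drops NA from a logical vector, so it behaves like variant 2
d3 <- d
d3[which(sel), "desc"] <- NA
```

Which variant is right depends on whether a missing position should be treated as "might be 3" or as "not known to be 3".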
> Why? What's the correct way of doing this operation?
I forget the exact reason, but as far as I remember, we allowed it at
some point, but found that the behaviour was inconsistent between different
modes of subassignment.
> Actually, I previously tried with:
> > str(b2)
> 'data.frame': 159 obs. of 6 variables:
> $ index_pollution : num 8.228 10.513 0.549 0.915 10.416 ...
> $ position_descrip: Factor w/ 3 levels "0","1","2": 3 3 3 NA NA NA 3 3 3 3 ...
> $ position_geo : Factor w/ 4 levels "0","1","2","3": 4 1 4 4 3 NA 3 3 3 4 ...
> $ institution : Factor w/ 3 levels "digesa","mem",..: 3 3 3 3 3 3 3 3 3 3 ...
> $ p_desc_no3 : Factor w/ 3 levels "0","1","2": 3 3 3 NA NA NA 3 3 3 3 ...
> $ p_geo_no3 : Factor w/ 4 levels "0","1","2","3": 4 1 4 4 3 NA 3 3 3 4 ...
>
> > table(b2$p_desc_no3)
>
> 0 1 2
> 42 44 66
>
> and
>
> > levels(b2$p_desc_no3)[levels(b2$position_geo)=="3"] <- NA
>
> which does not result in an error but leaves b2$p_desc_no3 unchanged:
>
I don't think this makes sense at all. It changes the 4th level of a
three-level factor???
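
A small sketch of the mismatch, again on made-up toy factors (desc, geo and desc_no3 are illustrative names, not the poster's columns). The logical test runs over the four levels of one factor, so it can only point at position 4 of the other factor's three-element levels vector and never touches an existing level. If the goal was to blank out the rows where the other column is "3", a row-wise subscript is presumably what was meant:

```r
# Toy factors (hypothetical values) mirroring the 3- and 4-level columns
desc <- factor(c("2", "2", "0", "1", NA))
geo  <- factor(c("3", "0", "3", "2", "1"), levels = c("0", "1", "2", "3"))

# The comparison is over the levels of geo, not over the rows
idx <- levels(geo) == "3"   # FALSE FALSE FALSE TRUE
length(idx)                 # one entry per level of geo
nlevels(desc)               # desc has fewer levels than idx has entries

# Row-wise version: set desc to NA wherever geo is "3";
# which() keeps any NA in geo out of the subscript
desc_no3 <- desc
desc_no3[which(geo == "3")] <- NA
```

The levels of desc_no3 are left as they were; only the values in the matching rows become NA.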
> > table(b2$p_desc_no3)
>
> 0 1 2
> 42 44 66
>
>
> what am i doing wrong?
>
> Thanks
>
> Agus
>
>
>
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907