[R] Unexpected behavior when giving a value to a new variable based on the value of another variable

Fri Aug 29 10:53:47 CEST 2014

Dear subscribers,

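I've found that if a data frame already contains a variable whose name is very similar to that of a new variable, R does not assign the correct values to the new variable when they are based on the values of a third variable. In the first example below (M), sample is 1 only where age >= 65, as expected. In the second (N), where a column named samplem already exists, sample is also 1 in rows where age < 65 (rows 2 and 4 to 8):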

> M <- structure(list(V1 = c(67, 62, 74, 61, 60, 55, 60, 59, 58)),.Names = c("age"), row.names = c(NA, -9L), 
+                class = "data.frame")
> M$sample[M$age >= 65] <- 1 
> M
  age sample
1  67      1
2  62     NA
3  74      1
4  61     NA
5  60     NA
6  55     NA
7  60     NA
8  59     NA
9  58     NA
> N <- structure(list(V1 = c(67, 62, 74, 61, 60, 55, 60, 59, 58), V2 = c(NA, 1, 1, 1, 1,1,1,1,NA)), 
+                     .Names = c("age","samplem"), row.names = c(NA, -9L), class = "data.frame")
> N$sample[N$age >= 65] <- 1 
> N
  age samplem sample
1  67      NA      1
2  62       1      1
3  74       1      1
4  61       1      1
5  60       1      1
6  55       1      1
7  60       1      1
8  59       1      1
9  58      NA     NA
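
One possible explanation, which I have not confirmed, is that `$` on a list or data frame allows partial matching of names, so N$sample may resolve to samplem while no column called sample exists yet. The sketch below assumes that is the cause. The names partial_hit and exact_miss are only illustrative, and the ifelse() call is just one way to build the column without reading any existing values first.

```r
# Rebuild the second data frame from the example above.
N <- data.frame(
  age     = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
  samplem = c(NA, 1, 1, 1, 1, 1, 1, 1, NA)
)

# Before "sample" exists, `$` falls back to partial matching and returns "samplem".
partial_hit <- identical(N$sample, N$samplem)

# `[[` matches names exactly by default, so it finds no "sample" column yet.
exact_miss <- is.null(N[["sample"]])

# Assign the whole column in one step so nothing is read from the old data.
N$sample <- ifelse(N$age >= 65, 1, NA)

print(partial_hit)
print(exact_miss)
print(N)
```

If the assumption holds, both flags should be TRUE and sample should be 1 only in rows 1 and 3.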

Any idea what causes this behavior?

My session information:

R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Spanish_Spain.1252  LC_CTYPE=Spanish_Spain.1252    LC_MONETARY=Spanish_Spain.1252
[4] LC_NUMERIC=C                   LC_TIME=Spanish_Spain.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] foreign_0.8-61

loaded via a namespace (and not attached):
[1] tools_3.1.1

Thank you very much.

Angel Rodriguez-Laso
Research project manager
Matia Instituto Gerontologico
