[R] Unexpected behavior when giving a value to a new variable based on the value of another variable
peter dalgaard
pdalgd at gmail.com
Mon Sep 1 20:10:11 CEST 2014
On 01 Sep 2014, at 13:08 , Angel Rodriguez <angel.rodriguez at matiainstituto.net> wrote:
> Thank you John, Jim, Jeff and both Davids for your answers.
>
> After trying different combinations of values for the variable samplem, it looks like if age is greater than 65, R applies the correct code 1 whatever the value of samplem, but if age is less than 65, it just copies the values of samplem to sample. I do not understand why it does so.
>
It's because indexed assignment is really (white lie alert: it's actually worse)
N$sample <- `[<-`(`$`(N, `sample`), index, value)
and since N$sample isn't there from the outset, partial matching kicks in for the `$` bit and makes the right-hand side equivalent to the same thing with `samplem`. The result still gets assigned to N$sample, but the value is the same as the one N$samplem would get from
N$samplem[N$age >= 65] <- 1
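A minimal sketch that reproduces the effect. The original data were snipped from the thread, so the data frame N here is an assumption, rebuilt from the age and samplem values printed further down in this message:

```r
# Assumed data: age and samplem as they appear in the printed output below.
N <- data.frame(
  age     = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
  samplem = c(NA, 1, 1, 1, 1, 1, 1, 1, NA)
)

# There is no column named "sample" yet, so `$` partially matches "samplem".
partial <- N$sample
identical(partial, N$samplem)

# The indexed assignment starts from that copy of samplem, overwrites the
# rows with age >= 65, and stores the result as a new column N$sample.
N$sample[N$age >= 65] <- 1
N
```

Rows with age below 65 keep whatever samplem held, which is the behaviour Angel describes.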
Notice the difference if you do
> N$sample <- NA
> N$sample[N$age >= 65] <- 1
> N
  age samplem sample
1  67      NA      1
2  62       1     NA
3  74       1      1
4  61       1     NA
5  60       1     NA
6  55       1     NA
7  60       1     NA
8  59       1     NA
9  58      NA     NA
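Two ways to guard against the partial match, sketched on the same assumed data frame (rebuilt from the printed values, not taken from the original post). The option warnPartialMatchDollar is the one named in the help page quoted later in this thread; the use of ifelse() is just one possible way to build the column in a single step, not necessarily what anyone in the thread used:

```r
# Assumed data: age and samplem as they appear in the printed output above.
N <- data.frame(
  age     = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
  samplem = c(NA, 1, 1, 1, 1, 1, 1, 1, NA)
)

# `[[` matches names exactly by default, so a missing column shows up as NULL.
missing_col <- is.null(N[["sample"]])

# Ask R to warn whenever `$` falls back on partial matching,
# then restore the previous setting.
old <- options(warnPartialMatchDollar = TRUE)
warned <- tryCatch({ N$sample; FALSE }, warning = function(w) TRUE)
options(old)

# Build the new column in one step, so no existing column is read
# through a partially matched name.
N$sample <- ifelse(N$age >= 65, 1, NA)
N
```

With the warning switched on, the partial match no longer passes silently, and the one-step assignment gives the same sample column as initialising it with NA first.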
-pd
> In any case, Jim's syntax works very well, although I do not understand why either.
>
> Answering to Jim, I just wanted a variable that could identify individuals with some characteristics (not only age, as in this example that has been oversimplified).
>
> Best regards,
>
> Angel Rodriguez-Laso
>
>
> -----Original Message-----
> From: John McKown [mailto:john.archie.mckown at gmail.com]
> Sent: Fri 29/08/2014 14:46
> To: Angel Rodriguez
> CC: r-help
> Subject: Re: [R] Unexpected behavior when giving a value to a new variable based on the value of another variable
>
> On Fri, Aug 29, 2014 at 3:53 AM, Angel Rodriguez
> <angel.rodriguez at matiainstituto.net> wrote:
>>
>> Dear subscribers,
>>
>> I've found that if there is a variable in the dataframe with a name very similar to a new variable, R does not give the correct values to this latter variable based on the values of a third variable:
>>
>>
> <snip>
>>
>> Any clue for this behavior?
>>
> <snip>
>>
>> Thank you very much.
>>
>> Angel Rodriguez-Laso
>> Research project manager
>> Matia Instituto Gerontologico
>
> That is unusual, but appears to be documented in a section from
>
> ?`[`
>
> <quote>
> Character indices
>
> Character indices can in some circumstances be partially matched (see
> pmatch) to the names or dimnames of the object being subsetted (but
> never for subassignment). Unlike S (Becker et al p. 358), R never
> uses partial matching when extracting by [, and partial matching is
> not by default used by [[ (see argument exact).
>
> Thus the default behaviour is to use partial matching only when
> extracting from recursive objects (except environments) by $. Even in
> that case, warnings can be switched on by
> options(warnPartialMatchDollar = TRUE).
>
> Neither empty ("") nor NA indices match any names, not even empty nor
> missing names. If any object has no names or appropriate dimnames,
> they are taken as all "" and so match nothing.
> </quote>
>
> Note the comment about "partial matching" in the middle paragraph in
> the quote above.
>
> --
> There is nothing more pleasant than traveling and meeting new people!
> Genghis Khan
>
> Maranatha! <><
> John McKown
>
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com