[R] Unexpected behavior when giving a value to a new variable based on the value of another variable
David Winsemius
dwinsemius at comcast.net
Sun Aug 31 06:28:58 CEST 2014
On Aug 30, 2014, at 7:38 PM, David Winsemius wrote:
>
> On Aug 29, 2014, at 8:54 PM, David McPearson wrote:
>
>> On Fri, 29 Aug 2014 06:33:01 -0700 Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
>> wrote:
>>
>>> One clue is the help file for "$"...
>>>
>>> ?"$"
>>>
>>> In particular, see the discussion there of character indices and the
>>> "exact" argument.
>>>
>>
>> <...snip...>
>>>
>>> On August 29, 2014 1:53:47 AM PDT, Angel Rodriguez
>>> <angel.rodriguez at matiainstituto.net> wrote:
>>>> Dear subscribers,
>>>>
>>>> I've found that if there is a variable in the dataframe with a name
>> <...snip...>
>>>> > N <- structure(list(V1 = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
>>>> +                     V2 = c(NA, 1, 1, 1, 1, 1, 1, 1, NA)),
>>>> +                .Names = c("age", "samplem"), row.names = c(NA, -9L),
>>>> +                class = "data.frame")
>>>> > N$sample[N$age >= 65] <- 1
>>>> > N
>>>>   age samplem sample
>>>> 1  67      NA      1
>>>> 2  62       1      1
>>>> 3  74       1      1
>>>> 4  61       1      1
>>>> 5  60       1      1
>>>> 6  55       1      1
>>>> 7  60       1      1
>>>> 8  59       1      1
>>>> 9  58      NA     NA
>> <...snip...>
>>
>> Having seen all the responses about partial matching I almost understand.
>> I've also replicated the behaviour on R 2.11.1 so it's been around a while.
>> This tells me it ain't a bug - so if any of the cognoscenti have the time
>> and inclination can someone give me a brief (and hopefully simple)
>> explanation of what is going on under the hood?
>>
>> It looks (to me) like N$sample[N$age >= 65] <- 1 copies N$samplem to
>> N$sample and then does the assignment. If partial matching is the problem
>> (which it clearly is) my expectation is that the output should look like
>>
>>   age samplem
>> 1  67       1
>> 2  62       1
>> 3  74       1
>> 4  61       1
>> 5  60       1
>> 6  55       1
>> 7  60       1
>> 8  59       1
>> 9  58      NA
>> That is - no new column.
>> (and I just hate it when the world doesn't live up to my expectations!)
>
> Not sure what you are seeing. I am seeing what you expected:
>
> > test <- data.frame(age=1:10, sample=1)
> > test$sample[test$age<5] <- 2
> > test
>    age sample
> 1    1      2
> 2    2      2
> 3    3      2
> 4    4      2
> 5    5      1
> 6    6      1
> 7    7      1
> 8    8      1
> 9    9      1
> 10  10      1
I realized later that I had not constructed a test of your behavior, and
that when I did, I saw the creation of a third column. The answer is to
read the help page:
?`[<-`
"Character indices can in some circumstances be partially matched (see
pmatch) to the names or dimnames of the object being subsetted (but
never for subassignment)."
Note the caveat in parentheses.
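As a rough sketch of what appears to be going on under the hood: R
evaluates a nested replacement such as N$sample[N$age >= 65] <- 1 by
first extracting N$sample, then modifying the extracted copy, then
assigning the result back with `$<-`. The three steps are spelled out by
hand below. The name tmp is only an illustrative stand-in for R's
internal temporary, and the data frame is rebuilt with data.frame()
rather than structure(), which should be equivalent here.

```r
# Same data as in the original post (columns age and samplem).
N <- data.frame(age = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
                samplem = c(NA, 1, 1, 1, 1, 1, 1, 1, NA))

# Step 1: extraction. `$` partially matches "sample" to "samplem",
# so this returns a copy of the samplem column rather than NULL.
tmp <- N$sample

# Step 2: the inner `[<-` modifies that copy where age >= 65
# (rows 1 and 3 in this data).
tmp[N$age >= 65] <- 1

# Step 3: the outer `$<-` matches names exactly, finds no column
# called "sample", and therefore appends a new one.
N$sample <- tmp

names(N)
```

If the silent partial match in step 1 is unwanted, setting
options(warnPartialMatchDollar = TRUE) should make R warn about it, and
extracting with N[["sample"]] matches names exactly by default.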
--
David Winsemius, MD
Alameda, CA, USA