[R] Indexing by logical vectors

Kingsford Jones kingsfordjones at gmail.com
Tue Jul 20 20:22:48 CEST 2010


On Mon, Jul 19, 2010 at 11:12 PM, Christian Raschke
<crasch2 at tigers.lsu.edu> wrote:
<snip>
> but in the end I have other cases, where the logical vector is
> obtained from other operations or where the value that is assigned is
> different case by case; for example,
>
> levels(something.long)[levels(something.long) %in% LETTERS[1:3]] <- "Z"
>
> So given that your general answer above to my inquiry was "No", I will
> keep experimenting and I'll also have another look at with() and
> within().
>
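
For reference, a minimal sketch of the within() route on the data frame from the original post. It removes the repeated data frame prefix, although the column name is still typed twice. The seed and the 'original' copy are assumptions added only so the result can be checked.

```r
# Rebuild the example data frame from the original post (seed is an assumption).
set.seed(1)
some.data.frame <- data.frame(some.long.variable.name = rnorm(10),
                              some.other.long.variable.name = rnorm(10))

# Keep a copy of the column so the result can be compared afterwards.
original <- some.data.frame$some.other.long.variable.name

# Inside within(), columns are visible by their bare names, and the
# modified column is written back into the returned data frame.
some.data.frame <- within(some.data.frame, {
  some.other.long.variable.name[some.other.long.variable.name < 0] <- NA
})
```

The same effect could be had with transform() and replace(); within() is shown because it keeps the familiar indexing-assignment form.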

The with, within, transform and subset functions are wonderful, but I
agree with Christian that a symbol for self-referencing within index
brackets (whether on the LHS or RHS of the '<-') makes a good wishlist
item.  The above might be written:

levels(something.long)[@@ %in% LETTERS[1:3]] <- "Z"

where '@@' would always refer to the object being indexed.  Note the
above suggests (I think -- is a binary operator like %in% a function?)
that referenced objects would also be passed to functions called
within the brackets.  E.g., the following would also work:

levels(something.long)[grep('[A-C]', @@)] <- "Z"
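
On the parenthetical question: in R, operators are ordinary functions, so a binary operator such as %in% can be called in prefix form like any other function. A small sketch follows; the vector 'lv' and the factor 'f' are made-up stand-ins for levels(something.long) and something.long.

```r
# 'lv' is a hypothetical stand-in for levels(something.long).
lv <- c("A", "B", "D", "E")

# %in% is a function object and can be called in prefix form with backticks.
is.function(`%in%`)                                          # TRUE
identical(lv %in% LETTERS[1:3], `%in%`(lv, LETTERS[1:3]))    # TRUE

# grep() returns the matching positions, which also work as an index.
grep("[A-C]", lv)                                            # 1 2

# What has to be typed today, with the object named twice:
f <- factor(lv)
levels(f)[grep("[A-C]", levels(f))] <- "Z"
```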

However, given the long history of S/R, there must be logical or
philosophical obstacles to this...?
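
Something close to this can be approximated in current R with a user-defined replacement function. The sketch below is only an illustration: the name 'at_self<-' and its 'test' argument are hypothetical, not part of base R, and the test is passed as a function of the indexed object rather than through a special symbol.

```r
# Hypothetical replacement function: 'test' is a function applied to the
# object being indexed, and its result is used as the index.
`at_self<-` <- function(x, test, value) {
  x[test(x)] <- value
  x
}

# Plain vector: set negative entries to NA without naming 'a' twice.
a <- c(-1, 2, -3)
at_self(a, function(self) self < 0) <- NA

# Nested inside levels(): relabel levels A to C as "Z".
# Assigning duplicate level names merges those levels.
f <- factor(c("A", "B", "D", "E"))
at_self(levels(f), function(self) self %in% LETTERS[1:3]) <- "Z"
```

This relies on R's existing rules for nested replacement calls, so no new syntax is needed, at the cost of wrapping the condition in a function.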

best,

Kingsford Jones

>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Christian Raschke
>> Sent: Tuesday, 20 July 2010 9:16 AM
>> To: r-help at r-project.org
>> Subject: [R] Indexing by logical vectors
>>
>> Dear R-Listers,
>>
>> My question concerns indexing vectors by logical vectors that are based
>> on the original vector. Consider the following simple example to
>> hopefully make clear what I mean:
>>
>> a <- rnorm(10)
>> a[a<0] <- NA
>>
>> However, I am now working with multiple data frames that I received,
>> where each of them has nicely descriptive, yet long names(). In my
>> scripts there are many instances where operations similar to the one
>> above are required. Again a simple example:
>>
>>
>> some.data.frame <- data.frame(some.long.variable.name=rnorm(10),
>> some.other.long.variable.name=rnorm(10))
>>
>> some.data.frame$some.other.long.variable.name[some.data.frame$some.other.long.variable.name
>> < 0] <- NA
>>
>>
>> The fact that the names are so long makes things not very readable in
>> the script and hard to debug. Is there a way in R to refer to the "self"
>> of whatever is being indexed? I am looking for something like
>>
>> some.data.frame$some.other.long.variable.name[.self < 0] <- NA
>>
>> that would accomplish the same result as above. Or is there another
>> concise, but less messy way to do this? I prefer not attaching the
>> data.frames and partial matching makes things even more messy since many
>> names() are very similar. I know I could just rename everything, but I'd
>> like to learn if there is an easy or obvious way to do this in R that I
>> have missed so far.
>>
>> I would appreciate any advice, and I apologize if this topic has been
>> discussed before.
>>
>>
>>  > sessionInfo()
>> R version 2.11.0 (2010-04-22)
>> x86_64-redhat-linux-gnu
>>
>> locale:
>>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>   [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>>   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>>


