[R] Searching for specific values in a matrix
Steve Lianoglou
mailinglist.honeypot at gmail.com
Tue Jul 21 21:49:23 CEST 2009
On Jul 21, 2009, at 3:27 PM, Mehdi Khan wrote:
> I understand your explanation about the test for even numbers.
> However I am still a bit confused as to how to go about finding a
> particular value. Here is an example data set
>
> col # attr1 attr2 attr 3 LON LAT
> 17209 D NA NA -122.9409 38.27645
> 17210 BC NA NA -122.9581 38.36304
> 17211 B NA NA -123.6851 41.67121
> 17212 BC NA NA -123.0724 38.93073
> 17213 C NA NA -123.7240 41.84403
> 17214 <NA> 464 NA -122.9430 38.30988
> 17215 C NA NA -123.4442 40.65369
> 17216 BC NA NA -122.9389 38.31551
> 17217 C NA NA -123.0747 38.97998
> 17218 C NA NA -123.6580 41.59610
> 17219 C NA NA -123.4513 40.70992
> 17220 C NA NA -123.0901 39.06473
> 17221 BC NA NA -123.0653 38.94845
> 17222 BC NA NA -122.9464 38.36808
> 17223 <NA> 464 NA -123.0143 38.70205
> 17224 <NA> NA 5 -122.8609 37.94137
> 17225 <NA> NA 5 -122.8628 37.95057
> 17226 <NA> NA 7 -122.8646 37.95978
For future reference, perhaps paste this in a way that's easy for us
to paste into a running R session so we can use it, like so:
df <- data.frame(
  coln  = c(17209, 17210, 17211, 17212, 17213, 17214, 17215, 17216, 17217,
            17218, 17219, 17220, 17221, 17222, 17223, 17224, 17225, 17226),
  attr1 = c("D", "BC", "B", "BC", "C", NA, "C", "BC", "C", "C", "C", "C",
            "BC", "BC", NA, NA, NA, NA),
  attr2 = c(NA, NA, NA, NA, NA, 464, NA, NA, NA, NA, NA, NA, NA, NA, 464,
            NA, NA, NA),
  attr3 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
            5, 5, 7),
  LON   = c(-122.9409, -122.9581, -123.6851, -123.0724, -123.7240, -122.9430,
            -123.4442, -122.9389, -123.0747, -123.6580, -123.4513, -123.0901,
            -123.0653, -122.9464, -123.0143, -122.8609, -122.8628, -122.8646),
  LAT   = c(38.27645, 38.36304, 41.67121, 38.93073, 41.84403, 38.30988,
            40.65369, 38.31551, 38.97998, 41.59610, 40.70992, 39.06473,
            38.94845, 38.36808, 38.70205, 37.94137, 37.95057, 37.95978))
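One way to produce a pasteable form like this without typing it by hand is dput(), which prints R code that rebuilds an object. A minimal sketch on a two-row stand-in (the name `small` and its contents are made up for illustration):

```r
# Two-row stand-in for the real data; `small` is a hypothetical name.
small <- data.frame(coln = c(17209, 17210), attr1 = c("D", "BC"),
                    stringsAsFactors = FALSE)

# dput() prints R code that recreates the object; this is what you paste
# into the email.
dput(small)

# Simulate the reader pasting that text back into their own session.
txt <- paste(capture.output(dput(small)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
```

For a large data frame, something like dput(head(df, 20)) keeps the message short.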
> If I wanted to find the row with Lat = 37.95978
Using an "indexing vector":
R> lats <- df$LAT == 37.95978
# or with the %~% from before:
# lats <- df$LAT %~% 37.95978
R> df[lats,]
coln attr1 attr2 attr3 LON LAT
18 17226 <NA> NA 7 -122.8646 37.95978
Using the "subset" function:
R> subset(df, LAT == 37.95978)
coln attr1 attr2 attr3 LON LAT
18 17226 <NA> NA 7 -122.8646 37.95978
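The exact `==` test works here because 37.95978 was typed literally in both places. A latitude that comes out of a calculation can differ in its last bits, and then `==` silently finds nothing. A minimal sketch of a tolerance-based comparison follows; `near` is a hypothetical helper (it is not the `%~%` operator mentioned above, whose definition is not shown in this message), and the tolerance of 1e-8 is an arbitrary assumption:

```r
# Hypothetical helper: TRUE where x is within `tol` of `target`.
near <- function(x, target, tol = 1e-8) abs(x - target) < tol

lat <- c(38.27645, 37.95057, 37.95978)

# Stand-in for a value that came out of arithmetic rather than being typed in.
computed <- 37.95978 + 1e-12

exact <- lat == computed            # exact comparison misses the match
hits  <- which(near(lat, computed)) # tolerance comparison finds it
```

With the real data the same idea would read df[near(df$LAT, 37.95978), ].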
> , how would i do that? How would I find the rows with BC?
R> subset(df, attr1 == 'BC')
coln attr1 attr2 attr3 LON LAT
2 17210 BC NA NA -122.9581 38.36304
4 17212 BC NA NA -123.0724 38.93073
8 17216 BC NA NA -122.9389 38.31551
13 17221 BC NA NA -123.0653 38.94845
14 17222 BC NA NA -122.9464 38.36808
If you try with an "indexing vector" the NAs will trip you up, because
NA == 'BC' is NA and every NA in a logical index returns a row of NAs:
R> df[df$attr1 == 'BC',]
coln attr1 attr2 attr3 LON LAT
2 17210 BC NA NA -122.9581 38.36304
4 17212 BC NA NA -123.0724 38.93073
NA NA <NA> NA NA NA NA
8 17216 BC NA NA -122.9389 38.31551
13 17221 BC NA NA -123.0653 38.94845
14 17222 BC NA NA -122.9464 38.36808
NA.1 NA <NA> NA NA NA NA
NA.2 NA <NA> NA NA NA NA
NA.3 NA <NA> NA NA NA NA
NA.4 NA <NA> NA NA NA NA
So you could do something like:
R> df[df$attr1 == 'BC' & !is.na(df$attr1),]
coln attr1 attr2 attr3 LON LAT
2 17210 BC NA NA -122.9581 38.36304
4 17212 BC NA NA -123.0724 38.93073
8 17216 BC NA NA -122.9389 38.31551
13 17221 BC NA NA -123.0653 38.94845
14 17222 BC NA NA -122.9464 38.36808
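Two other base R idioms avoid the NA rows without the explicit is.na() test. This is only a sketch on a small made-up data frame (`d` and its columns are assumptions for illustration):

```r
# Small made-up data frame with a missing value in attr1.
d <- data.frame(id = 1:4, attr1 = c("BC", NA, "C", "BC"),
                stringsAsFactors = FALSE)

# The raw comparison is NA wherever attr1 is missing.
cmp <- d$attr1 == "BC"

# which() keeps only the positions that are TRUE, dropping FALSE and NA.
by_which <- d[which(d$attr1 == "BC"), ]

# %in% never returns NA; a missing value simply does not match "BC".
by_in <- d[d$attr1 %in% "BC", ]
```

One caveat with which(): if nothing matches it returns an empty index, which is fine for selecting rows but does the wrong thing if you negate it to drop rows.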
HTH,
-steve
--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact