[R] Subsetting a data frame with multiple values and exclusions.

natalie.vanzuydam nvanzuydam at gmail.com
Wed Oct 5 17:53:20 CEST 2011


Hi all,

I realise that the convention is to provide a working example of my problem,
but the data are of a sensitive nature, so I'm not able to do that in this
case.

I need to query a database for multiple search terms:

db <- structure(list(ind = c("ind1", "ind2", "ind3", "ind4"),
                     test1 = c(1, 2, 1.3, 3),
                     test2 = c(56L, 27L, 58L, 2L),
                     test3 = c(1.1, 28, 9, 1.2)),
                .Names = c("ind", "test1", "test2", "test3"),
                class = "data.frame", row.names = c(NA, -4L))

terms_include <- c("1","2","3")
terms_exclude <- c("1.1","1.2","1.3")

So I need to write a loop in which each value in terms_include is searched
for across the entire data frame.  I thought of using apply with grepl and
subset?  At the same time, if a value from terms_include occurs in the same
row as any value from terms_exclude, then that row must be excluded from the
output data frame.
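
One way the rule above could be sketched is shown here. It rests on two
assumptions that are not stated in the question: cells are compared for
exact equality with the search terms (grepl would also match "1" inside
"1.1"), and every column except ind is searched.  The names row_has_any,
search_cols, keep and result are made up for the illustration.

```r
# Toy data from the post, rebuilt with data.frame() for readability
db <- data.frame(
  ind   = c("ind1", "ind2", "ind3", "ind4"),
  test1 = c(1, 2, 1.3, 3),
  test2 = c(56L, 27L, 58L, 2L),
  test3 = c(1.1, 28, 9, 1.2),
  stringsAsFactors = FALSE
)
terms_include <- c("1", "2", "3")
terms_exclude <- c("1.1", "1.2", "1.3")

# Assumption: search every column except the identifier column
search_cols <- setdiff(names(db), "ind")

# Hypothetical helper: TRUE for each row in which at least one searched
# cell, converted to character, is exactly equal to one of `terms`
row_has_any <- function(df, cols, terms) {
  Reduce(`|`, lapply(df[cols], function(col) as.character(col) %in% terms))
}

# Keep rows with an included value and without any excluded value
keep <- row_has_any(db, search_cols, terms_include) &
  !row_has_any(db, search_cols, terms_exclude)
result <- db[keep, ]
print(result)
```

On the toy data only ind2 survives: ind1 and ind4 contain an included value
but also an excluded one, and ind3 contains no included value.  Because the
term vectors are passed to %in% whole, no explicit loop over the individual
search terms is needed, however many there are.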

I'm not sure where to even begin.  I've only used subset in very basic
ways.  The real data frame is much larger, and there are many more search
terms than are presented here, so I would really need to be able to loop
over the data frame successively and return a final df that has my searched
values in at least one of the columns.

Your help and assistance are much appreciated,
Natalie



-----
Natalie Van Zuydam

PhD Student
University of Dundee
nvanzuydam at dundee.ac.uk