[R] Subset a list

Marc Schwartz (via MN) mschwartz at mn.rr.com
Tue May 23 00:10:03 CEST 2006


On Mon, 2006-05-22 at 17:55 -0400, Doran, Harold wrote:
> I have a data frame of ~200 columns and ~20,000 rows where each column
> consists of binary responses (0,1) and a 9 for missing data. I am
> interested in finding the columns for which there are fewer than 100
> individuals with responses of 0. 
> 
> I can use an apply function to generate a table for each column, but I'm
> not certain whether I can subset a list based on some criterion as
> subset() is designed for vectors, matrices or dataframes.
> 
> For example, I can use the following:
> tt <- apply(data, 2, table)
> 
> Which returns an object of class list. Here is some sample output from
> tt
> 
> $R0235940b
> 
>     0     1     9 
>  2004  1076 15361 
> 
> $R0000710a
> 
>     0     9 
>     2 18439 
> 
> $R0000710b
> 
>     0     1     9 
>  3333  3941 11167 
> 
> tt$R0000710a meets my criteria and I would want to be able to easily
> find this instead of rolling through the entire output. Is there a way
> to subset this list to identify the columns which meet the criteria I
> note above?
> 
> 
> Thanks,
> Harold

Harold,

How about this:

> DF
   V1 V2 V3 V4 V5
1   0  1  0  1  0
2   0  0  1  0  1
3   0  0  1  1  0
4   1  1  0  0  1
5   1  1  1  1  0
6   0  1  0  1  1
7   0  1  1  1  0
8   0  1  0  0  0
9   0  0  1  1  0
10  1  0  0  1  1

# Find the columns with fewer than 5 zeros
> which(sapply(DF, function(x) sum(x == 0)) < 5)
V2 V4
 2  4


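So in your case, just replace DF with the name of your data frame and
the 5 with 100. Since your missing values are coded as 9 rather than NA,
sum(x == 0) will not return NA for any column.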
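
If you would rather subset the list of tables itself, as in the original
question, something along these lines should work. This is only a
sketch: it rebuilds the small DF shown above, and the names tt, n_zero
and few_zeros are just illustrative. It uses lapply() rather than
apply() so that the result is a list whatever the columns contain, and
it counts a table with no "0" entry as having zero 0's.

```r
# The example data frame from above, rebuilt column by column
DF <- data.frame(
  V1 = c(0, 0, 0, 1, 1, 0, 0, 0, 0, 1),
  V2 = c(1, 0, 0, 1, 1, 1, 1, 1, 0, 0),
  V3 = c(0, 1, 1, 0, 1, 0, 1, 0, 1, 0),
  V4 = c(1, 0, 1, 0, 1, 1, 1, 0, 1, 1),
  V5 = c(0, 1, 0, 1, 0, 1, 0, 0, 0, 1)
)

# One frequency table per column, always returned as a list
tt <- lapply(DF, table)

# Number of 0's recorded in each table; summing an empty selection
# gives 0, so a column with no 0's at all is handled as well
n_zero <- sapply(tt, function(tab) sum(tab[names(tab) == "0"]))

# Keep only the tables with fewer than 5 zeros
few_zeros <- tt[n_zero < 5]
names(few_zeros)

# Vectorized alternative working on the data frame directly
names(which(colSums(DF == 0) < 5))
```

Both of the last two lines should give "V2" "V4" for this DF, matching
the which() result above. For the real data, replace the 5 with 100.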

HTH,

Marc Schwartz


