[R] problems subsetting

Martin Tomko martin.tomko at geo.uzh.ch
Thu Nov 18 15:39:42 CET 2010


Dear all,
I have searched the forums for an answer, and there are plenty of 
questions along the same lines, but none of the approaches shown worked 
for my problem:

I have a data frame that I get from a csv:

summarystats<-as.data.frame(read.csv(file=f_summary));

where I have the columns Dataset, Class, Type, Category, ...
Problem1: I want to find a subset of this data frame, based on values in 
multiple columns.
What I do currently is:

subset1 <- summarystats
subset1<-subset1[subset1$Class == 1,]
subset1<-subset1[subset1$Type == 1,]
subset1<-subset1[subset1$Category == 1,]

Now, this works, but is UGLY! I tried using "&&" or "&", for instance:
subset1<-subset1[ (subset1$Class == 1)&& (subset1$Category == 1),]
but it returns an empty data frame.
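
A minimal sketch of combining the three conditions in one step, assuming a toy data frame with the same column names (the values below are made up for illustration and are not from the real csv). It relies on "&" being the element-wise operator, which yields one TRUE/FALSE per row, whereas "&&" is meant for a single logical value (and raises an error for longer vectors from R 4.3.0 on), so it is not suitable for building a row index:

```r
# Toy data standing in for summarystats (hypothetical values)
summarystats <- data.frame(
  Dataset  = c("a", "b", "c", "d"),
  Class    = c(1, 1, 2, 1),
  Type     = c(1, 2, 1, 1),
  Category = c(1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# `&` works element by element: one logical value per row
keep <- summarystats$Class == 1 &
  summarystats$Type == 1 &
  summarystats$Category == 1
subset1 <- summarystats[keep, ]

# subset() evaluates the same condition inside the data frame
subset1b <- subset(summarystats, Class == 1 & Type == 1 & Category == 1)
```

Both forms keep the same rows here. They differ when a column contains NA: subset() drops rows where the condition is NA, while logical indexing with "[" returns a row of NAs for them.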

Anyway, the main problem is
Problem2:
I have a second data frame - a square matrix (rownames == colnames), distm:

distm<-read.table(file=f_simmatrix, sep = ",");
What I want is to select ONLY the rows and columns whose names match the 
Dataset values in subset1 above:

subset2<-distm[subset1$Dataset,subset1$Dataset]

This returns a matrix of the correct size, but with incorrect entries 
(established by visual inspection).

this is the same as:
selectedrows<-as.vector(subset1$Dataset)
subset2<-distm[selectedrows,selectedrows]

I also checked the row names of the result using:
rownames(subset2)%in% selectedrows
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
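
One thing that may be worth ruling out, as an assumption that nothing in this post confirms: the index might be used by position rather than by name. In R, a factor index is applied through its integer codes and a numeric index through its values, so both select by position; only a character index is matched against the row and column names. The sketch below uses a small hypothetical matrix with made-up dataset IDs as dimnames, not the real distm:

```r
# Toy square matrix standing in for distm; the IDs are hypothetical
ids <- c("a", "b", "c", "d")
distm <- matrix(1:16, nrow = 4, dimnames = list(ids, ids))

# A factor index is used via its integer codes (here d -> 2, b -> 1),
# so it picks rows and columns by position, not by label
sel_factor <- factor(c("d", "b"), levels = c("b", "d"))
by_code <- distm[sel_factor, sel_factor]

# A character index is matched against the dimnames
sel_names <- as.character(sel_factor)
by_name <- distm[sel_names, sel_names]

rownames(by_code)  # rows 2 and 1 of distm
rownames(by_name)  # the rows actually named "d" and "b"
```

Running str(subset1$Dataset) and head(rownames(distm)) on the real objects would show whether the two actually hold the same kind of values.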

What am I missing?

Thanks
Martin


