[R] problems subsetting
David Winsemius
dwinsemius at comcast.net
Thu Nov 18 18:12:10 CET 2010
On Nov 18, 2010, at 10:42 AM, David Winsemius wrote:
>
> On Nov 18, 2010, at 10:25 AM, Martin Tomko wrote:
>
>> Hi Gerrit,
>> indeed, that works. Excellent tip!
>>
>> For reference, I did this:
>>
>> subset1<-subset(summarystats,(Type==1)&(Class==1)&(Category==1))
>>
>> I am still not totally sure when one uses "&" and when "&&" - I
>> was under the impression that && stands for logical AND....
>
> Both stand for logical AND. "&" is used for vectorized comparisons,
> while "&&" will only compare the first elements of the two sides
> (usually, but apparently not always) with a warning if there are
> longer objects than expected.
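To make the vectorized case concrete for the row subsetting discussed in this thread, here is a minimal sketch. The data frame is made up for illustration (only the column names come from the original question); the point is that "&" gives one TRUE/FALSE per row, which is what both "[" and subset() expect.

```r
# Hypothetical stand-in for summarystats; the values are invented.
summarystats <- data.frame(
  Dataset  = c("a", "b", "c", "d"),
  Class    = c(1, 1, 2, 1),
  Type     = c(1, 2, 1, 1),
  Category = c(1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# "&" works element by element, giving one logical value per row.
keep <- (summarystats$Class == 1) & (summarystats$Type == 1) &
  (summarystats$Category == 1)

# Two ways to apply the same row filter.
by_index  <- summarystats[keep, ]
by_subset <- subset(summarystats, (Class == 1) & (Type == 1) & (Category == 1))
```

Replacing "&" with "&&" here would collapse the test to a single value at best, so it cannot select individual rows.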
A little bird (actually more like an eagle in these parts) has
suggested that I mention that the reason for two different types of
logical operators is not just to confuse the unwary, but rather
because the "&&"/"||" versions will not evaluate their second argument
when the first argument already determines the result. Since this form
is mostly used within the if( ... && ... ){} else{} construction, there
can be increased efficiency when the second argument is an involved
function: it won't need to be evaluated if the first argument to "&&"
is FALSE or the first to "||" is TRUE.
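A small sketch of that short-circuit behaviour, in case it helps. The function expensive_check() and the counter are hypothetical names invented for this illustration; the counter only records how often the right-hand side actually runs.

```r
# Hypothetical "expensive" check; the counter records how often it runs.
calls <- 0
expensive_check <- function() {
  calls <<- calls + 1
  TRUE
}

r1 <- FALSE && expensive_check()  # right-hand side is skipped
r2 <- TRUE  || expensive_check()  # right-hand side is skipped
calls_after_short_circuit <- calls

r3 <- TRUE && expensive_check()   # right-hand side is needed, so it runs

# A common use: guard a test that would misbehave on NULL input.
x <- NULL
ok <- !is.null(x) && x > 0        # "x > 0" is never evaluated here
```

The last two lines show the other practical benefit besides speed: the first operand can protect the second from input it cannot handle.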
--
David.
>
> > c(1,0,1,0,1) & c(0,0,1,1,-1)
> [1] FALSE FALSE TRUE FALSE TRUE
>
> > c(1,0,1,0,1) && c(0,0,1,1,-1)
> [1] FALSE
>
> > c(1,0,1,0,1) && c(1,0,1,1,-1)
> [1] TRUE
>
> --
> David.
>
>>
>> Thanks a lot.
>>
>>
>> Martin
>>
>> On 11/18/2010 3:58 PM, Gerrit Eichner wrote:
>>> Hello, Martin,
>>>
>>> as to your first problem, look at function subset(), and
>>> particularly at its argument "subset".
>>>
>>> HTH,
>>>
>>> Gerrit
>>>
>>>
>>> On Thu, 18 Nov 2010, Martin Tomko wrote:
>>>
>>>> Dear all,
>>>> I have searched the forums for an answer - and there are plenty of
>>>> questions along the same lines - but none of the approaches shown
>>>> worked for my problem:
>>>>
>>>> I have a data frame that I get from a csv:
>>>>
>>>> summarystats<-as.data.frame(read.csv(file=f_summary));
>>>>
>>>> where I have the columns Dataset, Class, Type, Category,..
>>>> Problem1: I want to find a subset of this frame, based on values
>>>> in multiple columns
>>>> What I do currently is:
>>>>
>>>> subset1 <- summarystats
>>>> subset1<-subset1[subset1$Class == 1,]
>>>> subset1<-subset1[subset1$Type == 1,]
>>>> subset1<-subset1[subset1$Category == 1,]
>>>>
>>>> Now, this works, but is UGLY! I tried using "&&" or "&", for
>>>> instance: subset1<-subset1[ (subset1$Class == 1)&&
>>>> (subset1$Category == 1),]
>>>> but it returns an empty data frame.
>>>>
>>>> Anyway, the main problem is
>>>> Problem2:
>>>> I have a second data frame - a square matrix (rownames ==
>>>> colnames), distm:
>>>>
>>>> distm<-read.table(file=f_simmatrix, sep = ",");
>>>> what I want is to select ONLY the row and column entries matching
>>>> the above subset1:
>>>>
>>>> subset2<-distm[subset1$Dataset,subset1$Dataset] returns a matrix
>>>> of correct size, but with incorrect entries (established by
>>>> visual inspection).
>>>>
>>>> this is the same as:
>>>> selectedrows<-as.vector(subset1$Dataset)
>>>> subset2<-distm[selectedrows,selectedrows]
>>>>
>>>> also verified using:
>>>> rownames(subset2)%in% selectedrows
>>>> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>>>> FALSE FALSE
>>>> [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>>>> FALSE FALSE
>>>> [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>>>> FALSE FALSE
>>>> [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>>>>
>>>> What am I missing?
>>>>
>>>> Thanks
>>>> Martin
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> AOR Dr. Gerrit Eichner                   Mathematical Institute, Room 212
>>> gerrit.eichner at math.uni-giessen.de    Justus-Liebig-University Giessen
>>> Tel: +49-(0)641-99-32104                 Arndtstr. 2, 35392 Giessen, Germany
>>> Fax: +49-(0)641-99-32109                 http://www.uni-giessen.de/cms/eichner
>>> ---------------------------------------------------------------------
>>>
>>
>>
>> --
>> Martin Tomko
>> Postdoctoral Research Assistant
>>
>> Geographic Information Systems Division
>> Department of Geography
>> University of Zurich - Irchel
>> Winterthurerstr. 190
>> CH-8057 Zurich, Switzerland
>>
>> email: martin.tomko at geo.uzh.ch
>> site: http://www.geo.uzh.ch/~mtomko
>> mob: +41-788 629 558
>> tel: +41-44-6355256
>> fax: +41-44-6356848
>>
>
> David Winsemius, MD
> West Hartford, CT
>
David Winsemius, MD
West Hartford, CT