Hello Jeff,
thanks a lot for your help. It seems to work well now.
Greetings
Birgit
On 28.09.2007 at 20:33, Jeffrey Robert Spies wrote:
> Hi Birgit,
>
> I've updated the recipe here, including a change to the
> dissimilarity function (making it more efficient):
>
> http://www.r-cookbook.com/node/40
>
> You'll notice the change is:
>
> dissimilar <- function(tRow){
> (sum(tRow==FALSE, na.rm=TRUE) + sum(is.na(tRow)))/length(tRow)
> }
>
> It's actually about 40% faster to use sums instead of sub-setting
> the lists and using lengths (but the speed increase will only be
> noticeable on very, very, very long lists).
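The speed claim above can be checked locally with a minimal sketch like the following. The helper names count_by_subset and count_by_sum are made up for illustration, the vector length and seed are arbitrary, and timings will differ between machines and R versions. Both helpers return only the count; the updated recipe additionally divides by length(tRow).

```r
# Original approach: subset, then take the length.
# Indexing with NA keeps an NA element, so NAs are counted too.
count_by_subset <- function(tRow) length(tRow[tRow == FALSE])

# Updated approach: sum the FALSEs and the NAs separately.
count_by_sum <- function(tRow) sum(tRow == FALSE, na.rm = TRUE) + sum(is.na(tRow))

set.seed(1)  # arbitrary seed, only for reproducibility
x <- sample(c(TRUE, FALSE, NA), 1e6, replace = TRUE)

# Both approaches should agree on the count
stopifnot(count_by_subset(x) == count_by_sum(x))

# Rough timing comparison over a few repetitions
system.time(for (i in 1:20) count_by_subset(x))
system.time(for (i in 1:20) count_by_sum(x))
```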
>
> --Jeff.
>
> On Sep 28, 2007, at 12:47 PM, Birgit Lemcke wrote:
>
>> Thanks a lot for both solutions to my problem.
>>
>> I tried them immediately and I understand how they work.
>>
>> The next problem for me is how to deal with the NAs. Would it be
>> possible to exclude a variable from the row comparison if one of
>> the two rows has an NA for it?
>> Furthermore, it would then be useful to divide the resulting number
>> by the number of variables used in the comparison, to get back a
>> number between 0 and 1.
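One possible reading of this request, as a minimal sketch rather than the recipe's own method: drop the NA entries of each compared row, then divide the number of mismatches by the number of entries that were actually used. The function name dissimilar_excl_na and the choice to return NA when nothing is left to compare are assumptions; the data are the five example rows from this thread, typed in as matrices.

```r
# Hypothetical helper (not from the thread): proportion of mismatches
# among the entries of a comparison row that are not NA.
dissimilar_excl_na <- function(tRow) {
  used <- !is.na(tRow)               # variables available in both rows
  if (!any(used)) return(NA_real_)   # assumption: nothing to compare gives NA
  sum(!tRow[used]) / sum(used)       # mismatches / variables used
}

# The five example rows from this thread
mal <- matrix(c(0, 0, 0, 0, 0, 1, 0, 0, 0,
                0, 0, 0, 0, 0, 1, 0, 0, 0,
                0, 0, 0, 0, 0, 1, 0, 0, 0,
                rep(NA, 9),
                0, 1, 0, 0, 0, 1, 0, 0, 0), nrow = 5, byrow = TRUE)
fem <- matrix(c(1, 1, 0, 0, 0, 0, 0, 0, 0,
                0, 1, 0, 0, 1, 1, 0, 0, 0,
                1, 0, 0, 1, 0, 0, 0, 0, 0,
                0, 1, 0, 0, 1, 0, 0, 0, 0,
                0, 1, 0, 0, 0, 0, 0, 0, 0), nrow = 5, byrow = TRUE)

comparison <- mal == fem             # NA wherever either entry is NA
dissimilarity <- apply(comparison, 1, dissimilar_excl_na)
dissimilarity
```

With this definition every result lies between 0 and 1, except for pairs such as row 4 where no variable can be compared, which come back as NA.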
>>
>> Unfortunately I am able to understand what happens if somebody
>> gives me the code but I am not able at the moment to write it by
>> myself. I hope this will change by and by.
>>
>> So I would be very pleased if you could help me once again.
>>
>> Greetings
>>
>> Birgit
>>
>>
>> On 28.09.2007 at 18:25, Jeffrey Robert Spies wrote:
>>
>>> Not sure how you want to handle the NAs, but you could try the
>>> following:
>>>
>>> #start
>>> MalVar29_37 <- read.table(textConnection("V1 V2 V3 V4 V5 V6 V7 V8 V9
>>> 0 0 0 0 0 1 0 0 0
>>> 0 0 0 0 0 1 0 0 0
>>> 0 0 0 0 0 1 0 0 0
>>> NA NA NA NA NA NA NA NA NA
>>> 0 1 0 0 0 1 0 0 0"), header=TRUE)
>>>
>>> FemVar29_37 <- read.table(textConnection("V1 V2 V3 V4 V5 V6 V7 V8 V9
>>> 1 1 0 0 0 0 0 0 0
>>> 0 1 0 0 1 1 0 0 0
>>> 1 0 0 1 0 0 0 0 0
>>> 0 1 0 0 1 0 0 0 0
>>> 0 1 0 0 0 0 0 0 0"), header=TRUE)
>>>
>>> comparison <- MalVar29_37 == FemVar29_37
>>>
>>> dissimilar <- function(tRow){
>>> length(tRow[tRow==FALSE])
>>> }
>>>
>>> dissimilarity <- apply(comparison, c(1), dissimilar)
>>> dissimilarity
>>> # finish
>>>
>>> The variable comparison holds an entry-by-entry comparison, resulting
>>> in values of TRUE or FALSE. I've defined a function, dissimilar, that
>>> returns the number of FALSEs in a given object (tRow). The variable
>>> dissimilarity is then the result of applying dissimilar to each row
>>> of comparison. In this example, 0 means all of the entries in a row
>>> matched, 9 means none of them matched. You can see the solution here
>>> in recipe form: http://www.r-cookbook.com/node/40
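The same row-wise count can also be sketched without apply, using rowSums on the logical matrix. This is an alternative written for illustration, not the recipe's code; the matrices mal and fem are the example rows typed in by hand. Like the subsetting version, it counts an NA entry as a mismatch, so the all-NA row gives 9.

```r
# The five example rows from this thread
mal <- matrix(c(0, 0, 0, 0, 0, 1, 0, 0, 0,
                0, 0, 0, 0, 0, 1, 0, 0, 0,
                0, 0, 0, 0, 0, 1, 0, 0, 0,
                rep(NA, 9),
                0, 1, 0, 0, 0, 1, 0, 0, 0), nrow = 5, byrow = TRUE)
fem <- matrix(c(1, 1, 0, 0, 0, 0, 0, 0, 0,
                0, 1, 0, 0, 1, 1, 0, 0, 0,
                1, 0, 0, 1, 0, 0, 0, 0, 0,
                0, 1, 0, 0, 1, 0, 0, 0, 0,
                0, 1, 0, 0, 0, 0, 0, 0, 0), nrow = 5, byrow = TRUE)

comparison <- mal == fem

# An entry counts as dissimilar if it is FALSE or NA
dissimilarity <- rowSums(!comparison | is.na(comparison))
dissimilarity  # 3 2 3 9 1
```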
>>>
>>> Hope this helps,
>>>
>>> Jeff.
>>>
>>> On Sep 28, 2007, at 11:13 AM, Birgit Lemcke wrote:
>>>
>>>> Hello!
>>>>
>>>> I am an R beginner and I have a question about simple matching.
>>>>
>>>> I have two datasets that I read in with:
>>>>
>>>> MalVar29_37<-read.table("MalVar29_37.csv", sep = ";")
>>>> FemVar29_37<-read.table("FemVar29_37.csv", sep = ";")
>>>>
>>>> They look like this and show binary variables:
>>>>
>>>> V1 V2 V3 V4 V5 V6 V7 V8 V9
>>>> 1 0 0 0 0 0 1 0 0 0
>>>> 2 0 0 0 0 0 1 0 0 0
>>>> 3 0 0 0 0 0 1 0 0 0
>>>> 4 NA NA NA NA NA NA NA NA NA
>>>> 5 0 1 0 0 0 1 0 0 0
>>>>
>>>> V1 V2 V3 V4 V5 V6 V7 V8 V9
>>>> 1 1 1 0 0 0 0 0 0 0
>>>> 2 0 1 0 0 1 1 0 0 0
>>>> 3 1 0 0 1 0 0 0 0 0
>>>> 4 0 1 0 0 1 0 0 0 0
>>>> 5 0 1 0 0 0 0 0 0 0
>>>>
>>>> each with 348 rows.
>>>>
>>>> I would like to perform a simple matching, but only row 1 compared
>>>> to row 1, row 2 compared to row 2 (paired), and so on, giving back
>>>> a number as the dissimilarity for each comparison.
>>>>
>>>> How can I do that?
>>>>
>>>> Thanks in advance
>>>>
>>>> Birgit
>>>>
>>>>
>>>>
>>>>
>>>> Birgit Lemcke
>>>> Institut für Systematische Botanik
>>>> Zollikerstrasse 107
>>>> CH-8008 Zürich
>>>> Switzerland
>>>> Ph: +41 (0)44 634 8351
>>>> birgit.lemcke@systbot.uzh.ch
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>