Hello James,
first, thank you for your help, but there are some things I
don't understand yet.
I am not sure I understand what this example returns:
ratings <- data.frame(id = c(1,2,3,4), att1 = c(1,1,0,1),
                      att2 = c(1,0,0,1), att3 = c(0,1,1,1))
ratings
  id att1 att2 att3
1  1    1    1    0
2  2    1    0    1
3  3    0    0    1
4  4    1    1    1
tab <- crossprod(as.matrix(ratings[,-1]))
tab <- tab - diag(diag(tab))
tab
     att1 att2 att3
att1    0    2    2
att2    2    0    1
att3    2    1    0
As I understand it, this gives the number of times we find the same
value, for example when comparing att1 and att2 across all ids. Is that right?
What does this line do: tab <- tab - diag(diag(tab))
And what does the original output of crossprod mean:
     att1 att2 att3
att1    3    2    2
att2    2    2    1
att3    2    1    3
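For reference, a minimal sketch of what crossprod computes on 0/1 data, using the ratings example above. For a 0/1 matrix X, crossprod(X) is t(X) %*% X, so each off-diagonal entry counts the ids where both attributes are 1 (which is not the same as counting the ids where the two values are equal), and the diagonal holds the number of ones per attribute. Subtracting diag(diag(tab)) sets that diagonal to zero. The names X, both_one, same_value and tab0 are illustrative only.

```r
ratings <- data.frame(id = c(1,2,3,4), att1 = c(1,1,0,1),
                      att2 = c(1,0,0,1), att3 = c(0,1,1,1))
X <- as.matrix(ratings[, -1])        # drop the id column

tab <- crossprod(X)                  # same result as t(X) %*% X

# Off-diagonal entry: number of ids where BOTH attributes are 1
both_one <- sum(X[, "att1"] == 1 & X[, "att2"] == 1)

# A different quantity: number of ids where the two values are equal
# (counts 1/1 pairs and 0/0 pairs)
same_value <- sum(X[, "att1"] == X[, "att2"])

# Diagonal entries: number of ones in each attribute column
ones_per_att <- colSums(X)

# diag(tab) extracts the diagonal as a vector; diag() of that vector
# builds a diagonal matrix, so the subtraction zeroes the diagonal
tab0 <- tab - diag(diag(tab))
tab0
```

Here tab["att1", "att2"] equals both_one, while same_value also includes the ids where both attributes are 0.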
I tried to do this with part of my dataset.
I used a table with 3 variables (binary only).
The first part of the table holds the females (348 rows) and
the second part the males (also 348 rows).
Then I tried this:
CrossFemMal1_3<-crossprod(as.matrix(CrossFemMalVar1_3))
The output:
CrossFemMal1_3
   V1 V2 V3
V1 NA NA NA
V2 NA NA NA
V3 NA NA NA
There was one row of NAs in my dataset. I presume this is responsible
for the NA results? How can I deal with NAs here?
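One possible way to handle this, sketched on made-up data: crossprod propagates NA through its sums, so a row of NAs turns every entry of the result into NA. Dropping incomplete rows with complete.cases() before the call avoids that. The data frame dat below is hypothetical and only stands in for CrossFemMalVar1_3; whether dropping the row is appropriate depends on why the values are missing.

```r
# Hypothetical binary data with one all-NA row (row 3)
dat <- data.frame(V1 = c(1, 0, NA, 1),
                  V2 = c(1, 1, NA, 0),
                  V3 = c(0, 1, NA, 1))
X <- as.matrix(dat)

with_na <- crossprod(X)              # every entry is NA

keep <- complete.cases(X)            # TRUE for rows without any NA
X_ok <- X[keep, , drop = FALSE]      # drop = FALSE keeps the matrix shape
without_na <- crossprod(X_ok)

n_dropped <- sum(!keep)              # how many rows were removed
without_na
```

If females and males are stacked in one table and later compared pairwise, the matching row of the other sex would presumably need to be dropped as well.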
If I use two matrices (male and female), I get back, among other things, the
comparison of att1 male to att1 female. If I use the
percentage table output, I get for example 40%. Can I
then say that if the percentage is lower than 50%, the attributes are
significantly different?
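A percentage on its own says nothing about significance; that needs a test that takes the number of species and the pairing into account, such as the approaches James mentioned earlier in this thread. As a rough sketch of the crosstabulation step only, for one binary attribute recorded for the female and the male plants of the same species (fem_att1 and mal_att1 are made-up vectors, not taken from the dataset):

```r
# Made-up 0/1 values of one attribute, one entry per species
fem_att1 <- c(1, 1, 0, 1, 0, 1, 1, 0)
mal_att1 <- c(1, 0, 0, 1, 1, 1, 0, 0)

# 2 x 2 table: rows = female value, columns = male value
tab <- table(female = fem_att1, male = mal_att1)
tab

# Cell proportions of the whole table
prop.table(tab)

# Share of species in which female and male plants agree
agreement <- mean(fem_att1 == mal_att1)
agreement
```

The agreement share describes the sample, but deciding whether the sexes differ still requires a formal test on the table.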
Regarding your other suggestion:
sapply(c("1","2","3"), function(x) ifelse(regexpr(x, FemV1) > 0, 1, 0))
It gives me this output:
     1 2 3
[1,] 1 0 0
[2,] 1 0 0
[3,] 1 0 0
[4,] 1 0 0
[5,] 1 0 0
[6,] 1 0 0
[7,] 1 0 0
[8,] 1 0 0
[9,] 0 1 0
. . . .
. . . .
I think I should now count the ones for 1, 2 and 3?
I tried to use table, but it only gives me the overall counts of ones and zeros:
table(FemV1Test)
FemV1Test
  0   1
657 387
How can I get the counts for every column?
And then do the same for MalV1 and compare the two somehow?
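A small sketch of counting per column with colSums, as James suggested, using his example vector for FemV1. MalV1 here is invented purely for illustration, and the helper to_dummies is a hypothetical name. table() applied to the whole matrix pools all cells, which is why it reports only two counts; colSums() sums each column separately.

```r
FemV1 <- c("13", "2", "13", "123", "1", "23")   # example from James's mail
MalV1 <- c("1", "2", "3", "12", "1", "2")       # made-up values

# One 0/1 column per attribute code. regexpr returns the position of
# the first match, or -1 if there is none, so "> 0" means "code present".
to_dummies <- function(v, codes = c("1", "2", "3")) {
  sapply(codes, function(x) ifelse(regexpr(x, v) > 0, 1, 0))
}

fem_counts <- colSums(to_dummies(FemV1))   # ones per column, females
mal_counts <- colSums(to_dummies(MalV1))   # ones per column, males

# Side-by-side counts per attribute code
counts <- rbind(female = fem_counts, male = mal_counts)
counts
```

This matching assumes single-character codes; with codes such as "1" and "10", a plain regexpr on "1" would match both.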
Thanks again in advance for your help.
Greetings
Birgit
On 29.09.2007 at 14:45, James Reilly wrote:
>
> Hi Birgit,
>
> The first argument to regexpr should be just one character value,
> not a vector. Your call:
> regexpr(c("1","2","3"),FemV1)
> seems to have been interpreted as:
> regexpr("1",FemV1)
>
> I think you probably need something more like:
> sapply(c("1","2","3"), function(x) ifelse(regexpr(x, FemV1) > 0, 1,
> 0))
> This will also work on multiple response data such as
> FemV1 <- c("13", "2", "13", "123", "1", "23")
> Then colSums will give you frequency counts for each attribute.
>
> I think you would need to greatly simplify the multiple response data
> to apply anything like a paired t-test. Have you considered just
> crosstabulating the attributes of male plants versus the females?
> For some R code, see
> https://stat.ethz.ch/pipermail/r-help/2007-February/126125.html
>
> Regards,
> James
>
>
> On 29/9/07 3:37 AM, Birgit Lemcke wrote:
>> Hello James,
>> sorry that I have to ask you a second time, but I don't understand
>> what regexpr() is doing and how the syntax works.
>> I have a vector that I converted to a character string:
>> as.character(FalV1)
>> [1] "1" "1" "1" "1" "1" "1" "1" "1" "2"
>> After that I did this, but without knowing what I am really doing:
>> regexpr(c("1","2","3"),FemV1)
>> The output looked like this:
>> [1] 1 1 1 1 1 1 1 1 -1
>> As I understood it, the function looks for 1, 2 or 3 in this case.
>> If there is a match it gives me back 1,
>> if not it gives me back -1.
>> But I don't know how this helps me now, so I hope you can explain it to me.
>> And there is another problem I have. For the continuous variables I
>> used a paired t-test. Can I also perform this approach paired?
>> Thanks a lot in advance.
>> Greetings
>> Birgit
>> On 21.09.2007 at 11:38, James Reilly wrote:
>>>
>>> If I understand you right, you have several multiple response
>>> variables (with the responses encoded in numeric strings) and you
>>> want to see whether these are associated with sex. To tabulate
>>> the data, I would convert your variables into collections of
>>> dummy variables using regexpr(), then use table(). You can use a
>>> modified chi-squared test with a Rao-Scott correction on the
>>> resulting tables; see Thomas and Decady (2004). Bootstrapping is
>>> another possible approach.
>>>
>>> @article{,
>>> Author = {Thomas, D. Roland and Decady, Yves J.},
>>> Journal = {International Journal of Testing},
>>> Number = {1},
>>> Pages = {43 - 59},
>>> Title = {Testing for Association Using Multiple Response Survey
>>> Data: Approximate Procedures Based on the Rao-Scott Approach.},
>>> Volume = {4},
>>> Year = {2004},
>>> Url = {http://search.ebscohost.com/login.aspx?direct=true&db=pbh&AN=13663214&site=ehost-live}
>>> }
>>>
>>> Hope this helps,
>>> James
>>> --
>>> James Reilly
>>> Department of Statistics, University of Auckland
>>> Private Bag 92019, Auckland, New Zealand
>>>
>>> On 21/9/07 7:14 AM, Birgit Lemcke wrote:
>>>> First thanks for your answer.
>>>> Now I try to explain better:
>>>> I have species in the rows and morphological attributes in the
>>>> columns coded by numbers (qualitative variables; nominal and
>>>> ordinal).
>>>> In one table for the male plants of every species and in the
>>>> other table for the female plants of every species. The
>>>> variables contain every possible occurrence in this species and
>>>> this gender.
>>>> I would like to compare every variable between male and female
>>>> plants for example using a ChiSquare Test.
>>>> The Null-hypothesis could be: Variable male is equal to variable
>>>> Female.
>>>> The question behind all is, if male and female plants in this
>>>> species are significantly different and which attributes are
>>>> responsible for this difference.
>>>> I really hope that this is easier to understand. If not, please
>>>> ask.
>>>> Thanks a million in advance.
>>>> Greetings
>>>> Birgit
>>>
>> Birgit Lemcke
>> Institut für Systematische Botanik
>> Zollikerstrasse 107
>> CH-8008 Zürich
>> Switzerland
>> Ph: +41 (0)44 634 8351
>> birgit.lemcke@systbot.uzh.ch