[R] do glm with two data sets
Sundar Dorai-Raj
sundar.dorai-raj at pdf.com
Thu Aug 18 16:47:44 CEST 2005
Hu, Ying (NIH/NCI) wrote:
> Thanks for your help.
>
> # read the two data sets
> e <- as.matrix(read.table("file1.txt", header=TRUE,row.names=1))
> g <- as.matrix(read.table("file2.txt", header=TRUE,row.names=1))
>
> # solution
> d1<-data.frame(g[1,], e[1,])
> fit<-glm(e[1,] ~ g[1,], data=d1)
> summary(fit)
>
> I am not sure that is the best solution.
>
> Thanks again,
>
> Ying
>
Hi, Ying,
What's wrong with this solution? Do you still get an error? What is your
primary goal?
A couple of points:
1. It's better to use names in your data.frame:
d1 <- data.frame(g = g[1,], e = e[1,])
Then in glm:
fit <- glm(e ~ g, data = d1)
2. Also, you may just be giving us a toy example, but if you don't
specify a family argument in glm then the default is gaussian and you are
simply getting the least squares fit. In that case you should use lm
instead (see ?lm).
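A minimal sketch of both points, using made-up numbers in place of the first rows of the two files. The names e_row and g_row and all values below are hypothetical, since the real files are not shown in full:

```r
# Hypothetical stand-ins for e[1, ] and g[1, ]; the real values would come
# from file1.txt and file2.txt.
e_row <- c(id1 = 0, id2 = 1, id3 = 0, id4 = 1, id5 = 1)
g_row <- c(id1 = 1.22, id2 = 1.34, id3 = 2.44, id4 = 1.90, id5 = 2.10)

# Point 1: give the columns names so the formula can refer to them.
d1 <- data.frame(g = g_row, e = e_row)

# Point 2: with no family argument, glm() uses gaussian() with an identity
# link, so it fits the same model as lm().
fit_glm <- glm(e ~ g, data = d1)
fit_lm  <- lm(e ~ g, data = d1)

# Compare the estimated coefficients from the two fits.
all.equal(coef(fit_glm), coef(fit_lm))
```

With named columns the coefficient in summary(fit_glm) is labelled g rather than g[1, ], which is easier to read.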
HTH,
--sundar
>
> -----Original Message-----
> From: Gavin Simpson [mailto:gavin.simpson at ucl.ac.uk]
> Sent: Wednesday, August 17, 2005 7:01 PM
> To: Sundar Dorai-Raj
> Cc: Hu, Ying (NIH/NCI); r-help at stat.math.ethz.ch
> Subject: Re: [R] do glm with two data sets
>
> On Wed, 2005-08-17 at 17:22 -0500, Sundar Dorai-Raj wrote:
>
>>Hu, Ying (NIH/NCI) wrote:
>>
>>>I have two data sets:
>>>File1.txt:
>>>Name id1 id2 id3 ...
>>>N1 0 1 0 ...
>>>N2 0 1 1 ...
>>>N3 1 1 -1 ...
>>>...
>>>
>>>File2.txt:
>>>Group id1 id2 id3 ...
>>>G1 1.22 1.34 2.44 ...
>>>G2 2.33 2.56 2.56 ...
>>>G3 1.56 1.99 1.46 ...
>>>...
>>>I like to do:
>>>x1<-c(0,1,0,...)
>>>y1<-c(1.22,1.34, 2.44, ...)
>>>z1<-data.frame(x,y)
>>>summary(glm(y1~x1,data=z1)
>>>
>>>But I do the same thing by inputting the data sets from the two files
>>>e <- read.table("file1.txt", header=TRUE,row.names=1)
>>>g <- read.table("file2.txt", header=TRUE,row.names=1)
>>>e1<-exp[1,]
>>>g1<-geno[1,]
>>>d1<-data.frame(g, e)
>>>summary(glm(e1 ~ g1, data=d1))
>>>
>>>the error message is
>>>Error in model.frame(formula, rownames, variables, varnames, extras,
>>>extranames, :
>>> invalid variable type
>>>Execution halted
>>>
>>>Thanks in advance,
>>>
>>>Ying
>
>
> Hi Ying,
>
> That error message is likely caused by having a data.frame on the right
> hand side (rhs) of the formula. You can't have a data.frame on the rhs
> of a formula and g1 is still a data frame even if you only choose the
> first row, e.g.:
>
> dat <- as.data.frame(matrix(100, 10, 10))
> class(dat[1, ])
> [1] "data.frame"
>
> You could try:
>
> glm(e1 ~ ., data=g1[1, ])
>
> and see if that works, but as Sundar notes, your post is a little
> difficult to follow, so this may not do what you were trying to achieve.
>
> HTH
>
> Gav
>
>
>>You have several inconsistencies in your example, so it will be
>>difficult to figure out what you are trying to accomplish.
>>
>> > e <- read.table("file1.txt", header=TRUE,row.names=1)
>> > g <- read.table("file2.txt", header=TRUE,row.names=1)
>> > e1<-exp[1,]
>>
>>What's "exp"? Also it's dangerous to use an R function as a variable
>>name. Most of the time R can tell the difference, but in some cases it
>>cannot.
>>
>> > g1<-geno[1,]
>>
>>What's "geno"?
>>
>> > d1<-data.frame(g, e)
>>
>>d1 is now e and g cbind'ed together?
>>
>> > summary(glm(e1 ~ g1, data=d1))
>>
>>Are "e1" and "g1" elements of "d1"? From what you've told us, I don't
>>know where the error is occurring. Also, if you are having errors, you
>>can more easily isolate the problem by doing:
>>
>>fit <- glm(e1 ~ g1, data = d1)
>>summary(fit)
>>
>>This will at least tell you the problem is in your call to "glm" and not
>>"summary.glm".
>>
>>--sundar
>>
>>P.S. Please (re-)read the POSTING GUIDE. Most of the time you will
>>figure out problems such as these on your own during the process of
>>creating a reproducible example.
>>
>>______________________________________________
>>R-help at stat.math.ethz.ch mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html