Re: [R] RE: more on lm(y~x) question: removing NA´s
Christoph Scherber
Christoph.Scherber at uni-jena.de
Tue May 4 17:59:13 CEST 2004
Great!!! This works, many thanks!
**************
using lsfit(x,y) instead of lm(y~x) produces a perfectly correct output.
***************
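
For the record, a minimal sketch of why lsfit() copes with the missing values, using made-up numbers rather than the real foranalysis data (x and y below are hypothetical). As far as I can tell, lsfit() drops incomplete pairs itself and warns that missing values were deleted, so it ends up fitting the same rows that lm() uses under the default na.action of na.omit.

```r
# Hypothetical data with NAs in different positions in x and y.
x <- c(1, 2, NA, 4, 5, NA, 7)
y <- c(2.1, 3.9, 6.2, NA, 10.1, NA, 13.8)

# lsfit() removes incomplete pairs and issues a warning about it;
# the warning is suppressed here only to keep the output quiet.
fit_ls <- suppressWarnings(lsfit(x, y))

# lm() with the default na.action (na.omit) uses the same complete pairs.
fit_lm <- lm(y ~ x)

# Strip the differing coefficient names before comparing.
coef_ls <- unname(fit_ls$coefficients)
coef_lm <- unname(coef(fit_lm))
print(rbind(lsfit = coef_ls, lm = coef_lm))
```

Both fits should give the same intercept and slope, and abline() accepts either object.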
Thomas Lumley wrote:
>On Tue, 4 May 2004, Christoph Scherber wrote:
>
>
>
>>it all works fine (the regression lines fit correctly to the data) as
>>long as j and k do not both contain missing values.
>>
>>
>
>That's very strange. The lines
> for (k in 1:length(foranalysis[93:174,i]))
> number[k]_substring(plotcode[foranalysis[k,1]],1,5)
>
>should result in k being the scalar value 82 (the length of 93:174)
>after the loop is over.
>In R (unlike S-PLUS), loop indices are just ordinary variables in the
>environment where the loop is executed. I'd expect this code to work in
>S-PLUS but not in R.
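
A small sketch of the effect described here, with a hypothetical short vector standing in for the real column: once k is reused as the loop index in R, the data vector k is gone and only the last index value remains.

```r
# Hypothetical data standing in for one column of the data frame.
k <- c(0.10, 0.25, NA, 0.40)

# 1:length(k) is evaluated once, before the loop starts, so the loop
# runs over 1:4 even though k is overwritten on every iteration.
for (k in 1:length(k)) {
  # body left empty on purpose
}

# k is now the last index value, not the original vector.
k_after <- k
print(k_after)
```

A later lm(j ~ k) would then see a length-one k, which would explain the failure.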
>
>That loop is actually redundant, since substring() is vectorised:
> number <- substring(plotcode[foranalysis[93:174,1]],1,5)
>should work just as well.
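
To illustrate the vectorised call, here is a sketch with invented plot codes and an invented index vector (plotcode and idx are hypothetical, since the real objects are not shown in this thread). The loop and the single call should give identical results.

```r
# Hypothetical plot codes and row indices.
plotcode <- c("B1A01xyz", "B1A02xyz", "B2A03xyz")
idx <- c(3, 1, 2)  # stand-in for foranalysis[93:174, 1]

# Element-by-element loop, using an index name that clashes with nothing.
number_loop <- character(length(idx))
for (m in seq_along(idx)) {
  number_loop[m] <- substring(plotcode[idx[m]], 1, 5)
}

# Single vectorised call: substring() works on the whole vector at once.
number_vec <- substring(plotcode[idx], 1, 5)
print(number_vec)
```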
>
>It's also strange that you create a data frame df from j and k but don't
>use it in the lm() call (or AFAICS anywhere else).
>
>
>
>>What suggestions would you have for this? Or, more precisely, how would
>>you create multiple graphs from subsequent columns of a data.frame?
>>
>>
>
>I'd probably use lsfit. The following is obviously not tested, since I
>don't have the data (or even understand fully the data layout).
>
>L <- length(93:174)
>for(i in p) {
> X <- foranalysis[93:174, i]
> Y <- foranalysis[93:174, i+1]
> # use="complete.obs" drops pairs with an NA in either column
> corr <- cor(X, Y, use="complete.obs")
> corrtrunc <- cor(X[X<0.9], Y[X<0.9], use="complete.obs")
> mainlab <- paste(substring(names(foranalysis[i]), 2, 8),
>                  "; corr.:", corr,
>                  "; excl.Mono:", corrtrunc)
> plot(X, Y, main=mainlab,
>      xlab="% of total biomass", ylab="% of total cover", pch="n")
> # substring() is vectorised, so no inner loop (and no reuse of k) is needed
> number <- substring(plotcode[foranalysis[1:L,1]], 1, 5)
> text(X, Y, number)
> model <- lsfit(X, Y)
> abline(model)
> abline(0, 1, lty=2)
> }
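
A note on the correlations, as a hedged sketch with invented numbers (x and y are hypothetical): the na.method="available" argument from the original post is S-PLUS syntax and is not an argument of cor() in R. In R the use argument plays that role. With NAs present, a plain cor(x, y) gives NA under the current default, while use="complete.obs" restricts the calculation to complete pairs.

```r
# Hypothetical proportions with NAs in different positions.
x <- c(0.2, 0.5, NA, 0.95, 0.7)
y <- c(0.25, 0.45, 0.6, NA, 0.8)

# Default handling: the NAs propagate into the result.
r_default <- cor(x, y)

# Only pairs where both values are present are used.
r_complete <- cor(x, y, use = "complete.obs")

# The same thing done by hand with complete.cases().
keep <- complete.cases(x, y)
r_manual <- cor(x[keep], y[keep])
print(c(default = r_default, complete = r_complete, manual = r_manual))
```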
>
>
> -thomas
>
>
>
>>>>>par(mfrow=c(5,5))
>>>>>p_seq(3,122,2)
>>>>>i_0
>>>>>k_0
>>>>>number_0
>>>>>for (i in p) {
>>>>> j_foranalysis[93:174,i+1]
>>>>> k_foranalysis[93:174,i]
>>>>> df_data.frame(j,k)
>>>>> mainlab1_substring(names(foranalysis[i]),2,8)
>>>>> mainlab2_"; corr.:"
>>>>> mainlab3_round(cor(j,k,na.method="available"),4)
>>>>> mainlab4_"; excl.Mono:"
>>>>> mainlab5_round(cor(j[j<0.9],k[j<0.9],na.method="available"),4)
>>>>> mainlab_paste(mainlab1,mainlab2,mainlab3,mainlab4,mainlab5)
>>>>> plot(k,j,main=mainlab,xlab="% of total biomass",
>>>>>      ylab="% of total cover",pch="n")
>>>>> for (k in 1:length(foranalysis[93:174,i]))
>>>>>number[k]_substring(plotcode[foranalysis[k,1]],1,5)
>>>>> text(foranalysis[93:174,i],foranalysis[93:174,i+1],number)
>>>>>**********************************
>>>>> model_lm(j~k,na.action=na.exclude)
>>>>>**********************************
>>>>> abline(model)
>>>>> abline(0,1,lty=2)
>>>>> }
>>>>>
>>>>>Does anyone have any suggestions on this?
>>>>>
>>>>>Best regards
>>>>>Chris
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>Liaw, Andy wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>By (`factory') default that's done for you automagically, because
>>>>>>options("na.action") is `na.omit'.
>>>>>>
>>>>>>If you really want to do it `by hand', and have the data in a data frame,
>>>>>>you can use something like:
>>>>>>
>>>>>>lm(y ~ x, df[complete.cases(df),])
>>>>>>
>>>>>>HTH,
>>>>>>Andy
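
A minimal sketch of the two routes described in this reply, on an invented data frame (df and its values are hypothetical, chosen so that enough complete rows remain for a fit). Filtering by hand with complete.cases() and relying on the default na.action should select the same rows and therefore give the same coefficients.

```r
# Hypothetical data frame with NAs scattered over both columns.
df <- data.frame(x = c(1, NA, 3, 4, 5, NA),
                 y = c(2.0, 3.0, NA, 8.1, 9.9, NA))

# Rows in which neither x nor y is missing.
keep <- complete.cases(df)

# 'By hand': fit on the complete rows only.
fit_manual <- lm(y ~ x, data = df[keep, ])

# Default: lm() applies na.omit to the model frame itself.
fit_default <- lm(y ~ x, data = df)

print(sum(keep))
print(rbind(manual = coef(fit_manual), default = coef(fit_default)))
```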
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>From: Christoph Scherber
>>>>>>>
>>>>>>>Dear all,
>>>>>>>
>>>>>>>I have a data frame with different numbers of NAs in each
>>>>>>>column, e.g.:
>>>>>>>
>>>>>>>x y
>>>>>>>1 2
>>>>>>>NA 3
>>>>>>>NA 4
>>>>>>>4 NA
>>>>>>>1 5
>>>>>>>NA NA
>>>>>>>
>>>>>>>
>>>>>>>I now want to do a linear regression on y~x with all the NAs
>>>>>>>removed.
>>>>>>>The problem now is that is.na(x) (and is.na(y)) obviously
>>>>>>>gives vectors
>>>>>>>with different lengths. How could I solve this problem?
>>>>>>>
>>>>>>>Thank you very much for any help.
>>>>>>>
>>>>>>>Best regards
>>>>>>>Chris
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>Thomas Lumley Assoc. Professor, Biostatistics
>>>tlumley at u.washington.edu University of Washington, Seattle
>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
>
>