Re: [R] RE: more on lm(y~x) question: removing NA´s

Christoph Scherber Christoph.Scherber at uni-jena.de
Tue May 4 17:59:13 CEST 2004


Great!!! This works, many thanks!

**************
using lsfit(x,y) instead of lm(y~x)  produces a perfectly correct output.
***************
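
A minimal sketch of how the two calls compare, assuming toy vectors: the first six pairs are the example from my original question, and the last two pairs are made up so that the fit is not degenerate. Given the same x and y, lsfit() drops incomplete pairs itself (with a warning), and lm() does the same under na.omit, so both should return the same coefficients:

```r
# First six pairs: example from the original question.
# Last two pairs: hypothetical, added only so the slope is estimable.
x <- c(1, NA, NA, 4, 1, NA, 2, 3)
y <- c(2, 3, 4, NA, 5, NA, 4, 7)

# lsfit() removes incomplete (x, y) pairs and warns about it
fit_ls <- suppressWarnings(lsfit(x, y))

# lm() removes incomplete rows when na.action is na.omit
fit_lm <- lm(y ~ x, na.action = na.omit)

# Compare intercept and slope from both fits
coef_ls <- unname(fit_ls$coefficients)
coef_lm <- unname(coef(fit_lm))
print(rbind(lsfit = coef_ls, lm = coef_lm))
```

If the two disagree in a longer script, it may be worth checking that the vectors passed to lm() still hold the intended data at the point of the call.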




Thomas Lumley wrote:

>On Tue, 4 May 2004, Christoph Scherber wrote:
>
>>it all works fine (the regression lines fit correctly to the data) as
>>long as there are not both missing values in j and k.
>
>That's very strange.  The lines
> for (k in 1:length(foranalysis[93:174,i]))
>     number[k]_substring(plotcode[foranalysis[k,1]],1,5)
>
>should result in k being the scalar value 82 after the loop is over.
>In R (unlike S-PLUS), loop indices are just ordinary variables in the
>environment where the loop is executed. I'd expect this code to work in
>S-PLUS but not in R.
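
A small sketch of the behaviour described above, assuming a made-up data vector k (the names and values here are hypothetical). In R, using k as a loop index overwrites whatever k held before the loop:

```r
# Hypothetical data vector, standing in for a column of the data frame
k <- c(0.2, 0.5, NA, 0.9)

number <- character(3)
for (k in 1:3) {
  # k is now the loop index, not the data vector
  number[k] <- paste0("p", k)
}

# After the loop, k is the last index value, a scalar
print(k)
print(length(k))
```

Any later formula such as j ~ k would then see this scalar rather than the original column.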
>
>That loop is actually redundant, since substring() is vectorised:
>	number <- substring(plotcode[foranalysis[93:174,1]],1,5)
>should work just as well.
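
To illustrate the vectorised form, here is a sketch with made-up plot codes and a made-up index vector (both hypothetical, since the real data are not shown in the thread). The loop and the single substring() call give the same labels:

```r
# Hypothetical plot codes and a hypothetical index column
plotcode <- c("B1A01xx", "B1A02yy", "B2A03zz")
idx <- c(3, 1, 2)

# Element-by-element version, using an index name that is not a data vector
number_loop <- character(length(idx))
for (m in seq_along(idx)) {
  number_loop[m] <- substring(plotcode[idx[m]], 1, 5)
}

# Vectorised version: substring() works on the whole character vector at once
number_vec <- substring(plotcode[idx], 1, 5)

print(identical(number_loop, number_vec))
```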
>
>It's also strange that you create a data frame df from j and k but don't
>use it in the lm() call (or AFAICS anywhere else).
>
>>What suggestions would you have for this? Or, more precisely, how would
>>you create multiple graphs from subsequent columns of a data.frame?
>
>I'd probably use lsfit. The following is obviously not tested, since I
>don't have the data (or even understand fully the data layout).
>
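>L <- length(93:174)
>for(i in p) {
>	X <- foranalysis[93:174, i]
>	Y <- foranalysis[93:174, i+1]
>	# use="complete.obs" keeps cor() from returning NA when X or Y has NAs
>	corr <- cor(X, Y, use="complete.obs")
>	corrtrunc <- cor(X[X<0.9], Y[X<0.9], use="complete.obs")
>	mainlab <- paste(substring(names(foranalysis[i]), 2, 8),
>			"; corr.:", corr,
>			"; excl.Mono:", corrtrunc)
>	plot(X, Y, main=mainlab,
>		xlab="% of total biomass", ylab="% of total cover", pch="n")
>	# plot labels, taken in one vectorised call instead of a loop over k
>	number <- substring(plotcode[foranalysis[1:L,1]], 1, 5)
>	text(X, Y, number)
>	model <- lsfit(X, Y)	# lsfit drops incomplete (X, Y) pairs itself
>	abline(model)
>	abline(0, 1, lty=2)	# dashed 1:1 reference line
>}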
>
>
>	-thomas
>
>>>>>par(mfrow=c(5,5))
>>>>>p_seq(3,122,2)
>>>>>i_0
>>>>>k_0
>>>>>number_0
>>>>>for (i in p) {
>>>>>  j_foranalysis[93:174,i+1]
>>>>>  k_foranalysis[93:174,i]
>>>>>  df_data.frame(j,k)
>>>>>  mainlab1_substring(names(foranalysis[i]),2,8)
>>>>>  mainlab2_"; corr.:"
>>>>>  mainlab3_round(cor(j,k,na.method="available"),4)
>>>>>  mainlab4_"; excl.Mono:"
>>>>>  mainlab5_round(cor(j[j<0.9],k[j<0.9],na.method="available"),4)
>>>>>  mainlab_paste(mainlab1,mainlab2,mainlab3,mainlab4,mainlab5)
>>>>>  plot(k,j,main=mainlab,xlab="% of total biomass",ylab="% of total cover",pch="n")
>>>>>  for (k in 1:length(foranalysis[93:174,i]))
>>>>>    number[k]_substring(plotcode[foranalysis[k,1]],1,5)
>>>>>  text(foranalysis[93:174,i],foranalysis[93:174,i+1],number)
>>>>>**********************************
>>>>>  model_lm(j~k,na.action=na.exclude)
>>>>>**********************************
>>>>>  abline(model)
>>>>>  abline(0,1,lty=2)
>>>>>   }
>>>>>
>>>>>Does anyone have any suggestions on this?
>>>>>
>>>>>Best regards
>>>>>Chris.,
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>Liaw, Andy wrote:
>>>>>
>>>>>>By (`factory') default that's done for you automagically, because
>>>>>>options("na.action") is `na.omit'.
>>>>>>
>>>>>>If you really want to do it `by hand', and have the data in a data frame,
>>>>>>you can use something like:
>>>>>>
>>>>>>lm(y ~ x, df[complete.cases(df),])
>>>>>>
>>>>>>HTH,
>>>>>>Andy
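
As a sketch of the `by hand' route, using the six example rows from the original question (the names keep and df_complete are only illustrative): complete.cases() returns one logical flag per row, so x and y are always filtered together and stay the same length.

```r
# The example data frame from the original question
df <- data.frame(x = c(1, NA, NA, 4, 1, NA),
                 y = c(2, 3, 4, NA, 5, NA))

# TRUE only for rows where neither x nor y is NA
keep <- complete.cases(df)

# Both columns are subset by the same row mask
df_complete <- df[keep, ]
print(df_complete)
```

With these particular six rows only two complete pairs remain, both with x equal to 1, so a slope could not be estimated from them; with real data the filtered data frame can be passed to lm(y ~ x, df_complete).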
>>>>>>
>>>>>>>From: Christoph Scherber
>>>>>>>
>>>>>>>Dear all,
>>>>>>>
>>>>>>>I have a data frame with different numbers of NA´s in each
>>>>>>>column, e.g.:
>>>>>>>
>>>>>>>x       y
>>>>>>>1      2
>>>>>>>NA  3
>>>>>>>NA  4
>>>>>>>4     NA
>>>>>>>1     5
>>>>>>>NA NA
>>>>>>>
>>>>>>>
>>>>>>>I now want to do a linear regression on y~x with all the NA´s
>>>>>>>removed.
>>>>>>>The problem now is that is.na(x) and is.na(y) obviously flag different
>>>>>>>positions, so removing the NA´s from each column separately gives
>>>>>>>vectors with different lengths. How could I solve this problem?
>>>>>>>
>>>>>>>Thank you very much for any help.
>>>>>>>
>>>>>>>Best regards
>>>>>>>Chris
>>>>>>>
>>>Thomas Lumley			Assoc. Professor, Biostatistics
>>>tlumley at u.washington.edu	University of Washington, Seattle