Re: [R] RE: more on lm(y~x) question: removing NA´s
Thomas Lumley
tlumley at u.washington.edu
Tue May 4 17:30:39 CEST 2004
On Tue, 4 May 2004, Christoph Scherber wrote:
> it all works fine (the regression lines fit correctly to the data) as
> long as there are not both missing values in j and k.
That's very strange. The lines
   for (k in 1:length(foranalysis[93:174,i]))
      number[k]_substring(plotcode[foranalysis[k,1]],1,5)
should result in k being the scalar value 82 (the length of 93:174) after
the loop is over, so the vector k is overwritten before the lm() call.
In R (unlike S-PLUS), loop indices are just ordinary variables in the
environment where the loop is executed. I'd expect this code to work in
S-PLUS but not in R.
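To illustrate the point about loop indices, here is a minimal sketch with made-up values (the vector below is a hypothetical stand-in, not the original data). In R the index of a for() loop lives in the same environment as every other variable, so looping over k replaces a data vector that was also called k:

```r
# Hypothetical stand-in for the poster's data column k
k <- c(0.2, 0.5, NA, 0.9)

# The loop index is an ordinary variable, so it replaces the vector k
for (k in 1:length(93:174)) {
  # body omitted; only the index matters here
}

# After the loop, k holds the last index value, not the data
k_after <- k
print(k_after)
```

Using a different name for the index (say m) leaves the data vector untouched.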
That loop is actually redundant, since substring() is vectorised:
number <- substring(plotcode[foranalysis[93:174,1]],1,5)
should work just as well.
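A small sketch of why the loop is redundant, using invented plot codes and an invented index vector (the real plotcode and foranalysis objects were not posted, so both are assumptions here):

```r
# Hypothetical plot codes; the real 'plotcode' vector was not posted
plotcode <- c("B1A01xyz", "B1A02xyz", "B2A03xyz")
idx <- c(3, 1, 2)   # hypothetical stand-in for a column of row indices

# Loop version: one element at a time
number_loop <- character(length(idx))
for (m in seq_along(idx)) {
  number_loop[m] <- substring(plotcode[idx[m]], 1, 5)
}

# Vectorised version: substring() handles the whole vector in one call
number_vec <- substring(plotcode[idx], 1, 5)

print(identical(number_loop, number_vec))
```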
It's also strange that you create a data frame df from j and k but don't
use it in the lm() call (or AFAICS anywhere else).
>
> What suggestions would you have for this? Or, more precisely, how would
> you create multiple graphs from subsequent columns of a data.frame?
I'd probably use lsfit. The following is obviously not tested, since I
don't have the data (or even understand fully the data layout).
L <- length(93:174)
for (i in p) {
   X <- foranalysis[93:174, i]
   Y <- foranalysis[93:174, i+1]
   ## use="complete.obs" drops pairs with a missing value
   corr <- cor(X, Y, use="complete.obs")
   corrtrunc <- cor(X[X<0.9], Y[X<0.9], use="complete.obs")
   mainlab <- paste(substring(names(foranalysis[i]), 2, 8),
                    "; corr.:", round(corr, 4),
                    "; excl.Mono:", round(corrtrunc, 4))
   plot(X, Y, main=mainlab,
        xlab="% of total biomass", ylab="% of total cover", pch="n")
   number <- substring(plotcode[foranalysis[1:L,1]], 1, 5)
   text(X, Y, number)
   ## lsfit() deletes incomplete cases itself (with a warning)
   model <- lsfit(X, Y)
   abline(model)
   abline(0, 1, lty=2)
}
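
As a check on the missing-value handling, here is a sketch on made-up numbers (X and Y below are invented, not the poster's data). It compares lm() with na.action=na.exclude, lm() on the complete cases selected by hand, and lsfit(), and it computes the correlation on complete pairs only:

```r
# Made-up data with NAs in different positions
X <- c(0.10, NA, 0.35, 0.50, 0.95, NA, 0.70)
Y <- c(0.12, 0.30, NA, 0.45, 0.90, NA, 0.75)

# Rows where both values are present
ok <- complete.cases(X, Y)

# lm() drops incomplete rows through its na.action
fit_lm <- lm(Y ~ X, na.action = na.exclude)

# The same fit on the complete cases selected by hand
fit_cc <- lm(Y[ok] ~ X[ok])

# lsfit() also deletes incomplete cases, with a warning (suppressed here)
fit_ls <- suppressWarnings(lsfit(X, Y))

# cor() must be told how to treat NAs
r <- cor(X, Y, use = "complete.obs")

print(all.equal(unname(coef(fit_lm)), unname(coef(fit_cc))))
print(all.equal(unname(coef(fit_lm)), unname(fit_ls$coefficients)))
```

All three fits should agree, since each ends up using only the rows where both X and Y are present.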
-thomas
> >>>
> >>>par(mfrow=c(5,5))
> >>>p_seq(3,122,2)
> >>>i_0
> >>>k_0
> >>>number_0
> >>>for (i in p) {
> >>> j_foranalysis[93:174,i+1]
> >>> k_foranalysis[93:174,i]
> >>> df_data.frame(j,k)
> >>> mainlab1_substring(names(foranalysis[i]),2,8)
> >>> mainlab2_"; corr.:"
> >>> mainlab3_round(cor(j,k,na.method="available"),4)
> >>> mainlab4_"; excl.Mono:"
> >>> mainlab5_round(cor(j[j<0.9],k[j<0.9],na.method="available"),4)
> >>> mainlab_paste(mainlab1,mainlab2,mainlab3,mainlab4,mainlab5)
> >>> plot(k,j,main=mainlab,xlab="% of total biomass",ylab="% of total cover",pch="n")
> >>> for (k in 1:length(foranalysis[93:174,i]))
> >>>     number[k]_substring(plotcode[foranalysis[k,1]],1,5)
> >>> text(foranalysis[93:174,i],foranalysis[93:174,i+1],number)
> >>>**********************************
> >>> model_lm(j~k,na.action=na.exclude])
> >>>**********************************
> >>> abline(model)
> >>> abline(0,1,lty=2)
> >>> }
> >>>
> >>>Does anyone have any suggestions on this?
> >>>
> >>>Best regards
> >>>Chris.,
> >>>
> >>>
> >>>
> >>>
> >>>Liaw, Andy wrote:
> >>>
> >>>
> >>>
> >>>>By (`factory') default that's done for you automagically, because
> >>>>options("na.action") is `na.omit'.
> >>>>
> >>>>If you really want to do it `by hand', and have the data in a data frame,
> >>>>you can use something like:
> >>>>
> >>>>lm(y ~ x, df[complete.cases(df),])
> >>>>
> >>>>HTH,
> >>>>Andy
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>>From: Christoph Scherber
> >>>>>
> >>>>>Dear all,
> >>>>>
> >>>>>I have a data frame with different numbers of NA´s in each
> >>>>>column, e.g.:
> >>>>>
> >>>>>x y
> >>>>>1 2
> >>>>>NA 3
> >>>>>NA 4
> >>>>>4 NA
> >>>>>1 5
> >>>>>NA NA
> >>>>>
> >>>>>
> >>>>>I now want to do a linear regression on y~x with all the NA´s
> >>>>>removed.
> >>>>>The problem now is that is.na(x) (and is.na(y), obviously) give
> >>>>>vectors with different lengths. How could I solve this problem?
> >>>>>
> >>>>>Thank you very much for any help.
> >>>>>
> >>>>>Best regards
> >>>>>Chris
> >>>>>
> >>>>>
> >>>>>
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle