[R] correcting a few data in a large data frame
David Winsemius
dwinsemius at comcast.net
Tue Jun 1 00:28:22 CEST 2010
On May 31, 2010, at 5:29 PM, Mr. Natural wrote:
>
> The data frame is lwf that records the survival of bushes over an 8 year
> period. Years are called bouts. Dead bushes are recorded as zeros, and live
> bushes as "1."
> str(lwf)
> 'data.frame': 638 obs. of 9 variables:
> $ bushno: int 1 2 3 4 5 6 7 8 9 10 ...
> $ bout1 : int 0 1 0 1 1 1 0 1 0 1 ...
> $ bout2 : int 0 1 0 0 0 0 0 0 0 1 ...
> $ bout3 : int 0 1 0 0 0 0 0 0 0 1 ...
> $ bout4 : int 0 1 0 0 0 0 0 0 0 0 ...
> $ bout5 : int 0 1 0 0 0 0 0 0 0 0 ...
> $ bout6 : int 0 1 0 0 0 0 0 0 0 0 ...
> $ bout7 : int 0 1 0 0 0 0 0 0 0 0 ...
> $ bout8 : int 0 1 0 0 0 0 0 0 0 0 ...
>
> head(lwf)
>   bushno bout1 bout2 bout3 bout4 bout5 bout6 bout7 bout8
> 1      1     0     0     0     0     0     0     0     0
> 2      2     1     1     1     1     1     1     1     1
> 3      3     0     0     0     0     0     0     0     0
> 4      4     1     0     0     0     0     0     0     0
> 5      5     1     0     0     0     0     0     0     0
> 6      6     1     0     0     0     0     0     0     0
>
> A number of the data are incorrect. For example, that for bush 145 in year
> three is recorded as dead="0" when it should be alive ="1." The bushes do
> not come back to life after they die.
>
>> lwf[lwf$bushno==145,]
>     bushno bout1 bout2 bout3 bout4 bout5 bout6 bout7 bout8
> 144    145     1     1     0     1     1     1     1     1
>
>
> I know that I can do this with fix(lwf) or edit(lwf). However, I would like
> to learn some more R.
> What code could I use to correct these data?
rle() is a function that records the lengths and values of runs. A bush
that dies once and stays dead has at most two runs (1s, then 0s), so your
problem is to find rows where the rle-encoded data has more than two
runs. Perhaps something like:
apply(lwf[ , -1], 1, function(x){ length( rle(x)$values ) >2 } )
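A sketch of what that test flags, on a small made-up data frame (toy is
invented for illustration and is not the poster's lwf). One caveat: a row
such as 0 1 1 1 has only two runs, so the run count misses it. If a 0
followed by a 1 also counts as an error in your data (an assumption), then
testing for any upward step with diff() may be the safer check:

```r
# Toy data, made up for illustration (not the poster's lwf)
toy <- data.frame(bushno = 1:4,
                  bout1 = c(1, 1, 0, 0),
                  bout2 = c(1, 0, 1, 0),
                  bout3 = c(0, 1, 1, 0),
                  bout4 = c(0, 1, 1, 0))

# More than two runs: catches 1 0 1 1 but not 0 1 1 1
too_many_runs <- apply(toy[ , -1], 1,
                       function(x) length(rle(x)$values) > 2)

# Any 0 -> 1 step: catches both kinds of "resurrection"
revived <- apply(toy[ , -1], 1, function(x) any(diff(x) > 0))

toy$bushno[too_many_runs]   # bush numbers flagged by the run count
toy$bushno[revived]         # bush numbers flagged by the diff() test
```

Either logical vector can be used to pull out the suspect rows for
inspection, e.g. toy[revived, ].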
>
> I have been screwing around with such as
> lwfb[(lwf$bushno==145) & (lwf$bout3==0),0]<- lwf[(lwf$bushno==145) & (lwf$bout3==0),1]
> to no avail.
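Two things in that attempt likely work against you: column index 0 selects
no columns at all, so there is nothing to assign into, and column 1 on the
right-hand side is bushno, not a bout. A minimal sketch on a made-up data
frame (df, a and b are hypothetical names):

```r
# Made-up example; df, a and b are hypothetical names
df <- data.frame(a = 1:3, b = 4:6)

# Column index 0 selects zero columns, so the target is empty
empty <- df[df$a == 2, 0]
ncol(empty)

# Naming the column to change gives the assignment a real target
df[df$a == 2, "b"] <- 99
df
```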
If all you want to do is correct these by hand then:
lwf[lwf$bushno==145 , "bout3"] <- 1
Or if you want to work on a copy (safer):
lwfb <- lwf
lwfb[lwfb$bushno==145 , "bout3"] <- 1
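If there are several cells to fix, one possible approach is to keep the
corrections in a small table and apply them in a loop. This is only a
sketch: the stand-in lwf below is made up, and the fixes data frame with
its column names (bushno, bout, value) is a hypothetical layout.

```r
# Stand-in for lwf, made up for illustration (only a few columns)
lwf <- data.frame(bushno = c(144, 145, 146),
                  bout2  = c(1, 1, 0),
                  bout3  = c(1, 0, 1),
                  bout4  = c(0, 1, 1))
lwfb <- lwf                      # work on a copy

# Hypothetical table of corrections: one row per cell to change
fixes <- data.frame(bushno = c(145, 146),
                    bout   = c("bout3", "bout2"),
                    value  = c(1, 1),
                    stringsAsFactors = FALSE)

# Apply each correction to the copy
for (i in seq_len(nrow(fixes))) {
  lwfb[lwfb$bushno == fixes$bushno[i], fixes$bout[i]] <- fixes$value[i]
}

lwfb[lwfb$bushno %in% fixes$bushno, ]   # inspect the corrected rows
```

Keeping the corrections in a table also leaves a record of what was
changed, which fix() and edit() do not.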
> Any help appreciated. Thanks, MN
David Winsemius, MD
West Hartford, CT