[R] correcting a few data in a large data frame

David Winsemius dwinsemius at comcast.net
Tue Jun 1 00:28:22 CEST 2010


On May 31, 2010, at 5:29 PM, Mr. Natural wrote:

>
> The data frame is lwf, which records the survival of bushes over an
> 8-year period. Years are called bouts. Dead bushes are recorded as
> zeros, and live bushes as ones.
> str(lwf)
> 'data.frame':   638 obs. of  9 variables:
> $ bushno: int  1 2 3 4 5 6 7 8 9 10 ...
> $ bout1 : int  0 1 0 1 1 1 0 1 0 1 ...
> $ bout2 : int  0 1 0 0 0 0 0 0 0 1 ...
> $ bout3 : int  0 1 0 0 0 0 0 0 0 1 ...
> $ bout4 : int  0 1 0 0 0 0 0 0 0 0 ...
> $ bout5 : int  0 1 0 0 0 0 0 0 0 0 ...
> $ bout6 : int  0 1 0 0 0 0 0 0 0 0 ...
> $ bout7 : int  0 1 0 0 0 0 0 0 0 0 ...
> $ bout8 : int  0 1 0 0 0 0 0 0 0 0 ...
>
> head(lwf)
>  bushno bout1 bout2 bout3 bout4 bout5 bout6 bout7 bout8
> 1      1     0     0     0     0     0     0     0     0
> 2      2     1     1     1     1     1     1     1     1
> 3      3     0     0     0     0     0     0     0     0
> 4      4     1     0     0     0     0     0     0     0
> 5      5     1     0     0     0     0     0     0     0
> 6      6     1     0     0     0     0     0     0     0
>
> A number of the data are incorrect. For example, bush 145 in year
> three is recorded as dead (0) when it should be alive (1). The bushes
> do not come back to life after they die.
>
>> lwf[lwf$bushno==145,]
>    bushno bout1 bout2 bout3 bout4 bout5 bout6 bout7 bout8
> 144    145     1     1     0     1     1     1     1     1
>
>
> I know that I can do this with fix(lwf) or edit(lwf). However, I
> would like to learn some more R.
> What code could I use to correct these data?

rle() is a function that returns the lengths and values of runs of
equal values. A consistent record is a run of ones followed by a run
of zeros, so your problem is to find rows where the rle-encoded data
has more than two runs. Perhaps something like:

apply(lwf[ , -1], 1, function(x){ length( rle(x)$values ) >2 } )
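
To see what that test does, here is a minimal sketch on a made-up data
frame (toy and its values are hypothetical stand-ins for lwf, with only
four bouts). It also adds a second check, assuming that any 0 followed
by a 1 is an error, since a row such as 0 1 1 1 has only two runs and
would slip past the first test:

```r
# Toy stand-in for lwf (hypothetical values, four bouts only)
toy <- data.frame(
  bushno = c(1, 2, 145, 200),
  bout1  = c(0, 1, 1, 0),
  bout2  = c(0, 1, 1, 1),
  bout3  = c(0, 0, 0, 1),
  bout4  = c(0, 0, 1, 1)
)

# rle() on a single record: 1 1 0 1 has three runs
rle(c(1, 1, 0, 1))$values

# More than two runs, the same test as the apply() call above
too_many_runs <- apply(toy[, -1], 1, function(x) length(rle(x)$values) > 2)

# Any step from 0 up to 1 (a positive difference between adjacent bouts)
revived <- apply(toy[, -1], 1, function(x) any(diff(x) > 0))

# Bush numbers flagged by each test
toy$bushno[too_many_runs]
toy$bushno[revived]
```

Indexing bushno with the logical vector gives the bushes to inspect,
rather than a long vector of TRUE/FALSE values.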

>
> I have been screwing around with such as
> lwfb[(lwf$bushno==145) & (lwf$bout3==0),0]<- lwf[(lwf$bushno==145) &
> (lwf$bout3==0),1]
> to no avail.

If all you want to do is correct these by hand then:

lwf[lwf$bushno==145 , "bout3"] <- 1

Or if you want to work on a copy (safer):

lwfb <- lwf
lwfb[lwfb$bushno==145 , "bout3"] <- 1
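
If there are several such cells, the same assignment can be driven from
a small table of corrections. This is only a sketch: the data below are
made up, and the fixes data frame with its column names (bushno, bout,
value) is a hypothetical layout, not something from your data.

```r
# Toy stand-in for lwf (hypothetical values, four bouts only)
lwf <- data.frame(
  bushno = c(144, 145, 146),
  bout1  = c(1, 1, 1),
  bout2  = c(1, 1, 0),
  bout3  = c(0, 0, 1),
  bout4  = c(0, 1, 1)
)

# Hypothetical table of known errors: bush, column to fix, correct value
fixes <- data.frame(
  bushno = c(145, 146),
  bout   = c("bout3", "bout2"),
  value  = c(1, 1),
  stringsAsFactors = FALSE
)

lwfb <- lwf   # work on a copy, leaving lwf untouched

# Apply each correction with the same row/column assignment as above
for (k in seq_len(nrow(fixes))) {
  lwfb[lwfb$bushno == fixes$bushno[k], fixes$bout[k]] <- fixes$value[k]
}

# Compare the corrected rows with the originals
lwfb[lwfb$bushno %in% fixes$bushno, ]
lwf[lwf$bushno %in% fixes$bushno, ]
```

Keeping the corrections in a table also leaves a record of what was
changed, which fix() and edit() do not.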

> Any help appreciated. Thanks, MN

David Winsemius, MD
West Hartford, CT


