[R] Average 2 Columns when possible, or return available value
Phil Spector
spector at stat.berkeley.edu
Sat Jun 26 01:15:30 CEST 2010
Eric -
What you're describing is taking the mean of each row while
ignoring missing values:
> apply(DF,1,mean,na.rm=TRUE)
[1] 22.60 NaN NaN NaN NaN NaN NaN NaN 102.00 19.20
[11] 19.20 NaN NaN NaN 11.80 7.62 NaN NaN NaN NaN
[21] NaN 75.00 NaN 18.25 NaN NaN 8.44 18.00 NaN 12.90
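Note that rows where both readings are missing come back as NaN rather than NA, since the mean of zero values is undefined. If NA is wanted there, something along these lines may do. It is only a sketch: it uses a four-row subset of the posted data, and rowMeans() stands in for the apply() call since it computes the same row means.

```r
# Four rows taken from the posted data: one reading, none, both, one
DF <- data.frame(
  A = c(22.60, NA, 18.30, 8.44),
  B = c(NA,    NA, 18.2,  NA)
)

# Row means ignoring NA; rows with no readings at all give NaN
m <- rowMeans(DF, na.rm = TRUE)

# Report those all-missing rows as NA instead of NaN
m[is.nan(m)] <- NA
m
```

The single readings are kept as they are, and the row with two readings is averaged.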
If this isn't suitable for your larger problem, please describe that
problem in greater detail.
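In case the larger data frame holds several pairs of duplicate columns side by side (this is only a guess at the larger problem), a rough sketch is to group the column indices in pairs and take the row means within each pair. DF2, its column names, and its values are made up for illustration.

```r
# Hypothetical data: two measurements, each followed by its duplicate reading
DF2 <- data.frame(
  A1 = c(22.6, NA,  18.3),
  A2 = c(NA,   NA,  18.2),
  B1 = c(5.0,  NA,  7.0),
  B2 = c(7.0,  4.0, NA)
)

# Assumes adjacent columns form the pairs: group index is 1 1 2 2
grp <- rep(seq_len(ncol(DF2) / 2), each = 2)

# For each pair of columns, average across the row, ignoring NA
res <- sapply(split(seq_along(grp), grp), function(cols) {
  m <- rowMeans(DF2[, cols, drop = FALSE], na.rm = TRUE)
  m[is.nan(m)] <- NA   # both readings missing: return NA rather than NaN
  m
})
res
```

Each column of res corresponds to one pair of readings. If the pairs are not adjacent, grp would need to be built differently.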
- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spector at stat.berkeley.edu
On Fri, 25 Jun 2010, emorway wrote:
>
> Forum,
>
> Using the following data:
>
> DF<-read.table(textConnection("A B
> 22.60 NA
> NA NA
> NA NA
> NA NA
> NA NA
> NA NA
> NA NA
> NA NA
> 102.00 NA
> 19.20 NA
> 19.20 NA
> NA NA
> NA NA
> NA NA
> 11.80 NA
> 7.62 NA
> NA NA
> NA NA
> NA NA
> NA NA
> NA NA
> 75.00 NA
> NA NA
> 18.30 18.2
> NA NA
> NA NA
> 8.44 NA
> 18.00 NA
> NA NA
> 12.90 NA"),header=T)
> closeAllConnections()
>
> The second column is a duplicate reading of the first column, and when two
> values are available, I would like to average column 1 and 2 (example code
> below). But if there is only one reading, I would like to retain it, but I
> haven't found a good way to exclude NA's using the following code:
>
> t(as.matrix(aggregate(t(as.matrix(DF)),list(rep(1:1,each=2)),mean)[,-1]))
>
> Currently, row 24 is the only row with a returned value. I'd like the
> result to return column "A" if it is the only available value, and average
> where possible. Of course, if both columns are NA, NA is the only possible
> result.
>
> The result I'm after would look like this (row 24 is an avg):
>
> 22.60
> NA
> NA
> NA
> NA
> NA
> NA
> NA
> 102.00
> 19.20
> 19.20
> NA
> NA
> NA
> 11.80
> 7.62
> NA
> NA
> NA
> NA
> NA
> 75.00
> NA
> 18.25
> NA
> NA
> 8.44
> 18.00
> NA
> 12.90
>
> This is a small example from a much larger data frame, so if you're
> wondering what the deal is with list(), that will come into play for the
> larger problem I'm trying to solve.
>
> Respectfully,
> Eric
> --
> View this message in context: http://r.789695.n4.nabble.com/Average-2-Columns-when-possible-or-return-available-value-tp2269049p2269049.html
> Sent from the R help mailing list archive at Nabble.com.
>