[R] Newbie question: Statistical functions (e.g., mean, sd) in a "transform" statement?
Gavin Simpson
gavin.simpson at ucl.ac.uk
Fri Jan 19 19:53:31 CET 2007
On Fri, 2007-01-19 at 11:54 -0600, Ben Fairbank wrote:
> Greetings listeRs -
Here are two solutions, depending on whether you want NAs propagated
or dropped; I assume you want the row means:
> times3 <- transform(times, meantime = rowMeans(times))
> times3
      time1    time2     time3    time4 meantime
1 70.408543 48.92378  7.399605 95.93050 55.66561
2 17.231940 27.48530 82.962916 10.20619 34.47159
3 20.279220 10.33575 66.209290 30.71846 31.88568
4        NA 53.31993 12.398237 35.65782       NA
5  9.295965       NA 48.929201       NA       NA
6 63.966518 42.16304  1.777342       NA       NA
> times4 <- transform(times, meantime = rowMeans(times, na.rm = TRUE))
> times4
      time1    time2     time3    time4 meantime
1 70.408543 48.92378  7.399605 95.93050 55.66561
2 17.231940 27.48530 82.962916 10.20619 34.47159
3 20.279220 10.33575 66.209290 30.71846 31.88568
4        NA 53.31993 12.398237 35.65782 33.79200
5  9.295965       NA 48.929201       NA 29.11258
6 63.966518 42.16304  1.777342       NA 35.96897
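
The mean(c(time1, time2, time3, time4)) attempt in the question gives the same value in every row because c() pools all four columns into one long vector, so mean() returns a single number that transform() then recycles. For statistics that have no dedicated row-wise function (sd, for instance), one option is apply() over the rows. The following is only a sketch under that assumption: the data frame is retyped from the post, and the names pooled, m, times5 and sdtime are made up here for illustration.

```r
# Rebuild the example data from the original post.
times <- data.frame(
  time1 = c(70.408543, 17.231940, 20.279220, NA, 9.295965, 63.966518),
  time2 = c(48.92378, 27.48530, 10.33575, 53.31993, NA, 42.16304),
  time3 = c(7.399605, 82.962916, 66.209290, 12.398237, 48.929201, 1.777342),
  time4 = c(95.93050, 10.20619, 30.71846, 35.65782, NA, NA)
)

# c() pools every value into one vector, so mean() returns a single
# number; transform() would recycle it down the whole column.
pooled <- with(times, mean(c(time1, time2, time3, time4), na.rm = TRUE))
length(pooled)  # 1

# apply() with MARGIN = 1 calls the function once per row instead.
# as.matrix() is explicit here; apply() would coerce the data frame anyway.
m <- as.matrix(times)
times5 <- transform(times,
                    meantime = apply(m, 1, mean, na.rm = TRUE),
                    sdtime   = apply(m, 1, sd, na.rm = TRUE))
times5
```

Note that apply() still loops over the rows internally, so on large data frames it may not be much quicker than an explicit "for" loop; rowMeans() and rowSums() are the fast routes where they fit.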
HTH
G
>
> Given a data frame such as
>
> times
>       time1    time2     time3    time4
> 1 70.408543 48.92378  7.399605 95.93050
> 2 17.231940 27.48530 82.962916 10.20619
> 3 20.279220 10.33575 66.209290 30.71846
> 4        NA 53.31993 12.398237 35.65782
> 5  9.295965       NA 48.929201       NA
> 6 63.966518 42.16304  1.777342       NA
>
> one can use "transform" to total all or some columns, thus,
>
> times2 <- transform(times,totaltime=time1+time2+time3+time4)
>
> > times2
>       time1    time2     time3    time4 totaltime
> 1 70.408543 48.92378  7.399605 95.93050  222.6624
> 2 17.231940 27.48530 82.962916 10.20619  137.8863
> 3 20.279220 10.33575 66.209290 30.71846  127.5427
> 4        NA 53.31993 12.398237 35.65782        NA
> 5  9.295965       NA 48.929201       NA        NA
> 6 63.966518 42.16304  1.777342       NA        NA
>
> I cannot, however, find a way, other than "for" looping,
> to use statistical functions, such as mean or sd, to
> compute the new column. For example,
>
> > times2<-transform(times,meantime=(mean(c(time1,time2,time3,time4),na.rm=TRUE)))
>
> > times2
>       time1    time2     time3    time4 meantime
> 1 70.408543 48.92378  7.399605 95.93050 45.54178
> 2 17.231940 27.48530 82.962916 10.20619 45.54178
> 3 20.279220 10.33575 66.209290 30.71846 45.54178
> 4        NA 53.31993 12.398237 35.65782 45.54178
> 5  9.295965       NA 48.929201       NA 45.54178
> 6 63.966518 42.16304  1.777342       NA 45.54178
>
> How can this be done? And, generally, what is the recommended method
> for creating computed new columns in data frames when "for" loops take
> too long?
>
> With thanks for any suggestions,
>
> Ben Fairbank
>
> Using version 2.4.1 on a Windows XP professional operating system.
>
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography, [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%