[R] Applying a certain formula to a repeated sample data

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Wed Nov 28 07:10:12 CET 2018


Thank you for providing a clarifying example. I think a useful function
for you to get familiar with is the "ave" function. It is kind of like
aggregate, except that it is meant for operations that return the same
number of elements for each group as were given to them.
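
To make that difference concrete, here is a minimal sketch using made-up
toy vectors (x and g are illustrative only, not your data). It compares
tapply, which like aggregate collapses each group to one value, with ave,
which hands back one value per input element:

```r
# toy data, made up for illustration: two groups of three values
x <- c( 1, 2, 3, 10, 20, 30 )
g <- c( 1, 1, 1, 2, 2, 2 )

# tapply (like aggregate) returns one value per group
group_means <- tapply( x, g, mean )

# ave returns one value per element, in the original order
centered <- ave( x, g, FUN = function( v ) v - mean( v ) )

length( group_means )  # 2, one per group
length( centered )     # 6, same as length( x )
```

Because the ave result lines up with the input, it can be assigned
directly to a new column of the same data frame.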

Also, in the future please figure out how to tell gmail to send plain text 
to the mailing list instead of HTML. You were lucky this time, but often 
HTML email gets horribly mangled as it goes through the mailing list and 
gets all the formatting removed.

###############################
dta <- read.table( text =
"n CR WW
1 8590 12516
2 8641 98143
3 8705 98916
4 8750 89911
5 8685 104835
6 8629 121963
7 8676 77655
1 8577 81081
2 8593 83385
3 8642 112164
4 8708 103684
5 8622 83982
6 8593 75944
7 8600 97036
1 8650 104911
2 8730 114098
3 8731 99421
4 8715 85707
5 8717 81273
6 8739 106462
7 8684 110635
1 8713 105214
2 8771 92456
3 8759 109270
4 8762 99150
5 8730 77306
6 8780 86324
7 8804 90214
1 8797 99894
2 8863 95177
3 8873 95910
4 8827 108511
5 8806 115636
6 8869 85542
7 8854 111018
1 8571 93247
2 8533 85105
3 8553 114725
4 8561 122195
5 8532 100945
6 8560 108552
7 8634 108707
1 8646 117420
2 8633 113823
3 8680 82763
4 8765 121072
5 8756 89835
6 8750 104578
7 8790 88429
",header=TRUE)

# one way to make a grouping vector
dta$G <- cumsum( c( 1, diff( dta$n ) < 0 ) )
# your operation: percent deviation of each value from the mean
fn <- function( x ) {
   m <- mean( x )
   ( x - m ) / m * 100
}
# your operation, computing for each group
gn <- function( x, g ) {
   ave( x, g, FUN = fn )
}
# do the computations
dta$CRpct <- gn( dta$CR, dta$G )
dta$WWpct <- gn( dta$WW, dta$G )
dta
#>    n   CR     WW G        CRpct       WWpct
#> 1  1 8590  12516 1 -0.899861560 -85.4932369
#> 2  2 8641  98143 1 -0.311490540  13.7533758
#> 3  3 8705  98916 1  0.426857407  14.6493272
#> 4  4 8750  89911 1  0.946008306   4.2120148
#> 5  5 8685 104835 1  0.196123673  21.5097882
#> 6  6 8629 121963 1 -0.449930780  41.3621243
#> 7  7 8676  77655 1  0.092293493  -9.9933934
#> 8  1 8577  81081 2 -0.490594182 -10.9385886
#> 9  2 8593  83385 2 -0.304963951  -8.4078170
#> 10 3 8642 112164 2  0.263528632  23.2037610
#> 11 4 8708 103684 2  1.029253336  13.8891155
#> 12 5 8622  83982 2  0.031490843  -7.7520572
#> 13 6 8593  75944 2 -0.304963951 -16.5811987
#> 14 7 8600  97036 2 -0.223750725   6.5867850
#> 15 1 8650 104911 3 -0.682347538   4.5366096
#> 16 2 8730 114098 3  0.236197225  13.6908244
#> 17 3 8731  99421 3  0.247679034  -0.9337985
#> 18 4 8715  85707 3  0.063970082 -14.5988581
#> 19 5 8717  81273 3  0.086933701 -19.0170347
#> 20 6 8739 106462 3  0.339533510   6.0820746
#> 21 7 8684 110635 3 -0.291966014  10.2401827
#> 22 1 8713 105214 4 -0.534907614  11.6017662
#> 23 2 8771  92456 4  0.127203640  -1.9307991
#> 24 3 8759 109270 4 -0.009784895  15.9040146
#> 25 4 8762  99150 4  0.024462238   5.1696079
#> 26 5 8730  77306 4 -0.340840523 -18.0005879
#> 27 6 8780  86324 4  0.229945042  -8.4350859
#> 28 7 8804  90214 4  0.503922112  -4.3089157
#> 29 1 8797  99894 5 -0.500896767  -1.7465519
#> 30 2 8863  95177 5  0.245600995  -6.3860849
#> 31 3 8873  95910 5  0.358706717  -5.6651229
#> 32 4 8827 108511 5 -0.161579602   6.7289318
#> 33 5 8806 115636 5 -0.399101617  13.7369184
#> 34 6 8869  85542 5  0.313464428 -15.8628500
#> 35 7 8854 111018 5  0.143805846   9.1947595
#> 36 1 8571  93247 6  0.088415855 -11.0088128
#> 37 2 8533  85105 6 -0.355331643 -18.7792102
#> 38 3 8553 114725 6 -0.121780328   9.4889267
#> 39 4 8561 122195 6 -0.028359802  16.6179943
#> 40 5 8532 100945 6 -0.367009209  -3.6621512
#> 41 6 8560 108552 6 -0.040037368   3.5976637
#> 42 7 8634 108707 6  0.824102496   3.7455895
#> 43 1 8646 117420 7 -0.816125860  14.4890796
#> 44 2 8633 113823 7 -0.965257293  10.9818643
#> 45 3 8680  82763 7 -0.426089807 -19.3028471
#> 46 4 8765 121072 7  0.549000328  18.0499220
#> 47 5 8756  89835 7  0.445755490 -12.4073713
#> 48 6 8750 104578 7  0.376925598   1.9676287
#> 49 7 8790  88429 7  0.835791544 -13.7782761

#' Created on 2018-11-27 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0).
###############################
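
In case the grouping line looks cryptic, the sketch below walks through it
one step at a time on a short made-up counter (the names d, restart, and
cr1 are only for illustration). It assumes, as in your data, that n
increases within each group and drops back when a new group starts. The
last part recomputes the first week of CR by hand to show how the percent
values relate to the fractions you posted.

```r
# toy counter that restarts at 1, standing in for dta$n (made-up values)
n <- c( 1, 2, 3, 1, 2, 3, 1, 2 )
# diff is negative only where the counter drops back to 1
d <- diff( n )
# flag the first row, plus every row where a new group starts
restart <- c( 1, d < 0 )
# the running total of the flags numbers the groups 1, 2, 3, ...
G <- cumsum( restart )
G
#> [1] 1 1 1 2 2 2 3 3

# first week of CR, copied from the data above
cr1 <- c( 8590, 8641, 8705, 8750, 8685, 8629, 8676 )
# the fraction form you computed by hand
frac <- ( cr1 - mean( cr1 ) ) / mean( cr1 )
# the same thing in percent, which is what fn returns for this group
pct <- frac * 100
```

So CRpct is 100 times the fractions in your earlier message; drop the
"* 100" in fn if you prefer the fraction form.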

On Wed, 28 Nov 2018, Ogbos Okike wrote:

> Dear Jim,
>
> I don't think my problem is clear the way I put it.
>
> I have been trying to manually apply the formula to some rows.
>
> This is what I have done.
> I cut and pasted some rows from 1-7 and saved each in a different file as
> shown below:
>
> 1 8590 12516
> 2 8641 98143
> 3 8705 98916
> 4 8750 89911
> 5 8685 104835
> 6 8629 121963
> 7 8676 77655
>
>
> 1 8577 81081
> 2 8593 83385
> 3 8642 112164
> 4 8708 103684
> 5 8622 83982
> 6 8593 75944
> 7 8600 97036
>
>
> 1 8650 104911
> 2 8730 114098
> 3 8731 99421
> 4 8715 85707
> 5 8717 81273
> 6 8739 106462
> 7 8684 110635
>
>
> 1 8713 105214
> 2 8771 92456
> 3 8759 109270
> 4 8762 99150
> 5 8730 77306
> 6 8780 86324
> 7 8804 90214
>
>
> 1 8797 99894
> 2 8863 95177
> 3 8873 95910
> 4 8827 108511
> 5 8806 115636
> 6 8869 85542
> 7 8854 111018
>
>
> 1 8571 93247
> 2 8533 85105
> 3 8553 114725
> 4 8561 122195
> 5 8532 100945
> 6 8560 108552
> 7 8634 108707
>
>
> 1 8646 117420
> 2 8633 113823
> 3 8680 82763
> 4 8765 121072
> 5 8756 89835
> 6 8750 104578
> 7 8790 88429
>
> Each of them is then read as:
> d1<-read.table("dat1",col.names=c("n","CR","WW"))
> d2<-read.table("dat2",col.names=c("n","CR","WW"))
> d3<-read.table("dat3",col.names=c("n","CR","WW"))
> d4<-read.table("dat4",col.names=c("n","CR","WW"))
> d5<-read.table("dat5",col.names=c("n","CR","WW"))
> d6<-read.table("dat6",col.names=c("n","CR","WW"))
> d7<-read.table("dat7",col.names=c("n","CR","WW"))
>
> And my formula for percentage change applied as follows for column 2:
> a1<-((d1$CR-mean(d1$CR))/mean(d1$CR))*100
> a2<-((d2$CR-mean(d2$CR))/mean(d2$CR))*100
> a3<-((d3$CR-mean(d3$CR))/mean(d3$CR))*100
> a4<-((d4$CR-mean(d4$CR))/mean(d4$CR))*100
> a5<-((d5$CR-mean(d5$CR))/mean(d5$CR))*100
> a6<-((d6$CR-mean(d6$CR))/mean(d6$CR))*100
> a7<-((d7$CR-mean(d7$CR))/mean(d7$CR))*100
>
> a1-a7 actually gives percentage change in the data.
>
> Instead of doing this one after the other, can you please give an
> indication on how I may apply this formula to the data frame with probably
> a code.
>
> Thank you again.
>
> Best
> Ogbos
>
> On Wed, Nov 28, 2018 at 5:15 AM Ogbos Okike <giftedlife2014 using gmail.com>
> wrote:
>
>> Dear Jim,
>>
>> I wish also to use the means calculated and apply a certain formula on
>> the same data frame. In particular, I would like to subtract the mean of
>> each seven-day group from each of its seven values and divide the
>> outcome by the same mean. If m1 is the mean of each seven-day group
>> in column 1 and c1 is the column 1 data, my formula will be of
>> the form:
>> aa<-(c1-m1)/m1.
>>
>> I tried it on the first 7 rows and I have what I am looking for:
>>  -0.0089986156
>>   -0.0031149054
>>    0.0042685741
>>    0.0094600831
>>    0.0019612367
>>   -0.0044993078
>>    0.0009229349
>>
>> But doing it manually will take much time.
>>
>> Many thanks for going a step further to assist me.
>>
>> Warmest regards.
>> Ogbos
>>
>> On Wed, Nov 28, 2018 at 4:31 AM Jim Lemon <drjimlemon using gmail.com> wrote:
>>
>>> Hi Ogbos,
>>> If we assume that you have a 3 column data frame named oodf, how about:
>>>
>>> oodf[,4]<-floor((cumsum(oodf[,1])-1)/28)
>>> col2means<-by(oodf[,2],oodf[,4],mean)
>>> col3means<-by(oodf[,3],oodf[,4],mean)
>>>
>>> Jim
>>>
>>> On Wed, Nov 28, 2018 at 2:06 PM Ogbos Okike <giftedlife2014 using gmail.com>
>>> wrote:
>>>>
>>>> Dear List,
>>>> I have three data-column data. The data is of the form:
>>>> 1 8590 12516
>>>> 2 8641 98143
>>>> 3 8705 98916
>>>> 4 8750 89911
>>>> 5 8685 104835
>>>> 6 8629 121963
>>>> 7 8676 77655
>>>> 1 8577 81081
>>>> 2 8593 83385
>>>> 3 8642 112164
>>>> 4 8708 103684
>>>> 5 8622 83982
>>>> 6 8593 75944
>>>> 7 8600 97036
>>>> 1 8650 104911
>>>> 2 8730 114098
>>>> 3 8731 99421
>>>> 4 8715 85707
>>>> 5 8717 81273
>>>> 6 8739 106462
>>>> 7 8684 110635
>>>> 1 8713 105214
>>>> 2 8771 92456
>>>> 3 8759 109270
>>>> 4 8762 99150
>>>> 5 8730 77306
>>>> 6 8780 86324
>>>> 7 8804 90214
>>>> 1 8797 99894
>>>> 2 8863 95177
>>>> 3 8873 95910
>>>> 4 8827 108511
>>>> 5 8806 115636
>>>> 6 8869 85542
>>>> 7 8854 111018
>>>> 1 8571 93247
>>>> 2 8533 85105
>>>> 3 8553 114725
>>>> 4 8561 122195
>>>> 5 8532 100945
>>>> 6 8560 108552
>>>> 7 8634 108707
>>>> 1 8646 117420
>>>> 2 8633 113823
>>>> 3 8680 82763
>>>> 4 8765 121072
>>>> 5 8756 89835
>>>> 6 8750 104578
>>>> 7 8790 88429
>>>>
>>>> I wish to calculate the average of the second and third columns based on
>>>> the first column for each repeated 7 days. The length of the data is 1442.
>>>> That is 206 by 7. So I should arrive at 206 data points for each of the
>>>> two columns after calculating the mean of each group 1-7.
>>>>
>>>> I have tried both factor/tapply and aggregate functions but seem not to
>>>> be making progress.
>>>>
>>>> Thank you very much for your idea.
>>>>
>>>> Best wishes
>>>> Ogbos
>>>>
>>>>         [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil using dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


