[R] Is there a funct to sum differences?

Bert Gunter bgunter.4567 at gmail.com
Sun Dec 25 22:46:39 CET 2016


Arthur:

There are many good R tutorials on the web; see here, for example, for some
recommendations:

https://www.rstudio.com/online-learning/#R

I think you would do better to go through some structured tutorials
than to continue fooling around in this way on your own. Just my humble
opinion, of course.

-- Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Dec 25, 2016 at 1:34 PM, arthur brogard via R-help
<r-help at r-project.org> wrote:
> Hi,
>
> thanks for this.
>
>
> quote
> If you want to do this kind of simple data management in R, it helps to learn some R programming.
> unquote
>
> yep. I'm trying. we learn by doing, I think, that's what this is all about.
>
> I was a programmer years ago.  Now I want to find a language to work in for the present day and my present interests.
>
>
> :)
>
> ab
>
>
>
>
>
>
> ----- Original Message -----
> From: "Fox, John" <jfox at mcmaster.ca>
> To: arthur brogard <abrogard at yahoo.com>
> Cc: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>; "r-help at r-project.org" <r-help at r-project.org>
> Sent: Monday, 26 December 2016, 2:23
> Subject: RE: [R] Is there a funct to sum differences?
>
> Dear Arthur,
>
> Neither my nor Jeff Newmiller's solution uses any fancy math, just a little bit of programming. Here are the two solutions on a much larger simulated problem:
>
>> set.seed(12345) # for reproducibility
>> x <- rnorm(1e5)
>> len <- length(x)
>> maxlag <- 100
>>
>> # John:
>> system.time(
> +     {
> +         diffs <- matrix(0, len, maxlag)
> +         for (lag in 1:maxlag){
> +             diffs[1:(len - lag), lag] <- diff(x, lag=lag)
> +         }
> +     }
> + )
>    user  system elapsed
>    0.22    0.19    0.41
>> head(rowSums(diffs))
> [1] -34.39477 -48.65417  33.75448  67.30261 -39.10066 204.56559
>>
>> # Jeff:
>> system.time(
> +     diffs.2 <- embed(c(x, rep(NA, maxlag)), maxlag + 1) - x
> + )
>    user  system elapsed
>    0.36    0.04    0.39
>> head(rowSums(diffs.2, na.rm=TRUE))
> [1] -34.39477 -48.65417  33.75448  67.30261 -39.10066 204.56559
>
> My solution uses a loop, Jeff's uses the embed() function -- of which I was unaware -- which hides the loop in the function.
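
To make the embed() step concrete, here is a small sketch on a toy vector. The names (padded, lagged, diffs_embed, row_sums) and the toy data are illustrative assumptions, not part of the posts quoted here. It relies only on base R: embed() lays out lagged copies of the NA-padded vector as columns (largest lag first), so subtracting x from every column yields the lagged differences, with a final column of zeros for lag 0.

```r
x <- c(1, 4, 9, 16, 25)            # toy data: squares of 1..5 (assumed example)
maxlag <- 2

# Pad with NA so that every element of x gets its own row.
padded <- c(x, rep(NA, maxlag))

# Row i of 'lagged' is x[i + maxlag], ..., x[i + 1], x[i].
lagged <- embed(padded, maxlag + 1)

# x is recycled down each column, so row i has x[i] subtracted throughout.
diffs_embed <- lagged - x

# Sum the available differences per row; NA marks lags that run off the end.
row_sums <- rowSums(diffs_embed, na.rm = TRUE)

print(diffs_embed)
print(row_sums)
```

The last column is always zero (x[i] minus itself), so it does not affect the row sums.
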
>
> If you want to do this kind of simple data management in R, it helps to learn some R programming.
>
> Best,
> John
>
>
>> -----Original Message-----
>> From: arthur brogard [mailto:abrogard at yahoo.com]
>> Sent: Saturday, December 24, 2016 4:24 PM
>> To: Fox, John <jfox at mcmaster.ca>
>> Cc: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>; r-help at r-project.org
>> Subject: Re: [R] Is there a funct to sum differences?
>>
>>
>>
>> Hello John,
>>
>>
>> Here I am back again. I have learned no maths yet, but I've looked over
>> the results here and they are what I am after.
>>
>> Now I'll try to understand how you did it.
>>
>> :)
>>
>>
>>
>>
>> ----- Original Message -----
>> From: "Fox, John" <jfox at mcmaster.ca>
>> To: arthur brogard <abrogard at yahoo.com>
>> Cc: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>; "r-help at r-project.org"
>> <r-help at r-project.org>
>> Sent: Sunday, 25 December 2016, 0:55
>> Subject: RE: [R] Is there a funct to sum differences?
>>
>> Dear Arthur,
>>
>> Here's a simple script to do what I think you want. I've applied it to a
>> contrived example, a vector of the squares of the integers 1 to 25, and
>> have summed the first 5 differences, but the script is adaptable to any
>> numeric vector and any maximum lag. You'll have to decide what to do
>> with the last maximum-lag (in my case, 5) entries:
>>
>> -------------- snip ------------
>> > (x <- (1:25)^2)
>>  [1]   1   4   9  16  25  36  49  64  81 100 121 144 169 196 225 256 289
>> [18] 324 361 400 441 484 529 576 625
>> > len <- length(x)
>> > maxlag <- 5
>> > diffs <- matrix(0, len, maxlag)
>> > for (lag in 1:maxlag){
>> +     diffs[1:(len - lag), lag] <- diff(x, lag=lag) }
>> > head(diffs)
>>      [,1] [,2] [,3] [,4] [,5]
>> [1,]    3    8   15   24   35
>> [2,]    5   12   21   32   45
>> [3,]    7   16   27   40   55
>> [4,]    9   20   33   48   65
>> [5,]   11   24   39   56   75
>> [6,]   13   28   45   64   85
>> > tail(diffs)
>>       [,1] [,2] [,3] [,4] [,5]
>> [20,]   41   84  129  176  225
>> [21,]   43   88  135  184    0
>> [22,]   45   92  141    0    0
>> [23,]   47   96    0    0    0
>> [24,]   49    0    0    0    0
>> [25,]    0    0    0    0    0
>> > rowSums(diffs)
>>  [1]  85 115 145 175 205 235 265 295 325 355 385 415 445 475 505 535 565
>> [18] 595 625 655 450 278 143  49   0
>> -------------- snip ------------
>>
>> The script could very simply be converted into a function if this is a
>> repetitive task with variable inputs.
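
As one possible way to do that conversion, the loop from the script could be wrapped roughly as follows. The function name sum_lagged_diffs and the argument checks are assumptions made for illustration; the body is the same loop, and rows near the end still sum only the lags that exist.

```r
# Hypothetical helper wrapping the loop from the script above.
sum_lagged_diffs <- function(x, maxlag) {
  # Assumed sanity checks; not part of the original script.
  stopifnot(is.numeric(x), maxlag >= 1, maxlag < length(x))
  len <- length(x)
  diffs <- matrix(0, len, maxlag)          # one column per lag, zero-filled
  for (lag in seq_len(maxlag)) {
    # diff(x, lag) has len - lag entries; the remaining rows stay at 0.
    diffs[1:(len - lag), lag] <- diff(x, lag = lag)
  }
  rowSums(diffs)
}

res <- sum_lagged_diffs((1:25)^2, 5)
print(res)
```

Called on the squares of 1 to 25 with a maximum lag of 5, this reproduces the rowSums(diffs) output shown in the script.
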
>>
>> I hope this helps,
>> John
>>
>> -----------------------------
>> John Fox, Professor
>> McMaster University
>> Hamilton, Ontario
>> Canada L8S 4M4
>> Web: socserv.mcmaster.ca/jfox
>>
>>
>>
>> > -----Original Message-----
>> > From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of arthur
>> > brogard via R-help
>> > Sent: December 24, 2016 12:29 AM
>> > To: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
>> > Cc: r-help at r-project.org
>> > Subject: Re: [R] Is there a funct to sum differences?
>> >
>> > Yes, sorry about that.  I keep making mistakes I shouldn't make.
>> >
>> > Thanks for the tip about 'reply all', I had no idea.
>> >
>> > You can ignore the finalone. I have been doing other work on this and
>> > it comes from there. I took the example from the R screen after it had
>> > run one of these other things that created the finalone.
>> >
>> > I guess I was thinking that just seeing the data mentioned in the code would
>> > be enough.
>> >
>> > I don't want a function to do the division and multiplication.
>> >
>> > It's a function that will "... automatically sum the difference between
>> > the first and subsequent to the end of a list?" that I am looking for.
>> >
>> > I will try to explain, I know I often don't make myself clear:
>> >
>> > I'm using this diff() function.
>> >
>> > This 'diff()' function finds the difference between two adjoining
>> > entries and it applies itself to the whole list so that in an instant
>> > I can have a list of differences between any two adjoining.
>> >
>> > Then I can have a list of differences between any two with any
>> > specified gap - 'lag' it is called.
>> > Using the same function.
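
For readers unfamiliar with diff(), a short base-R illustration of the behaviour described here; the toy vector and the names d1, d2, and d1_padded are assumed for the example only.

```r
x <- c(1, 4, 9, 16, 25)       # toy data (assumed example)

d1 <- diff(x)                 # adjoining entries: x[i + 1] - x[i]  -> 3 5 7 9
d2 <- diff(x, lag = 2)        # gap of two:        x[i + 2] - x[i]  -> 8 12 16

# The result is shorter than x by 'lag' entries, which is why the code in
# this thread pads with NA before storing it as a data frame column.
d1_padded <- c(d1, NA)

print(d1)
print(d2)
print(length(d1_padded) == length(x))
```
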
>> >
>> > Now I have them and do that.  Then I add them together to find the
>> > 'lastone', which is the total difference for the period.
>> >
>> >
>> > Now here's the point:  that covers a period of two timespans; months,
>> > they are.
>> >
>> > If I want to cover a span of 24 months, say, then I would have to
>> > write this diff() function 24 times.
>> >
>> >  what I'm doing is finding the difference between the starting point
>> > and every other point and then adding them all together.  bit like
>> > finding the area beneath the curve maybe.
>> >
>> >  And that's what I want to do.
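
If this description is read as "for each starting point i, add up x[i + k] - x[i] for k = 1 up to some maximum lag", then the sum equals the total of the next values minus the number of lags times x[i]. Under that reading, a loop-free sketch using cumsum() is possible. This is an alternative technique not proposed in the thread, and the function name sum_diffs_cumsum is hypothetical. Rows near the end sum only the lags that exist.

```r
# Hypothetical helper: sum over k = 1..maxlag of (x[i + k] - x[i]),
# truncated at the end of x, computed from cumulative sums.
sum_diffs_cumsum <- function(x, maxlag) {
  n <- length(x)
  i <- seq_len(n)
  upper <- pmin(i + maxlag, n)   # last index reachable from position i
  s <- cumsum(x)
  # (sum of x[(i + 1):upper]) minus (number of lags used) * x[i]
  (s[upper] - s[i]) - (upper - i) * x[i]
}

res_cumsum <- sum_diffs_cumsum((1:25)^2, 5)
print(res_cumsum)
```

For very long vectors of non-integer data, the cumulative sums can accumulate rounding error, so results may differ slightly from a direct loop.
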
>> >
>> >  :)
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > ----- Original Message -----
>> > From: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
>> > To: arthur brogard <abrogard at yahoo.com>
>> > Cc: r-help at r-project.org
>> > Sent: Saturday, 24 December 2016, 15:34
>> > Subject: Re: [R] Is there a funct to sum differences?
>> >
>> > You need to "reply all" so other people can help as well, and others
>> > can learn from your questions.
>> >
>> > I am still puzzled by how you expect to compute "finalone". If you had
>> > supplied numbers other than all 5's it might have been easier to
>> > figure out what is going on.
>> >
>> > What is your purpose in performing this calculation?
>> >
>> > #### reproducible code
>> > rates <- read.table( text =
>> > "Date          Int
>> > Jan-1959        5
>> > Feb-1959        5
>> > Mar-1959        5
>> > Apr-1959        5
>> > May-1959        5
>> > Jun-1959        5
>> > Jul-1959        5
>> > Aug-1959        5
>> > Sep-1959        5
>> > Oct-1959        5
>> > Nov-1959        5
>> > ", header = TRUE, colClasses = c( "character", "numeric" ) )
>> >
>> > #your code
>> > rates$thisone <- c(diff(rates$Int), NA)
>> > rates$nextone <- c(diff(rates$Int, lag=2), NA, NA)
>> > rates$lastone <- (rates$thisone + rates$nextone)/6.5*1000
>> > # I doubt there is a ready-built function that knows you want to
>> > # divide by 6.5 or multiply by 1000
>> >
>> > # form a vector from positions 2:11 and append NA
>> > rates$experiment1 <- rates$Int + c( rates$Int[ -1 ], NA )
>> > # numbers that are not all the same
>> > rates$Int2 <- (1:11)^2
>> > rates$experiment2 <- rates$Int2 + c( rates$Int2[ -1 ], NA )
>> >
>> > # dput(rates)
>> > result <- structure(list(
>> >   Date = c("Jan-1959", "Feb-1959", "Mar-1959", "Apr-1959", "May-1959",
>> >            "Jun-1959", "Jul-1959", "Aug-1959", "Sep-1959", "Oct-1959",
>> >            "Nov-1959"),
>> >   Int = c(5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5),
>> >   thisone = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA),
>> >   nextone = c(0, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA),
>> >   lastone = c(0, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA),
>> >   Int2 = c(1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121),
>> >   experiment1 = c(10, 10, 10, 10, 10, 10, 10, 10, 10, 10, NA),
>> >   experiment2 = c(5, 13, 25, 41, 61, 85, 113, 145, 181, 221, NA)),
>> >   .Names = c("Date", "Int", "thisone", "nextone", "lastone", "Int2",
>> >              "experiment1", "experiment2"),
>> >   row.names = c(NA, -11L), class = "data.frame")
>> >
>> > On Sat, 24 Dec 2016, arthur brogard wrote:
>> >
>> > >
>> > >
>> > > Yes, sure, thanks for your interest.  I apologise for not submitting
>> > > in the correct manner.  I'll learn (I hope).
>> > >
>> > > Here's the source - a spreadsheet with just two columns, date and
>> > > 'Int'.
>> > >
>> > >
>> > > Date    Int
>> > > Jan-1959    5
>> > > Feb-1959    5
>> > > Mar-1959    5
>> > > Apr-1959    5
>> > > May-1959    5
>> > > Jun-1959    5
>> > > Jul-1959    5
>> > > Aug-1959    5
>> > > Sep-1959    5
>> > > Oct-1959    5
>> > > Nov-1959    5
>> > >
>> > >
>> > > After processing it becomes this:
>> > >
>> > >
>> > >> rates
>> > > Date   Int thisone nextone     lastone finalone
>> > > 1   1959-01-01  5.00    0.00    0.00    0.000000       10
>> > > 2   1959-02-01  5.00    0.00    0.00    0.000000       10
>> > > 3   1959-03-01  5.00    0.00    0.00    0.000000       10
>> > > 4   1959-04-01  5.00    0.00    0.00    0.000000       10
>> > > 5   1959-05-01  5.00    0.00    0.00    0.000000       10
>> > > 6   1959-06-01  5.00    0.00    0.00    0.000000       10
>> > >
>> > > The one long column I'm referring to is the 'Int' column which R has
>> > > imported.
>> > >
>> > > The actual code is:
>> > >
>> > >
>> > > rates <- read.csv("Rates2.csv",header =
>> > > TRUE,colClasses=c("character","numeric"))
>> > >
>> > > sapply(rates,class)
>> > >
>> > > rates$Date <- strptime(paste0("1-", rates$Date), format="%d-%b-%Y",
>> > > tz="UTC")
>> > >
>> > >
>> > > rates$thisone <- c(diff(rates$Int), NA)
>> > > rates$nextone <- c(diff(rates$Int, lag=2), NA, NA)
>> > > rates$lastone <- (rates$thisone + rates$nextone)/6.5*1000
>> > >
>> > >
>> > > rates
>> > >
>> > >
>> > >
>> > > ab
>> > >
>> > >
>> > >
>> > > ----- Original Message -----
>> > > From: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
>> > > To: arthur brogard <abrogard at yahoo.com>; arthur brogard via R-help
>> > > <r-help at r-project.org>; "r-help at r-project.org"
>> > > <r-help at r-project.org>
>> > > Sent: Saturday, 24 December 2016, 13:25
>> > > Subject: Re: [R] Is there a funct to sum differences?
>> > >
>> > > Could you make your example reproducible? That is, include some
>> > > sample input and output. You talk about a column of numbers and then
>> > > you seem to work with named lists and I can't reconcile your words
>> > > with the code I see.
>> > > --
>> > > Sent from my phone. Please excuse my brevity.
>> > >
>> > >
>> > > On December 23, 2016 3:40:18 PM PST, arthur brogard via R-help
>> > > <r-help at r-project.org> wrote:
>> > >> I've been looking but I can't find a function to sum difference.
>> > >>
>> > >> I have this code:
>> > >>
>> > >>
>> > >> rates$thisone <- c(diff(rates$Int), NA)
>> > >> rates$nextone <- c(diff(rates$Int, lag=2), NA, NA)
>> > >> rates$lastone <- (rates$thisone + rates$nextone)
>> > >>
>> > >>
>> > >> It is looking down one long column of numbers.
>> > >>
>> > >> It sums the difference between the first two and then between the
>> > >> first and third and so on.
>> > >>
>> > >> Can it be made to automatically sum the difference between the
>> > >> first and subsequent to the end of a list?
>> > >>
>> > >> ______________________________________________
>> > >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > >> https://stat.ethz.ch/mailman/listinfo/r-help
>> > >> PLEASE do read the posting guide
>> > >> http://www.R-project.org/posting-guide.html
>> > >> and provide commented, minimal, self-contained, reproducible code.
>> > >
>> >
>> > ---------------------------------------------------------------------------
>> > Jeff Newmiller                        The     .....       .....  Go Live...
>> > DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>> >                                        Live:   OO#.. Dead: OO#..  Playing
>> > Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>> > /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>>
>> >
>


