[R] Is there a funct to sum differences?
arthur brogard
abrogard at yahoo.com
Sun Dec 25 22:34:06 CET 2016
Hi,
thanks for this.
quote
If you want to do this kind of simple data management in R, it helps to learn some R programming.
unquote
yep. I'm trying. we learn by doing, I think, that's what this is all about.
I was a programmer years ago. Now I want to find a language to work in for the present day and my present interests.
:)
ab
----- Original Message -----
From: "Fox, John" <jfox at mcmaster.ca>
To: arthur brogard <abrogard at yahoo.com>
Cc: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>; "r-help at r-project.org" <r-help at r-project.org>
Sent: Monday, 26 December 2016, 2:23
Subject: RE: [R] Is there a funct to sum differences?
Dear Arthur,
Neither my nor Jeff Newmiller's solution uses any fancy math, just a little bit of programming. Here are the two solutions on a much larger simulated problem:
> set.seed(12345) # for reproducibility
> x <- rnorm(1e5)
> len <- length(x)
> maxlag <- 100
>
> # John:
> system.time(
+ {
+ diffs <- matrix(0, len, maxlag)
+ for (lag in 1:maxlag){
+ diffs[1:(len - lag), lag] <- diff(x, lag=lag)
+ }
+ }
+ )
user system elapsed
0.22 0.19 0.41
> head(rowSums(diffs))
[1] -34.39477 -48.65417 33.75448 67.30261 -39.10066 204.56559
>
> # Jeff:
> system.time(
+ diffs.2 <- embed(c(x, rep(NA, maxlag)), maxlag + 1) - x
+ )
user system elapsed
0.36 0.04 0.39
> head(rowSums(diffs.2, na.rm=TRUE))
[1] -34.39477 -48.65417 33.75448 67.30261 -39.10066 204.56559
My solution uses a loop, Jeff's uses the embed() function -- of which I was unaware -- which hides the loop in the function.
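
To make the embed() approach easier to follow, here is a minimal sketch on a small vector. The vector and the names padded, windows, via_embed and via_loop are illustrative assumptions and do not come from the thread; only embed(), diff() and rowSums() are the functions used above.

```r
# Small illustrative vector (not from the thread): squares of 1..6
x <- (1:6)^2
maxlag <- 2

# Pad the end with NA so every starting position gets a full-width row
padded <- c(x, rep(NA, maxlag))

# Row i of embed() is: x[i + maxlag], ..., x[i + 1], x[i]
windows <- embed(padded, maxlag + 1)

# Subtracting x recycles it down each column, so row i has x[i] removed
# from every entry; the last column becomes x[i] - x[i] = 0
via_embed <- rowSums(windows - x, na.rm = TRUE)

# Loop version from the message above, for comparison
len <- length(x)
diffs <- matrix(0, len, maxlag)
for (lag in 1:maxlag) {
  diffs[1:(len - lag), lag] <- diff(x, lag = lag)
}
via_loop <- rowSums(diffs)

stopifnot(all.equal(via_embed, via_loop))
print(via_embed)  # 11 17 23 29 11 0
```

The NA padding plays the same role as the zeros left in the unfilled corner of the loop's matrix: near the end of the vector, lags that run past the last element contribute nothing to the row sum.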
If you want to do this kind of simple data management in R, it helps to learn some R programming.
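
As one possible starting point, the loop can be wrapped in a function so it can be reused with different inputs. This is only a sketch: the name sum_lagged_diffs and the input checks are assumptions added for illustration, not something posted in the thread.

```r
# Hypothetical helper (name chosen for illustration): for each position i,
# sum the differences x[i + k] - x[i] for k = 1..maxlag, skipping lags
# that would run past the end of x.
sum_lagged_diffs <- function(x, maxlag) {
  stopifnot(is.numeric(x), maxlag >= 1, maxlag < length(x))
  len <- length(x)
  diffs <- matrix(0, len, maxlag)
  for (lag in seq_len(maxlag)) {
    # diff(x, lag) has len - lag entries; the remaining rows stay 0
    diffs[seq_len(len - lag), lag] <- diff(x, lag = lag)
  }
  rowSums(diffs)
}

# Same contrived input as the earlier example in this thread
res <- sum_lagged_diffs((1:25)^2, maxlag = 5)
head(res)
```

With the squares of 1 to 25 and a maximum lag of 5, this should reproduce the rowSums(diffs) values shown in the earlier message quoted below (85, 115, 145, ...). What to do with the last maxlag entries is still a choice the caller has to make.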
Best,
John
> -----Original Message-----
> From: arthur brogard [mailto:abrogard at yahoo.com]
> Sent: Saturday, December 24, 2016 4:24 PM
> To: Fox, John <jfox at mcmaster.ca>
> Cc: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>; r-help at r-project.org
> Subject: Re: [R] Is there a funct to sum differences?
>
>
>
> Hello John,
>
>
> Here I am back again. Having learned no maths yet but I've looked over
> the results here and they are what I am after.
>
> Now I'll try to understand how you did it.
>
> :)
>
>
>
>
> ----- Original Message -----
> From: "Fox, John" <jfox at mcmaster.ca>
> To: arthur brogard <abrogard at yahoo.com>
> Cc: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>; "r-help at r-project.org"
> <r-help at r-project.org>
> Sent: Sunday, 25 December 2016, 0:55
> Subject: RE: [R] Is there a funct to sum differences?
>
> Dear Arthur,
>
> Here's a simple script to do what I think you want. I've applied it to a
> contrived example, a vector of the squares of the integers 1 to 25, and
> have summed the first 5 differences, but the script is adaptable to any
> numeric vector and any maximum lag. You'll have to decide what to do
> with the last maximum-lag (in my case, 5) entries:
>
> -------------- snip ------------
> > (x <- (1:25)^2)
> [1] 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 289
> 324 361 400 441 484 529 576
> [25] 625
> > len <- length(x)
> > maxlag <- 5
> > diffs <- matrix(0, len, maxlag)
> > for (lag in 1:maxlag){
> + diffs[1:(len - lag), lag] <- diff(x, lag=lag) }
> > head(diffs)
> [,1] [,2] [,3] [,4] [,5]
> [1,] 3 8 15 24 35
> [2,] 5 12 21 32 45
> [3,] 7 16 27 40 55
> [4,] 9 20 33 48 65
> [5,] 11 24 39 56 75
> [6,] 13 28 45 64 85
> > tail(diffs)
> [,1] [,2] [,3] [,4] [,5]
> [20,] 41 84 129 176 225
> [21,] 43 88 135 184 0
> [22,] 45 92 141 0 0
> [23,] 47 96 0 0 0
> [24,] 49 0 0 0 0
> [25,] 0 0 0 0 0
> > rowSums(diffs)
> [1] 85 115 145 175 205 235 265 295 325 355 385 415 445 475 505 535 565
> 595 625 655 450 278 143 49
> [25] 0
> -------------- snip ------------
>
> The script could very simply be converted into a function if this is a
> repetitive task with variable inputs.
>
> I hope this helps,
> John
>
> -----------------------------
> John Fox, Professor
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> Web: socserv.mcmaster.ca/jfox
>
>
>
> > -----Original Message-----
> > From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of arthur
> > brogard via R-help
> > Sent: December 24, 2016 12:29 AM
> > To: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
> > Cc: r-help at r-project.org
> > Subject: Re: [R] Is there a funct to sum differences?
> >
> > Yes, sorry about that. I keep making mistakes I shouldn't make.
> >
> > Thanks for the tip about 'reply all', I had no idea.
> >
> > You can ignore the finalone. I have been doing other work on this and
> > it comes from there. I took the example from the R screen after it had
> > run one of these other things that created the finalone.
> >
> > I guess I was thinking just seeing the data mentioned in the code would
> > be enough.
> >
> > I don't want a function to do the division and multiplication.
> >
> > It's a function that will ".. automatically sum the difference between
> > the first and subsequent to the end of a list? " that I am looking for.
> >
> > I will try to explain, I know I often don't make myself clear:
> >
> > I'm using this diff() function.
> >
> > This 'diff()' function finds the difference between two adjoining
> > entries and it applies itself to the whole list so that in an instant
> > I can have a list of differences between any two adjoining.
> >
> > Then I can have a list of differences between any two with any
> > specified gap - 'lag' it is called.
> > Using the same function.
> >
> > Now I have them and do that. Then I add them together to find the 'lastone'
> > which is the total difference for the period.
> >
> >
> > Now here's the point: that covers a period of two timespans, months, they are.
> >
> > if I want to cover a span of 24 months, say, then I would have to
> > write this diff() function 24 times.
> >
> > what I'm doing is finding the difference between the starting point
> > and every other point and then adding them all together. bit like
> > finding the area beneath the curve maybe.
> >
> > And that's what I want to do.
> >
> > :)
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > ----- Original Message -----
> > From: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
> > To: arthur brogard <abrogard at yahoo.com>
> > Cc: r-help at r-project.org
> > Sent: Saturday, 24 December 2016, 15:34
> > Subject: Re: [R] Is there a funct to sum differences?
> >
> > You need to "reply all" so other people can help as well, and others
> > can learn from your questions.
> >
> > I am still puzzled by how you expect to compute "finalone". If you had
> > supplied numbers other than all 5's it might have been easier to
> > figure out what is going on.
> >
> > What is your purpose in performing this calculation?
> >
> > #### reproducible code
> > rates <- read.table( text =
> > "Date Int
> > Jan-1959 5
> > Feb-1959 5
> > Mar-1959 5
> > Apr-1959 5
> > May-1959 5
> > Jun-1959 5
> > Jul-1959 5
> > Aug-1959 5
> > Sep-1959 5
> > Oct-1959 5
> > Nov-1959 5
> > ", header = TRUE, colClasses = c( "character", "numeric" ) )
> >
> > #your code
> > rates$thisone <- c(diff(rates$Int), NA)
> > rates$nextone <- c(diff(rates$Int, lag=2), NA, NA)
> > rates$lastone <- (rates$thisone + rates$nextone)/6.5*1000
> > # I doubt there is a ready-built function that knows you want to
> > # divide by 6.5 or multiply by 1000
> >
> > # form a vector from positions 2:11 and append NA
> > rates$experiment1 <- rates$Int + c( rates$Int[ -1 ], NA )
> > # numbers that are not all the same
> > rates$Int2 <- (1:11)^2
> > rates$experiment2 <- rates$Int2 + c( rates$Int2[ -1 ], NA )
> >
> > # dput(rates)
> > result <- structure(list(Date = c("Jan-1959", "Feb-1959", "Mar-1959",
> > "Apr-1959", "May-1959", "Jun-1959", "Jul-1959", "Aug-1959",
> > "Sep-1959", "Oct-1959", "Nov-1959"), Int = c(5, 5, 5, 5, 5, 5, 5, 5,
> > 5, 5, 5), thisone = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA), nextone =
> > c(0, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA), lastone = c(0, 0, 0, 0, 0, 0, 0,
> > 0, 0, NA, NA), Int2 = c(1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121),
> > experiment1 = c(10, 10, 10, 10, 10, 10, 10, 10, 10, 10, NA),
> > experiment2 = c(5, 13, 25, 41, 61, 85, 113, 145, 181, 221, NA)),
> > .Names = c("Date", "Int", "thisone", "nextone", "lastone", "Int2",
> > "experiment1", "experiment2"), row.names = c(NA, -11L), class =
> > "data.frame")
> >
> > On Sat, 24 Dec 2016, arthur brogard wrote:
> >
> > >
> > >
> > > Yes, sure, thanks for your interest. I apologise for not submitting
> > > in the correct manner. I'll learn (I hope).
> > >
> > > Here's the source - a spreadsheet with just two columns, date and 'Int'.
> > >
> > >
> > > Date Int
> > > Jan-1959 5
> > > Feb-1959 5
> > > Mar-1959 5
> > > Apr-1959 5
> > > May-1959 5
> > > Jun-1959 5
> > > Jul-1959 5
> > > Aug-1959 5
> > > Sep-1959 5
> > > Oct-1959 5
> > > Nov-1959 5
> > >
> > >
> > > After processing it becomes this:
> > >
> > >
> > >> rates
> > > Date Int thisone nextone lastone finalone
> > > 1 1959-01-01 5.00 0.00 0.00 0.000000 10
> > > 2 1959-02-01 5.00 0.00 0.00 0.000000 10
> > > 3 1959-03-01 5.00 0.00 0.00 0.000000 10
> > > 4 1959-04-01 5.00 0.00 0.00 0.000000 10
> > > 5 1959-05-01 5.00 0.00 0.00 0.000000 10
> > > 6 1959-06-01 5.00 0.00 0.00 0.000000 10
> > >
> > > The one long column I'm referring to is the 'Int' column which R has
> > > imported.
> > >
> > > The actual code is:
> > >
> > >
> > > rates <- read.csv("Rates2.csv",header =
> > > TRUE,colClasses=c("character","numeric"))
> > >
> > > sapply(rates,class)
> > >
> > > rates$Date <- strptime(paste0("1-", rates$Date), format="%d-%b-%Y",
> > > tz="UTC")
> > >
> > >
> > > rates$thisone <- c(diff(rates$Int), NA)
> > > rates$nextone <- c(diff(rates$Int, lag=2), NA, NA)
> > > rates$lastone <- (rates$thisone + rates$nextone)/6.5*1000
> > >
> > >
> > > rates
> > >
> > >
> > >
> > > ab
> > >
> > >
> > >
> > > ----- Original Message -----
> > > From: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
> > > To: arthur brogard <abrogard at yahoo.com>; arthur brogard via R-help
> > > <r-help at r-project.org>; "r-help at r-project.org"
> > > <r-help at r-project.org>
> > > Sent: Saturday, 24 December 2016, 13:25
> > > Subject: Re: [R] Is there a funct to sum differences?
> > >
> > > Could you make your example reproducible? That is, include some
> > > sample input and output. You talk about a column of numbers and then
> > > you seem to work with named lists and I can't reconcile your words
> > > with the code I see.
> > > --
> > > Sent from my phone. Please excuse my brevity.
> > >
> > >
> > > On December 23, 2016 3:40:18 PM PST, arthur brogard via R-help
> > > <r-help at r-project.org> wrote:
> > >> I've been looking but I can't find a function to sum difference.
> > >>
> > >> I have this code:
> > >>
> > >>
> > >> rates$thisone <- c(diff(rates$Int), NA)
> > >> rates$nextone <- c(diff(rates$Int, lag=2), NA, NA)
> > >> rates$lastone <- (rates$thisone + rates$nextone)
> > >>
> > >>
> > >> It is looking down one long column of numbers.
> > >>
> > >> It sums the difference between the first two and then between the
> > >> first and third and so on.
> > >>
> > >> Can it be made to automatically sum the difference between the
> > >> first and subsequent to the end of a list?
> > >>
> > >> ______________________________________________
> > >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide
> > >> http://www.R-project.org/posting-guide.html
> > >> and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> > ---------------------------------------------------------------------------
> > Jeff Newmiller                        The     .....       .....  Go Live...
> > DCN:<jdnewmil at dcn.davis.ca.us>     Basics: ##.#.       ##.#.  Live Go...
> >                                       Live:   OO#.. Dead: OO#..  Playing
> > Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> > /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>
> >