[R] Is there a funct to sum differences?
Jeff Newmiller
jdnewmil at dcn.davis.ca.us
Sat Dec 24 20:36:11 CET 2016
Assuming John's understanding is correct, you can also do this without for
loops. It takes getting used to vector and matrix arithmetic, which you
can read more about in the Introduction to R document that comes with R,
or on the R Exercises website [1].
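As a small illustration of the vector and matrix arithmetic mentioned above, here is a minimal sketch. The data and the variable names (x, d_loop, d_vec, m, v) are made up for this illustration and are not part of the original problem.

```r
# Hypothetical data, chosen only for illustration
x <- c(1, 4, 9, 16, 25)

# Loop version: difference between each element and the next one
d_loop <- numeric(length(x) - 1)
for (i in seq_len(length(x) - 1)) {
  d_loop[i] <- x[i + 1] - x[i]
}

# Vector version: subtract two shifted copies of the same vector
d_vec <- x[-1] - x[-length(x)]

# diff() computes the same thing
d_diff <- diff(x)

# Matrix minus vector: the vector is recycled down the columns,
# so v[i] is subtracted from every element in row i
m <- matrix(1:6, nrow = 3)   # columns are 1:3 and 4:6
v <- c(1, 2, 3)
m_minus_v <- m - v
```

All three difference vectors hold the same values, and the matrix example shows the column-wise recycling that the one-line solution further down relies on.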
You indicated having a problem with my last reproducible example. It does
work if you go through it one step at a time; if you skip steps, you will
run into problems like the ones you encountered. For completeness, I give
the whole reproducible example again here. Don't mix in your own steps
until you have worked through all the steps in this example, or, if you do
and something breaks, go back and step through these one at a time.
[1] http://r-exercises.com/2015/11/28/matrix-exercises/
#########------ begin
rates <- read.table( text =
"Date Int
Jan-1959 5
Feb-1959 5
Mar-1959 5
Apr-1959 5
May-1959 5
Jun-1959 5
Jul-1959 5
Aug-1959 5
Sep-1959 5
Oct-1959 5
Nov-1959 5
", header = TRUE, colClasses = c( "character", "numeric" ) )
rates$thisone <- c(diff(rates$Int), NA)
rates$nextone <- c(diff(rates$Int, lag=2), NA, NA)
rates$lastone <- (rates$thisone + rates$nextone)/6.5*1000
rates$experiment1 <- rates$Int + c( rates$Int[ -1 ], NA )
rates$Int2 <- (1:11)^2
rates$experiment2 <- rates$Int2 + c( rates$Int2[ -1 ], NA )
# lag
N <- 5
# see ?embed, or https://en.wikipedia.org/wiki/Embedding
embed( c( rates$Int2, rep( NA, N ) ), N+1 )
# make a matrix of the same size as the embed result
matrix( rep( rates$Int2, N+1 ), ncol=N+1 )
# subtract the first values, using the explicit matrix from above
embed( c( rates$Int2, rep( NA, N ) ), N+1 ) -
  matrix( rep( rates$Int2, N+1 ), ncol=N+1 )
# or can rely on automatic replication ... depends on the
# fact that the embed result is a matrix which is really just
# a vector displayed in folded up form
embed( c( rates$Int2, rep( NA, N ) ), N+1 ) - rates$Int2
# anyway, the result can be computed in one line (wrapped for readability)
rates$experiment3 <- rowSums( embed( c( rates$Int2
                                      , rep( NA, N )
                                      )
                                   , N+1
                                   )
                              - rates$Int2
                            , na.rm=TRUE
                            )
> rates
       Date Int thisone nextone lastone experiment1 Int2 experiment2 experiment3
 1 Jan-1959   5       0       0       0          10    1           5          85
 2 Feb-1959   5       0       0       0          10    4          13         115
 3 Mar-1959   5       0       0       0          10    9          25         145
 4 Apr-1959   5       0       0       0          10   16          41         175
 5 May-1959   5       0       0       0          10   25          61         205
 6 Jun-1959   5       0       0       0          10   36          85         235
 7 Jul-1959   5       0       0       0          10   49         113         170
 8 Aug-1959   5       0       0       0          10   64         145         110
 9 Sep-1959   5       0       0       0          10   81         181          59
10 Oct-1959   5       0      NA      NA          10  100         221          21
11 Nov-1959   5      NA      NA      NA          NA  121          NA           0
#dput(rates)
result <- structure(list(Date = c("Jan-1959", "Feb-1959", "Mar-1959",
"Apr-1959", "May-1959", "Jun-1959", "Jul-1959", "Aug-1959", "Sep-1959",
"Oct-1959", "Nov-1959"), Int = c(5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5), thisone
= c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA), nextone = c(0, 0, 0, 0, 0, 0,
0, 0, 0, NA, NA), lastone = c(0, 0, 0, 0, 0, 0, 0, 0, 0, NA,
NA), experiment1 = c(10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
NA), Int2 = c(1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121), experiment2 =
c(5, 13, 25, 41, 61, 85, 113, 145, 181, 221, NA), experiment3 = c(85,
115, 145, 175, 205, 235, 170, 110, 59, 21, 0)), .Names = c("Date",
"Int", "thisone", "nextone", "lastone", "experiment1", "Int2",
"experiment2", "experiment3"), row.names = c(NA, -11L), class =
"data.frame")
#########------ end
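
The one-line calculation above can be wrapped into a reusable function. The following is only a sketch under stated assumptions: the name sum_lagged_diffs and its arguments x and N are hypothetical (this is not an existing base R function), and it assumes x is a numeric vector and N is the maximum lag.

```r
# sum_lagged_diffs is a hypothetical helper name, not part of base R.
# For each position i it sums x[i+k] - x[i] for k = 1..N, using only
# the positions that actually exist near the end of the vector.
sum_lagged_diffs <- function(x, N) {
  stopifnot(is.numeric(x), N >= 1)
  # pad with NA so every starting point has N "future" positions
  padded <- c(x, rep(NA, N))
  # row i of the embed() result holds x[i+N], ..., x[i+1], x[i]
  e <- embed(padded, N + 1)
  # x is recycled down each column, so x[i] is subtracted from row i
  rowSums(e - x, na.rm = TRUE)
}

sum_lagged_diffs((1:11)^2, 5)
# [1]  85 115 145 175 205 235 170 110  59  21   0
```

The result matches the experiment3 column above. Note that the last N entries are partial sums over fewer than N differences, so you still have to decide what to do with them, for example by setting them to NA.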
On Sat, 24 Dec 2016, Fox, John wrote:
> Dear Arthur,
>
> Here's a simple script to do what I think you want. I've applied it to a contrived example, a vector of the squares of the integers 1 to 25, and have summed the first 5 differences, but the script is adaptable to any numeric vector and any maximum lag. You'll have to decide what to do with the last maximum-lag (in my case, 5) entries:
>
> -------------- snip ------------
> > (x <- (1:25)^2)
>  [1]   1   4   9  16  25  36  49  64  81 100 121 144 169 196 225 256 289 324 361 400 441 484 529 576
> [25] 625
> > len <- length(x)
> > maxlag <- 5
> > diffs <- matrix(0, len, maxlag)
> > for (lag in 1:maxlag){
> +   diffs[1:(len - lag), lag] <- diff(x, lag=lag)
> + }
> > head(diffs)
>      [,1] [,2] [,3] [,4] [,5]
> [1,]    3    8   15   24   35
> [2,]    5   12   21   32   45
> [3,]    7   16   27   40   55
> [4,]    9   20   33   48   65
> [5,]   11   24   39   56   75
> [6,]   13   28   45   64   85
> > tail(diffs)
>       [,1] [,2] [,3] [,4] [,5]
> [20,]   41   84  129  176  225
> [21,]   43   88  135  184    0
> [22,]   45   92  141    0    0
> [23,]   47   96    0    0    0
> [24,]   49    0    0    0    0
> [25,]    0    0    0    0    0
> > rowSums(diffs)
>  [1]  85 115 145 175 205 235 265 295 325 355 385 415 445 475 505 535 565 595 625 655 450 278 143  49
> [25]   0
> -------------- snip ------------
>
> The script could very simply be converted into a function if this is a repetitive task with variable inputs.
>
> I hope this helps,
> John
>
> -----------------------------
> John Fox, Professor
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> Web: socserv.mcmaster.ca/jfox
>
>
>
>> -----Original Message-----
>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of arthur
>> brogard via R-help
>> Sent: December 24, 2016 12:29 AM
>> To: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
>> Cc: r-help at r-project.org
>> Subject: Re: [R] Is there a funct to sum differences?
>>
>> Yes, sorry about that. I keep making mistakes I shouldn't make.
>>
>> Thanks for the tip about 'reply all', I had no idea.
>>
>> You can ignore the finalone. I have been doing other work on this and it comes
>> from there. I took the example from the R screen after it had run one of these
>> other things that created the finalone.
>>
>> I guess I was thinking just seeing the data mentioned in the code would be
>> enough.
>>
>> I don't want a function to do the division and multiplication.
>>
>> It's a function that will ".. automatically sum the difference between the first
>> and subsequent to the end of a list? " that I am looking for.
>>
>> I will try to explain, I know I often don't make myself clear:
>>
>> I'm using this diff() function.
>>
>> This 'diff()' function finds the difference between two adjoining entries and it
>> applies itself to the whole list so that in an instant I can have a list of
>> differences between any two adjoining.
>>
>> Then I can have a list of differences between any two with any specified gap -
>> 'lag' it is called.
>> Using the same function.
>>
>> Now I have them and do that. Then I add them together to find the 'lastone'
>> which is the total difference for the period.
>>
>>
>> Now here's the point: that covers a period of two timespans, months, they are.
>>
>> if I want to cover a span of 24 months, say, then I would have to write this
>> diff() function 24 times.
>>
>> what I'm doing is finding the difference between the starting point and every
>> other point and then adding them all together. bit like finding the area
>> beneath the curve maybe.
>>
>> And that's what I want to do.
>>
>> :)
>>
>> ----- Original Message -----
>> From: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
>> To: arthur brogard <abrogard at yahoo.com>
>> Cc: r-help at r-project.org
>> Sent: Saturday, 24 December 2016, 15:34
>> Subject: Re: [R] Is there a funct to sum differences?
>>
>> You need to "reply all" so other people can help as well, and others can learn
>> from your questions.
>>
>> I am still puzzled by how you expect to compute "finalone". If you had supplied
>> numbers other than all 5's it might have been easier to figure out what is going
>> on.
>>
>> What is your purpose in performing this calculation?
>>
>> #### reproducible code
>> rates <- read.table( text =
>> "Date Int
>> Jan-1959 5
>> Feb-1959 5
>> Mar-1959 5
>> Apr-1959 5
>> May-1959 5
>> Jun-1959 5
>> Jul-1959 5
>> Aug-1959 5
>> Sep-1959 5
>> Oct-1959 5
>> Nov-1959 5
>> ", header = TRUE, colClasses = c( "character", "numeric" ) )
>>
>> #your code
>> rates$thisone <- c(diff(rates$Int), NA)
>> rates$nextone <- c(diff(rates$Int, lag=2), NA, NA)
>> rates$lastone <- (rates$thisone + rates$nextone)/6.5*1000
>> # I doubt there is a ready-built function that knows you want to
>> # divide by 6.5 or multiply by 1000
>>
>> # form a vector from positions 2:11 and append NA
>> rates$experiment1 <- rates$Int + c( rates$Int[ -1 ], NA )
>> # numbers that are not all the same
>> rates$Int2 <- (1:11)^2
>> rates$experiment2 <- rates$Int2 + c( rates$Int2[ -1 ], NA )
>>
>> # dput(rates)
>> result <- structure(list(Date = c("Jan-1959", "Feb-1959", "Mar-1959",
>> "Apr-1959", "May-1959", "Jun-1959", "Jul-1959", "Aug-1959", "Sep-1959",
>> "Oct-1959", "Nov-1959"), Int = c(5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5),
>> thisone = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA), nextone = c(0, 0, 0, 0, 0,
>> 0, 0, 0, 0, NA, NA), lastone = c(0, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA),
>> Int2 = c(1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121), experiment1 = c(10,
>> 10, 10, 10, 10, 10, 10, 10, 10, 10, NA), experiment2 = c(5, 13, 25, 41,
>> 61, 85, 113, 145, 181, 221, NA)), .Names = c("Date", "Int", "thisone",
>> "nextone", "lastone", "Int2", "experiment1", "experiment2"),
>> row.names = c(NA, -11L), class = "data.frame")
>>
>> On Sat, 24 Dec 2016, arthur brogard wrote:
>>
>>>
>>>
>>> Yes, sure, thanks for your interest. I apologise for not submitting in the
>>> correct manner. I'll learn (I hope).
>>>
>>> Here's the source - a spreadsheet with just two columns, date and 'Int'.
>>>
>>>
>>> Date Int
>>> Jan-1959 5
>>> Feb-1959 5
>>> Mar-1959 5
>>> Apr-1959 5
>>> May-1959 5
>>> Jun-1959 5
>>> Jul-1959 5
>>> Aug-1959 5
>>> Sep-1959 5
>>> Oct-1959 5
>>> Nov-1959 5
>>>
>>>
>>> After processing it becomes this:
>>>
>>>
>>>> rates
>>> Date Int thisone nextone lastone finalone
>>> 1 1959-01-01 5.00 0.00 0.00 0.000000 10
>>> 2 1959-02-01 5.00 0.00 0.00 0.000000 10
>>> 3 1959-03-01 5.00 0.00 0.00 0.000000 10
>>> 4 1959-04-01 5.00 0.00 0.00 0.000000 10
>>> 5 1959-05-01 5.00 0.00 0.00 0.000000 10
>>> 6 1959-06-01 5.00 0.00 0.00 0.000000 10
>>>
>>> The one long column I'm referring to is the 'Int' column which R has imported.
>>>
>>> The actual code is:
>>>
>>>
>>> rates <- read.csv("Rates2.csv",header =
>>> TRUE,colClasses=c("character","numeric"))
>>>
>>> sapply(rates,class)
>>>
>>> rates$Date <- strptime(paste0("1-", rates$Date), format="%d-%b-%Y",
>>> tz="UTC")
>>>
>>>
>>> rates$thisone <- c(diff(rates$Int), NA)
>>> rates$nextone <- c(diff(rates$Int, lag=2), NA, NA)
>>> rates$lastone <- (rates$thisone + rates$nextone)/6.5*1000
>>>
>>>
>>> rates
>>>
>>>
>>>
>>> ab
>>>
>>>
>>>
>>> ----- Original Message -----
>>> From: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
>>> To: arthur brogard <abrogard at yahoo.com>; arthur brogard via R-help
>>> <r-help at r-project.org>; "r-help at r-project.org" <r-help at r-project.org>
>>> Sent: Saturday, 24 December 2016, 13:25
>>> Subject: Re: [R] Is there a funct to sum differences?
>>>
>>> Could you make your example reproducible? That is, include some sample
>>> input and output. You talk about a column of numbers and then you seem to
>>> work with named lists and I can't reconcile your words with the code I see.
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>>
>>>
>>> On December 23, 2016 3:40:18 PM PST, arthur brogard via R-help
>>> <r-help at r-project.org> wrote:
>>>> I've been looking but I can't find a function to sum difference.
>>>>
>>>> I have this code:
>>>>
>>>>
>>>> rates$thisone <- c(diff(rates$Int), NA)
>>>> rates$nextone <- c(diff(rates$Int, lag=2), NA, NA)
>>>> rates$lastone <- (rates$thisone + rates$nextone)
>>>>
>>>>
>>>> It is looking down one long column of numbers.
>>>>
>>>> It sums the difference between the first two and then between the
>>>> first and third and so on.
>>>>
>>>> Can it be made to automatically sum the difference between the first
>>>> and subsequent to the end of a list?
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> ---------------------------------------------------------------------------
>> Jeff Newmiller The ..... ..... Go Live...
>> DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
>> Live: OO#.. Dead: OO#.. Playing
>> Research Engineer (Solar/Batteries O.O#. #.O#. with
>> /Software/Embedded Controllers) .OO#. .OO#. rocks...1k
>>
>
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k