[R] Generation of missing values in a time serie...

Kjetil Brinchmann Halvorsen kjetilbrinchmannhalvorsen at gmail.com
Tue Dec 13 23:41:00 CET 2005


Gabor Grothendieck wrote:
> Yes, this is the definition of a time series and therefore of a zoo object.
> A time series is a mathematical function, i.e. it assigns a single element
> of the range to each element of the domain. This data does not describe
> a time series.

Since nobody else has mentioned it on this thread: the CRAN package
pastecs has a function `regul' to regularize irregular time series.

Maybe that is what the original poster wants.
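
For anyone without pastecs at hand, the general idea of regularizing can
be sketched in base R. This is only an illustration and not regul itself:
it linearly interpolates one column (seq) of the example data from this
thread onto an evenly spaced grid with approx(). The 0.1 grid step, the
choice to keep the last of the two entries sharing a time stamp, and the
names tt, seq_no, grid, reg and reg_ts are assumptions made for the example.

```r
# Irregular observation times and 'seq' values from the example data in
# this thread. The duplicated time stamp 123.241137 is reduced to a single
# entry (the last one, seq 969); that choice is an assumption.
tt     <- c(123.125683, 123.241137, 123.472631, 123.588613, 123.704594)
seq_no <- c(967, 969, 970, 971, 972)

# Evenly spaced target grid; the 0.1 step is an assumption for illustration.
grid <- seq(123.2, 123.7, by = 0.1)

# Linear interpolation onto the grid. rule = 1 would return NA for grid
# points outside the observed time range (none here).
reg <- approx(x = tt, y = seq_no, xout = grid, method = "linear", rule = 1)

# Wrap the interpolated values as a regular ts object.
reg_ts <- ts(reg$y, start = grid[1], deltat = 0.1)
print(reg_ts)
```

Unlike the NA-filling approach quoted below, this invents values between
observations, so it only makes sense for columns where interpolation is
meaningful.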

Kjetil


> 
> Also it has no underlying regularity as the warning message states.
> To use as.ts one wants a series with an underlying regularity that has
> gaps and then as.ts will fill in the gaps with NAs.
> 
> If we don't have an underlying regularity the question is not well posed
> but it's likely we want to discretize time.  The zoo command itself is
> somewhat forgiving, at least in this case, i.e. it allows one to specify
> this illegal zoo object with non-unique times for purposes of discretization;
> however, such a zoo object should not be used other than to get a legal
> zoo object out.
> 
> For example, in the following we round the times to one decimal place
> and then within each set of values at the same discretized time take the
> last one.  (Alternatively, specify mean instead of tail, 1 if the average
> is preferred.)  Then we convert that to a ts object:
> 
>> as.ts(aggregate(z, round(time(z), 1), tail, 1))
> Time Series:
> Start = c(123, 2)
> End = c(123, 8)
> Frequency = 10
>           time flow seq       ts     x      rtt size
> 123.1 123.1257    0 967 123.1257 13394 0.798205 1472
> 123.2 123.2411    0 969 123.2411 12680 0.796258 1472
> 123.3       NA   NA  NA       NA    NA       NA   NA
> 123.4       NA   NA  NA       NA    NA       NA   NA
> 123.5 123.4726    0 970 123.4726 12680 0.796258 1472
> 123.6 123.5886    0 971 123.5886 12680 0.796258 1472
> 123.7 123.7046    0 972 123.7046 12680 0.796258 1472
> 
> On 12/13/05, Alvaro Saurin <saurin at dcs.gla.ac.uk> wrote:
>> I think I have found the error. It appears when there are two entries
>> with the same time. Using as input file:
>>
>> --------- CUT --------
>> # Output format for PCKs:
>> # TIME FLOW P [+-] SEQ TS X RTT SIZE
>> #
>> 123.125683 0 P + 967 123.125683 13394 0.798205 1472
>> 123.241137 0 P + 968 123.241137 12680 0.796258 1472
>> 123.241137 0 P + 969 123.241137 12680 0.796258 1472
>> 123.472631 0 P + 970 123.472631 12680 0.796258 1472
>> 123.588613 0 P + 971 123.588613 12680 0.796258 1472
>> 123.704594 0 P + 972 123.704594 12680 0.796258 1472
>> --------- CUT --------
>>
>> I run the following code:
>>
>> --------- CUT --------
>> h_types <- list (0, 0, NULL, NULL, 0, 0, 0, 0, 0)
>> h_names <- list ("time", "flow",  "seq", "ts", "x", "rtt", "size")
>>
>> pcks_file    <- pipe ("grep ' P ' data", "r")
>> pcks          <- scan (pcks_file, what = h_types, comment.char = '#',
>> fill = TRUE)
>> mat_df      <- data.frame (pcks[1:2], pcks[5:9])
>> mat           <- as.matrix (mat_df)
>> colnames (mat)      <- h_names
>> z <- zoo (mat, mat [,"time"])
>> --------- CUT --------
>>
>> The dput of 'z' shows:
>>
>> --------- CUT --------
>> structure(c(123.125683, 123.241137, 123.241137, 123.472631, 123.588613,
>> 123.704594, 0, 0, 0, 0, 0, 0, 967, 968, 969, 970, 971, 972, 123.125683,
>> 123.241137, 123.241137, 123.472631, 123.588613, 123.704594, 13394,
>> 12680, 12680, 12680, 12680, 12680, 0.798205, 0.796258, 0.796258,
>> 0.796258, 0.796258, 0.796258, 1472, 1472, 1472, 1472, 1472, 1472
>> ), .Dim = c(6, 7), .Dimnames = list(c("1", "2", "3", "4", "5",
>> "6"), c("time", "flow", "seq", "ts", "x", "rtt", "size")), index =
>> structure(c(123.125683,
>> 123.241137, 123.241137, 123.472631, 123.588613, 123.704594), .Names =
>> c("1",
>> "2", "3", "4", "5", "6")), class = "zoo")
>> --------- CUT --------
>>
>> If I try 'as.ts(z)', it fails. If I remove the duplicate entry, I
>> can convert it to a TS with no problem. Is this intentional?
>> Because then I have to filter the input matrix... But, anyway, the
>> output matrix, after filtering, doesn't seem regular:
>>
>> --------- CUT --------
>>  > as.ts (z)
>> Time Series:
>> Start = 1
>> End = 5
>> Frequency = 1
>>       time flow seq       ts     x      rtt size
>> 1 123.1257    0 967 123.1257 13394 0.798205 1472
>> 2 123.2411    0 969 123.2411 12680 0.796258 1472
>> 3 123.4726    0 970 123.4726 12680 0.796258 1472
>> 4 123.5886    0 971 123.5886 12680 0.796258 1472
>> 5 123.7046    0 972 123.7046 12680 0.796258 1472
>> Warning message:
>> 'x' does not have an underlying regularity in: as.ts.zoo(z)
>> --------- CUT --------
>>
>> Weird...
>>
>>
>> On 13 Dec 2005, at 16:33, Gabor Grothendieck wrote:
>>
>>> Please provide a reproducible example.  Note that dput(x) will output
>>> an R object in a way that can be copied and pasted into another
>>> session.
>>>
>>> On 12/13/05, Alvaro Saurin <saurin at dcs.gla.ac.uk> wrote:
>>>> On 13 Dec 2005, at 13:08, Gabor Grothendieck wrote:
>>>>
>>>>> Your variable mat is not a matrix; it's a data frame.  Check it with:
>>>>>
>>>>>    class(mat)
>>>>>
>>>>> Here is an example:
>>>>>
>>>>> x <- cbind(A = 1:4, B = 5:8)
>>>>> tt <- c(1, 3:4, 6)
>>>>>
>>>>> library(zoo)
>>>>> x.zoo <- zoo(x, tt)
>>>>> x.ts <- as.ts(x.zoo)
>>>> Fixed, but anyway it fails:
>>>>
>>>>>      h_types <- list (0, 0, NULL, NULL, 0, 0, 0, 0, 0)
>>>>>      h_names <- list ("time", "flow", "seq", "ts", "x", "rtt", "size")
>>>>>      pcks_file       <- pipe ("grep ' P ' server.dat", "r")
>>>>>      pcks            <- scan (pcks_file, what = h_types, comment.char = '#', fill = TRUE)
>>>>
>>>>>      mat_df                  <- data.frame (pcks[1:2], pcks[5:9])
>>>>>      mat                             <- as.matrix (mat_df)
>>>>>      colnames (mat)  <- h_names
>>>>>      class (mat)
>>>> [1] "matrix"
>>>>
>>>>>      z <- zoo (mat, mat [,"time"])
>>>>>      z
>>>>            time     flow      seq       ts           x      rtt        size
>>>> 1.0009 1.000893 0.000000 0.000000 1.000893 1472.000000 0.000000 1472.000000
>>>> 1.5145 1.514454 0.000000 1.000000 1.514454 2944.000000 0.513142 1472.000000
>>>> 2.0151 2.015093 0.000000 2.000000 2.015093 2944.000000 0.513142 1472.000000
>>>> 2.515  2.515025 0.000000 3.000000 2.515025 4806.000000 0.504488 1472.000000
>>>> 2.822  2.821976 0.000000 4.000000 2.821976 5730.000000 0.496728 1472.000000
>>>> [...]
>>>>
>>>>>      as.ts (z)
>>>> Error in if (del == 0 && to == 0) return(to) :
>>>>        missing value where TRUE/FALSE needed
>>>>
>>>> Any idea? Thanks for your help.
>>>>
>>>> Alvaro
>>>>
>>>>
>>>> --
>>>> Alvaro Saurin <alvaro.saurin at gmail.com> <saurin at dcs.gla.ac.uk>
>>>>
>>>>
>>>>
>>>>
>> --
>> Alvaro Saurin <alvaro.saurin at gmail.com> <saurin at dcs.gla.ac.uk>
>>
>>
>>
>>
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>



