[R] dividing ts objects of different frequencies

Jeffrey J. Hallman jhallman at frb.gov
Thu Mar 5 15:54:25 CET 2009


"Stephen J. Barr" <stephenjbarr at gmail.com> writes:
> I have two time series (ts) objects, 1 is yearly (population) and the
> other is quarterly (bankruptcy statistics). I would like to produce a
> quarterly time series object that consists of bankruptcy/population.
> Is there a pre-built function to intelligently divide these time
> series:

What you need to do is create a quarterly population series, then divide it into
your bankruptcy series.  The only "nice" way I know to do this is to use the
convert() function from my "tis" package.  Here is its help document:

convert                 package:tis                 R Documentation

Time scale conversions for time series

Description:

     Convert 'tis' series from one frequency to another using a variety
     of algorithms.

Usage:

     convert(x, tif, method = "constant", observed. = observed(x),
             basis. = basis(x), ignore = F)

Arguments:

       x: a univariate or multivariate 'tis' series. Missing values
          (NAs) are ignored.  

     tif: a number or a string indicating the desired ti frequency of
          the return series. See 'help(ti)' for details.

  method: method by which the conversion is done: one of "discrete",
          "constant", "linear", or "cubic".  Note that this argument is
          effectively ignored if 'observed.' is "high" or "low", as the
          "discrete" method is the only one supported for that setting.            

observed.: "observed" attribute of the input series: one of
          "beginning", "end", "high", "low", "summed", "annualized", or
          "averaged".  If this argument is not supplied and
          observed('x') != NULL it will be used.  The output series
          will also have this "observed" attribute. 

  basis.: "daily" or "business".  If this argument is not supplied and
          basis('x') != NULL it will be used. The output series will
          also have this "basis" attribute.  

  ignore: governs how missing (partial period) values at the beginning
          and/or end of the series are handled.  For method ==
          "discrete" or "constant" and ignore == T, input values that
          cover only part of the first and/or last output time intervals
          will still result in output values for those intervals.  This
          can be problematic, especially for observed == "summed", as
          it can lead to atypical values for the first and/or last
          periods of the output series. 

Details:

     This function is a close imitation of the way FAME handles time
     scale conversions.  See the chapter on "Time Scale Conversion" in
     the Users Guide to Fame if the explanation given here is not
     detailed enough.

     Start with some definitions.  Combining values of a higher
     frequency input series to create a lower frequency output series
     is known as 'aggregation'. Doing the opposite is known as
     'disaggregation'.

     If observed == "high" or "low", the "discrete" method is always
     used.

     Disaggregation for "discrete" series: (i) for observed ==
     "beginning" ("end"), the first (last) output period that begins
     (ends) in a particular input period is assigned the value of that
     input period. All other output periods that begin (end) in that
     input period are NA. (ii) for observed == "high", "low", "summed"
     or "averaged", all output periods that end in a particular input
     period are assigned the same value.  For "summed", that value is
     the input period value divided by the number of output periods
     that end in the input period, while for "high", "low" and
     "averaged" series, the output period values are the same as the
     corresponding input period values.  
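
The discrete disaggregation rules above can be illustrated in plain base R. This is only a sketch with made-up numbers for an annual-to-quarterly conversion; it is not the code inside convert(), and the variable names are hypothetical.

```r
# Base-R illustration of "discrete" disaggregation, annual -> quarterly.
# Assumption: made-up annual values; this does not call tis::convert().
annual <- ts(c(8, 12), start = 2000, frequency = 1)

# "averaged" (and likewise "high"/"low"): every quarter that ends in a
# given year receives that year's value unchanged.
q_averaged <- ts(rep(as.numeric(annual), each = 4),
                 start = c(2000, 1), frequency = 4)

# "summed": the annual value is divided by the number of quarters (4)
# that end in the year, so the quarters add back up to the annual value.
q_summed <- q_averaged / 4

# "end": only the last quarter ending in each year gets the value;
# the other quarters are NA.
vals <- rep(NA_real_, 8)
vals[c(4, 8)] <- as.numeric(annual)
q_end <- ts(vals, start = c(2000, 1), frequency = 4)
```

Summing q_summed back to annual frequency should reproduce the original series, which is the property that distinguishes "summed" from "averaged".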

     Aggregation for "discrete" series: (i) for observed == "beginning"
     ("end"), the output period is assigned the value of the first
     (last) input period that begins (ends) in the output period. (ii)
     for observed == "high" ("low"), the output period is assigned the
     value of the maximum (minimum) of all the input values for periods
     that end in the output period. (iii) for observed == "summed"
     ("averaged"), the output value is the sum (average) of all the
     input values for periods that end in the output period.
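
For the simple case where every output period contains a whole number of input periods, the discrete aggregation rules above correspond to what base R's aggregate() does on a 'ts' object. A minimal sketch with made-up quarterly data (not the tis implementation):

```r
# Base-R illustration of "discrete" aggregation, quarterly -> annual.
# Assumption: made-up data; stats::aggregate() stands in for convert().
quarterly <- ts(c(3, 1, 4, 1, 5, 9, 2, 6), start = c(2000, 1), frequency = 4)

a_summed    <- aggregate(quarterly, nfrequency = 1, FUN = sum)   # 9, 22
a_averaged  <- aggregate(quarterly, nfrequency = 1, FUN = mean)  # 2.25, 5.5
a_high      <- aggregate(quarterly, nfrequency = 1, FUN = max)   # 4, 9
a_low       <- aggregate(quarterly, nfrequency = 1, FUN = min)   # 1, 2

# "beginning"/"end": take the first/last quarterly value in each year.
a_beginning <- aggregate(quarterly, nfrequency = 1,
                         FUN = function(v) v[1])                 # 3, 5
a_end       <- aggregate(quarterly, nfrequency = 1,
                         FUN = function(v) v[length(v)])         # 1, 6
```

Cases such as weekly to monthly, where input periods straddle output boundaries, are where the "periods that end in the output period" wording matters and where aggregate() no longer applies directly.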

     Methods "constant", "linear", and "cubic" all work by constructing
     a continuous function F(t) and then reading off the appropriate
     point-in-time values if observed == "beginning" or "end", or by
     integrating F(t) over the output intervals when observed ==
     "summed", or by integrating F(t) over the output intervals and
     dividing by the lengths of those intervals when observed ==
     "averaged".  The unit of time itself is given by the 'basis'
     argument. 

     The form of F(t) is determined by the conversion method. For
     "constant" conversions, F(t) is a step function with jumps at the
     boundaries of the input periods.  If the first and/or last input
     periods only partly cover an output period, F is linearly extended
     to cover the first and last output periods as well.  The heights
     of the steps are set such that F(t) aggregates over the input
     periods to the original input series. 
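
The step-function idea behind the "constant" method can be sketched in base R with stepfun() and integrate(). This is an illustration under simplifying assumptions (time measured in years rather than days, made-up annual "summed" values, a hypothetical name Fstep), not a reproduction of what convert() does internally.

```r
# Base-R sketch of the "constant" method for an annual "summed" series.
# Assumptions: time is in years, so each input period has length 1 and
# the step height equals the annual value; the numbers are made up.
annual <- c(100, 120, 80)                  # totals for 2000, 2001, 2002

# Step function with jumps at the input period boundaries.
# Integrating it over any one year returns that year's total.
Fstep <- stepfun(x = c(2001, 2002), y = annual)

# Quarterly output intervals [a, a + 0.25).
q_start <- seq(2000, 2002.75, by = 0.25)

# observed == "summed": integrate F(t) over each output interval.
q_summed <- sapply(q_start, function(a) integrate(Fstep, a, a + 0.25)$value)

# observed == "averaged": integral divided by the interval length.
q_averaged <- q_summed / 0.25
```

Here each quarter of 2000 gets 25 under "summed" and 100 under "averaged", matching the discrete rules; the two methods differ once output periods straddle input boundaries.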

     For "linear" ("cubic") conversions, F(t) is a linear (cubic)
     spline. The x-coordinates of the spline knots are the beginnings
     or ends of the input periods if observed == "beginning" or "end",
     else they are the centers of the input periods. The y-coordinates
     of the splines are chosen such that aggregating the resulting F(t)
     over the input periods yields the original input series. 
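
For point-in-time series (observed == "end"), the spline description above amounts to interpolating through the observed values at the ends of the input periods. A rough base-R sketch follows; the numbers are made up, and the natural cubic spline is an assumption, since the help text does not say which cubic spline convert() uses.

```r
# Base-R sketch of "linear" and "cubic" conversion for observed == "end".
# Assumptions: made-up year-end levels; time in years; natural cubic
# spline chosen for illustration only.
year_end <- 2001:2004              # end of 2000, 2001, 2002, 2003
level    <- c(10, 14, 15, 21)

# Quarter ends that fall inside the span covered by the knots.
quarter_end <- seq(2001, 2004, by = 0.25)

# Linear spline: read F(t) off at each quarter end.
q_linear <- approx(year_end, level, xout = quarter_end)$y

# Cubic spline through the same knots.
q_cubic <- spline(year_end, level, xout = quarter_end,
                  method = "natural")$y
```

Both results pass through the original values at the year ends. For "summed" or "averaged" series the knot heights would instead have to be solved for so that F(t) integrates back to the input, which this sketch does not attempt.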

     For "constant" conversions, if 'ignore' == F, the first (last)
     output period is the first (last) one for which complete input
     data is available.  For observed == "beginning", for example, this
     means that data for the first input period that begins in the
     first output period is available, while for observed == "summed",
     this means that the first output period is completely contained
     within the available input periods.  If 'ignore' == T, data for
     only a single input period is sufficient to create an output
     period value.  For example, if converting weekly data to monthly
     data, and the last observation is June 14, the output series will
     end in June if 'ignore' == T, or May if it is F. 

     Unlike the "constant" method, the domain of F(t) for "linear" and
     "cubic" conversions is NOT extended beyond the input periods, even
     if the ignore option is T. The first (last) output period is
     therefore the first (last) one that is completely covered by input
     periods.

     Series with observed == "annualized" are handled the same as
     observed == "averaged".

Value:

     a 'tis' time series covering approximately the same time span as
     'x', but with the frequency specified by 'tif'.

BUGS:

     Method "cubic" is not currently implemented for observed "summed",
     "annualized", and "averaged".

References:

     Users Guide to Fame

See Also:

     'aggregate', 'tif', 'ti'

Examples:

     wSeries <- tis(1:105, start = ti(19950107, tif = "wsaturday"))
     observed(wSeries) <- "ending"   ## end of week values
     mDiscrete <- convert(wSeries, "monthly", method = "discrete")
     mConstant <- convert(wSeries, "monthly", method = "constant")
     mLinear   <- convert(wSeries, "monthly", method = "linear")
     mCubic    <- convert(wSeries, "monthly", method = "cubic")

     ## linear and cubic are identical because wSeries is a pure linear trend
     cbind(mDiscrete, mConstant, mLinear, mCubic)

     observed(wSeries) <- "averaged"   ## weekly averages
     mDiscrete <- convert(wSeries, "monthly", method = "discrete")
     mConstant <- convert(wSeries, "monthly", method = "constant")
     mLinear   <- convert(wSeries, "monthly", method = "linear")

     cbind(mDiscrete, mConstant, mLinear)
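
Applied to the original question, the overall recipe (make the population series quarterly, then divide) looks roughly like this in base R. It is only a stand-in for a "constant" conversion of a level series: each annual value is simply repeated for the four quarters of its year. The data and the names pop, bankruptcy, pop_q and rate are made up for illustration.

```r
# Base-R stand-in for: convert annual population to quarterly, then divide.
# Assumption: population is a level, so each year's value is repeated for
# its four quarters. All numbers are invented.
pop <- ts(c(1000, 1010, 1025), start = 2005, frequency = 1)
bankruptcy <- ts(c(5, 6, 4, 7, 8, 6, 5, 9, 7, 7),
                 start = c(2005, 2), frequency = 4)

# Quarterly population starting in Q1 of the first population year.
pop_q <- ts(rep(as.numeric(pop), each = 4),
            start = c(start(pop)[1], 1), frequency = 4)

# Arithmetic on two 'ts' objects of the same frequency is done over the
# time window they have in common.
rate <- bankruptcy / pop_q
```

Repeating the annual value produces a jump every first quarter; a "linear" or "cubic" conversion as described above would give a smoother denominator.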



-- 
Jeff



