[R] Time series data with dropouts/gaps

Mike Marchywka marchywka at hotmail.com
Tue Oct 26 14:10:38 CEST 2010

----------------------------------------
> From: ggrothendieck at gmail.com
> Date: Tue, 26 Oct 2010 00:37:05 -0400
> To: FlyMyPG at gmail.com
> CC: r-help at r-project.org
> Subject: Re: [R] Time series data with dropouts/gaps
>
> On Tue, Oct 26, 2010 at 12:28 AM, Bob Cunningham wrote:
> > I have time-series data from a pair of inexpensive self-logging 3-axis
> > accelerometers (http://www.gcdataconcepts.com/xlr8r-1.html).  Since I'm not
> > sure of the vibration/shock spectrum I'm measuring, for my initial sensor
> > characterization run the units were mounted together with the sample rate
> > set to the maximum of 640 samples/sec.
> >
> > Unfortunately, at this sample rate there are significant data dropouts at
> > various scales (a phenomenon not present at data rates of 160 Hz and below):
> >
> > 1. Approximately every 20ms, a few samples are dropped (believed to be due
> > to internal buffer wrapping).
> >
> > 2. Approximately every 200ms, about 50 samples are dropped (believed to be
> > due to flash write times).
> >
> > 3. At seemingly random intervals, a sample will appear with an out-of-order
> > timestamp (vendor is diagnosing).
> >
> > Initially, I'm trying to answer the following questions:
> >
> > A. How well do the 2 units compare?  (Calibration, time-base drift, etc.)
> >
> > B. Can I use a lower sample rate?  (What is the observed spectrum?)
> >
> > I started attacking the problem in Python (numpy/scipy), where I've done
> > lots of prior time-series sensor data analysis.  Unfortunately, the gaps
> > have made direct use of the data futile, and I found I was spending all my
> > time manipulating Python lists and numpy vectors rather than finding
> > answers.
> >
> > I hope R can help calm my sea of unruly data.  I'm presently working my way
> > through the abundant R references (tutorials, wiki, etc.), but I was hoping
> > to find pointers here to help me become productive sooner rather than later.
> >
> > Here's my present brute-force plan of attack:
> >
> > - Load both data sets (in CSV format).  Each data element is a timestamp +
> > 3-axis acceleration.
> > - Determine timebase offset: The unit clocks don't match perfectly, and the
> > units were started at slightly different times, so I expect to correlate
> > common events in the data.
> > - Find all overlapping data clusters (between superset of gaps).
> > - See if I have enough data to perform spectral analysis.  I'd like to
> > analyze all clusters together, but I suspect I may have to analyze them
> > independently, then combine the results.
> >
> > Thoughts?  Hints?
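
For the "find all overlapping data clusters" step in the plan above, one
possible starting point is to split each record wherever the timestamp step
is much larger than the nominal sample period. This is only a base-R sketch
on made-up timestamps: the function name find_clusters and the 1.5-sample
threshold are illustrative assumptions, not anything taken from the logger's
documentation.

```r
# Split a vector of timestamps (seconds) into contiguous runs.
# 'fs' is the nominal sample rate; a step larger than 'tol' sample
# periods is treated as a dropout. Names and defaults are hypothetical.
find_clusters <- function(t, fs = 640, tol = 1.5) {
  t <- sort(t)                        # with real data, reorder the acceleration
                                      # columns the same way, e.g. via order(t)
  brk <- which(diff(t) > tol / fs)    # index of the last sample before each gap
  start <- c(1, brk + 1)
  end <- c(brk, length(t))
  data.frame(t_start = t[start], t_end = t[end], n = end - start + 1)
}

# Synthetic example: 1 s at 640 Hz with two blocks of samples removed
t_full <- (0:639) / 640
t_obs <- t_full[-c(101:150, 400:402)]
clusters <- find_clusters(t_obs)
print(clusters)
```

Running this on each unit gives one table of intervals per unit; the
overlapping clusters are then the pairwise intersections of those intervals
once the time-base offset between the units has been removed.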

Is this a question about R or about DSP? I think spectral analysis of non-uniformly
sampled data was covered in Oppenheim and Schafer or equivalent texts from this century.
I guess you could use sinc interpolation if you really want to make up data,
although I should probably read the zoo documentation before commenting further :)

I guess my thought at this point would be to do some simple paper-and-pencil
analysis to see what time-base perturbations do in the time and FT domains,
and then to look for related R functions. If you force these at a single
frequency, you may be able to get some idea of what is going on with your
apparent spectrum. It may also help to take known good
data (you can generate this in R), perturb it (remove samples, jitter
the sample times, etc.) and verify that your analysis can back out the
problems you introduced. FWIW.
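
A rough sketch of that kind of check, in base R only. Everything here is
assumed for illustration: the 50 Hz test tone, the 2 s duration and the
dropout pattern (50 of every 128 samples plus 3 of every 13) only loosely
imitate what was described. The spectrum is a direct Fourier sum evaluated
at the surviving sample times, so nothing is interpolated.

```r
fs <- 640                          # nominal sample rate, Hz
f0 <- 50                           # known test tone, Hz (arbitrary choice)
i <- 0:(2 * fs - 1)                # sample index for 2 s of data
t_full <- i / fs
x_full <- sin(2 * pi * f0 * t_full)

# Perturb the known-good data: remove samples in two periodic patterns
keep <- (i %% 128 >= 50) & (i %% 13 >= 3)
t_obs <- t_full[keep]
x_obs <- x_full[keep]

# Fourier sum at the actual (non-uniform) sample times
freqs <- seq(1, fs / 2, by = 0.5)
amp <- sapply(freqs, function(f) {
  Mod(sum(x_obs * exp(-2i * pi * f * t_obs))) / length(t_obs)
})

f_peak <- freqs[which.max(amp)]    # should sit at or next to f0
plot(freqs, amp, type = "l", xlab = "Frequency (Hz)", ylab = "Amplitude")
```

The plot should show the tone plus sidelobes spaced by the dropout
repetition rates, which gives some idea of what the gaps alone contribute to
an apparent spectrum.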


> >
>
> You can use read.zoo in the zoo package to create a zoo time series
> from a csv file. The zoo merge method can merge two or more series
> together and na.locf, na.approx or na.spline, also in zoo, could be
> used to fill in the NAs. There are three vignettes (pdf documents)
> that come with the zoo package that will get you up to speed.

See my comment above. I haven't read the docs, but the name suggests spline
interpolation, when the OP is probably looking for something like sinc interpolation.
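
For reference, a small sketch of the zoo calls mentioned in the quoted
reply, on made-up numbers (the series names a and b and all values are
invented, and the zoo package is assumed to be installed). Both fills
fabricate samples, one by linear and one by spline interpolation; neither is
sinc interpolation.

```r
library(zoo)

# Two short made-up series on slightly different time grids (seconds)
a <- zoo(c(0.0, 0.1, 0.4, 0.5), order.by = c(0.00, 0.01, 0.04, 0.05))
b <- zoo(c(1.0, 1.2, 1.3, 1.5), order.by = c(0.00, 0.02, 0.03, 0.05))

m <- merge(a = a, b = b)   # union of the time stamps, NA where a unit has no sample
m_lin <- na.approx(m)      # fill NAs by linear interpolation against the time index
m_spl <- na.spline(m)      # fill NAs by spline interpolation

print(m)
print(m_lin)
```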


