[R] Using PCA to filter a series

David L Carlson dcarlson at tamu.edu
Thu Oct 2 23:59:13 CEST 2014


I think you want to convert your principal component to the same scale as d1, d2, d3, and d4. But the "original space" is a 4-dimensional space in which d1, d2, d3, and d4 are the axes, each with its own mean and standard deviation. Here are a couple of possibilities:

# plot original values for comparison
> matplot(cbind(d1, d2, d3, d4), pch=20, col=2:5)
# standardize the pc scores to the grand mean and sd
> new1 <- scale(pca$scores[,1])*sd(c(d1, d2, d3, d4)) + mean(c(d1, d2, d3, d4))
> lines(new1)
# Use least squares regression to predict the row means for the original four variables
> new2 <- predict(lm(rowMeans(cbind(d1, d2, d3, d4))~pca$scores[,1]))
> lines(new2, col="red")
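
A third possibility is to take "back into the original axes" literally and rebuild all four series from the first component alone. The lines below are only a sketch under that reading: they assume princomp() was run on the covariance matrix (the default, as in the original post), and the names X, recon and clean are made up for illustration.

```r
# Data copied from the original question
d1 <- c(113, 108, 105, 103, 109, 115, 115, 102, 102, 111, 122, 122, 110, 110, 104, 121, 121, 120, 120, 137, 137, 138, 138, 136, 172, 172, 157, 165, 173, 173, 174, 174, 119, 167, 167, 144, 170, 173, 173, 169, 155, 116, 101, 114, 114, 107, 108, 108, 131, 131, 117, 113)
d2 <- c(138, 115, 127, 127, 119, 126, 126, 124, 124, 119, 119, 120, 120, 115, 109, 137, 142, 142, 143, 145, 145, 163, 169, 169, 180, 180, 174, 181, 181, 179, 173, 185, 185, 183, 183, 178, 182, 182, 181, 178, 171, 154, 145, 147, 147, 124, 124, 120, 128, 141, 141, 138)
d3 <- c(138, 120, 129, 129, 120, 126, 126, 125, 125, 119, 119, 122, 122, 115, 109, 141, 144, 144, 148, 149, 149, 163, 172, 172, 183, 183, 180, 181, 181, 181, 173, 185, 185, 183, 183, 184, 182, 182, 181, 179, 172, 154, 149, 156, 156, 125, 125, 115, 139, 140, 140, 138)
d4 <- c(134, 115, 120, 120, 117, 123, 123, 128, 128, 119, 119, 121, 121, 114, 114, 142, 145, 145, 144, 145, 145, 167, 172, 172, 179, 179, 179, 182, 182, 182, 182, 182, 184, 184, 182, 184, 183, 183, 181, 179, 172, 149, 149, 149, 149, 124, 124, 119, 131, 135, 135, 134)

X <- cbind(d1, d2, d3, d4)
pca <- princomp(X)   # covariance-based PCA, as in the original post

# Rank-one reconstruction: PC1 scores times PC1 loadings gives the part of
# each centered series that lies along the first component ...
recon <- outer(pca$scores[, 1], pca$loadings[, 1])
# ... and adding back the column means that princomp() removed puts each
# column on the scale of its own original variable.
recon <- sweep(recon, 2, pca$center, "+")
colnames(recon) <- colnames(X)

# One combined series: the average of the four reconstructed columns
clean <- rowMeans(recon)

# Compare with the raw data
matplot(X, pch = 20, col = 2:5)
lines(clean)
```

Because the product of scores and loadings is unchanged when both flip sign, this version does not depend on the arbitrary sign that princomp() gives a component.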

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352



From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Don McKenzie
Sent: Thursday, October 2, 2014 4:39 PM
To: Jonathan Thayn
Cc: r-help at r-project.org
Subject: Re: [R] Using PCA to filter a series


On Oct 2, 2014, at 2:29 PM, Jonathan Thayn <jthayn at ilstu.edu> wrote:

> Hi Don. I would like to "de-rotate" the first component back to its original state so that it aligns with the original time-series. My goal is to create a "cleaned", or a "model" time-series from which noise has been removed.

Please cc the list with replies. It's considered courtesy plus you'll get more help that way than just from me.

Your goal sounds almost metaphorical, at least to me.  Your first axis "aligns" with the original time series already in that it captures the dominant variation
across all four. Beyond that, there are many approaches to signal/noise relations within time-series analysis. I am not a good source of help on these, and you probably need a statistical consult (locally?), which is not the function of this list.

> 
> 
> Jonathan Thayn
> 
> 
> 
> On Oct 2, 2014, at 2:33 PM, Don McKenzie <dmck at u.washington.edu> wrote:
> 
>> 
>> On Oct 2, 2014, at 12:18 PM, Jonathan Thayn <jthayn at ilstu.edu> wrote:
>> 
>>> I have four time-series of similar data. I would like to combine these into a single, clean time-series. I could simply find the mean of each time period, but I think that using principal components analysis should extract the most salient pattern and ignore some of the noise. I can compute components using princomp:
>>> 
>>> 
>>> d1 <- c(113, 108, 105, 103, 109, 115, 115, 102, 102, 111, 122, 122, 110, 110, 104, 121, 121, 120, 120, 137, 137, 138, 138, 136, 172, 172, 157, 165, 173, 173, 174, 174, 119, 167, 167, 144, 170, 173, 173, 169, 155, 116, 101, 114, 114, 107, 108, 108, 131, 131, 117, 113)
>>> d2 <- c(138, 115, 127, 127, 119, 126, 126, 124, 124, 119, 119, 120, 120, 115, 109, 137, 142, 142, 143, 145, 145, 163, 169, 169, 180, 180, 174, 181, 181, 179, 173, 185, 185, 183, 183, 178, 182, 182, 181, 178, 171, 154, 145, 147, 147, 124, 124, 120, 128, 141, 141, 138)
>>> d3 <- c(138, 120, 129, 129, 120, 126, 126, 125, 125, 119, 119, 122, 122, 115, 109, 141, 144, 144, 148, 149, 149, 163, 172, 172, 183, 183, 180, 181, 181, 181, 173, 185, 185, 183, 183, 184, 182, 182, 181, 179, 172, 154, 149, 156, 156, 125, 125, 115, 139, 140, 140, 138)
>>> d4 <- c(134, 115, 120, 120, 117, 123, 123, 128, 128, 119, 119, 121, 121, 114, 114, 142, 145, 145, 144, 145, 145, 167, 172, 172, 179, 179, 179, 182, 182, 182, 182, 182, 184, 184, 182, 184, 183, 183, 181, 179, 172, 149, 149, 149, 149, 124, 124, 119, 131, 135, 135, 134)
>>> 
>>> 
>>> pca <- princomp(cbind(d1,d2,d3,d4))
>>> plot(pca$scores[,1])
>>> 
>>> This seems to have created the clean pattern I want, but I would like to project the first component back into the original axes. Is there a simple way to do that?
>> 
>> Do you mean that you want to scale the scores on Axis 1 to the mean and range of your raw data?  Or their mean and variance?
>> 
>> See
>> 
>> ?scale
>>> 
>>> 
>>> 
>>> 
>>> Jonathan B. Thayn
>>>      
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> Don McKenzie
>> Research Ecologist
>> Pacific Wildland Fire Sciences Lab
>> US Forest Service
>> 
>> Affiliate Professor
>> School of Environmental and Forest Sciences 
>> College of the Environment
>> University of Washington
>> dmck at uw.edu
> 

Don McKenzie
Research Ecologist
Pacific Wildland Fire Sciences Lab
US Forest Service

Affiliate Professor
School of Environmental and Forest Sciences 
College of the Environment
University of Washington
dmck at uw.edu

