[R] trouble with looping for effect of sampling interval increase

Sun Aug 5 16:08:25 CEST 2012

I've looked everywhere and tinkered for three days now, so I figure asking
might be good.

Here's a general rundown of what I am trying to get my code to do. I am
giving you the whole rundown because I need a solution that retains certain
ways of doing things, since they give me the information I need.

I want to examine the effect of increasing my sampling interval on my data.
Example: what if, instead of sampling every hour, I sampled every two hours,
or every three, and so on ad nauseam? How I want to do this is to take the
data I have now and add an index to it that contains counters. Those counters
will look something like 1,2,1,2,... for the first one and 1,2,3,1,2,3,...
for the next one. I have a lot of them, say a thousand.

Then, for each column in the index, my loops should start in the first
column, run only the ones and store that, then run the twos and store that in
the same column of output in a different row. Then move to the next column:
run the ones, store in the next column of output, run the twos, store in the
next row of that column, run the threes, and so on until there are no more.

I want to use this index for two reasons. The first is that after this I will
be going back through and using a different method for sub-sampling but
keeping all else the same, so all I have to do there is change the way I
generate the index. The second is that it allows me to run many subsamples
and see their range.

The code I have made generates my index and does the heavy lifting correctly,
as well as my averages and quartiles, but a look at the head() of my key
output (IntervalBetas) shows that something has gone amiss. You have to look
closely to catch it. The values generated for each row of output are
identical from column to column. This should not be the case: row one of the
first output column should be generated from all values indexed by a one in
the first index column, whereas in column two there are different values
indexed by the number one. I've checked about everything I can think of, done
print() on my loop variables (those little i and j), and wiggled about
everything. I am flummoxed. I think the bit that is messing up is in here:
#Here is the loop for betas from sampling interval increase
c <- WHOLESIZE[2]-1
for (i in 1:c)
{
  x <- length(unique(index[,i]))

  for (j in 1:x)
  {
    data <- WHOLE[WHOLE[,x]==j,1]
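
To make the index idea concrete, here is a toy-sized sketch with 12 made-up observations and counters of period 2 to 4. The names toy_n, toy_data and toy_index are invented for this illustration and are not objects from the real script:

```r
# Toy illustration only: toy_n, toy_data and toy_index are made-up names.
toy_n    <- 12                    # pretend there are 12 hourly observations
toy_data <- seq_len(toy_n)        # stand-in values: 1, 2, ..., 12

toy_index <- NULL
for (a in 2:4) {                  # counters with period 2, 3 and 4
  toy_index <- cbind(toy_index, rep(1:a, length.out = toy_n))
}

# every second observation, starting at the first
every_other <- toy_data[toy_index[, 1] == 1]
# every third observation, starting at the second
every_third <- toy_data[toy_index[, 2] == 2]
```

Each index column defines one sampling interval, and each counter value within that column picks out one possible subsample at that interval.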

But also here is the whole code in case I am wrong that that is the problem
area: 

#loop for making index

 #clean dataset of empty cells
 dataset <- na.omit (datasetORIGINAL)
 #how messed up was the data?
 holeyDATA <- nrow(datasetORIGINAL) - nrow(dataset)

 D <- dim(dataset)

#what is the smallest sample? 
tinysample <- 100 

#how long is the dataset?
 datalength <- length (dataset)

 #MD <- how many divisions

MD <- datalength/tinysample

 #clear things up for the index loop
 WHOLE <- NULL
index <- NULL
 #do the index loop

 for (a in 1:MD)
 {
 index <- cbind (index, rep (1:a, length = D[1]))
 }
index <- subset(index, select = -c(1) )

 #merge dataset and index loop
 WHOLE <- cbind (dataset, index)

 WHOLESIZE <- dim (WHOLE)

#Housekeeping before loops
IntervalBetas <- NULL

IntervalBetas <- c(NA,NA)
IntervalBetas <- as.data.frame (IntervalBetas)
IntervalLowerQ <- NULL
IntervalUpperQ <- NULL
IntervalSkew <- NULL
IntervalMean <- NULL
IntervalMedian <- NULL
#Betas must exist before the first rbind() in the loop
Betas <- NULL

#Here is the loop for betas from sampling interval increase
c <- WHOLESIZE[2]-1
for (i in 1:c)
{
  x <- length(unique(index[,i]))

  for (j in 1:x)
  {
    data <- WHOLE[WHOLE[,x]==j,1]

    #get power spectral density
    PSDPLOT <- spectrum(data, detrend = TRUE, plot = FALSE)
    frequency <- PSDPLOT$freq
    PSD <- PSDPLOT$spec
    #log transform the power spectral density
    Logfrequency <- log(frequency)
    LogPSD <- log(PSD)
    #fit my line to the data
    Line <- lm(LogPSD ~ Logfrequency)
    #store the slope of the line
    Betas <- rbind(Betas, -coef(Line)[2])

    #Get values on the curve shape
    BSkew <- skew(Betas)
    BMean <- mean(Betas)
    BMedian <- median(Betas)
    Q <- quantile(Betas)

    #store curve shape values
    IntervalLowerQ <- rbind(IntervalLowerQ, Q[2])
    IntervalUpperQ <- rbind(IntervalUpperQ, Q[4])
    IntervalSkew <- rbind(IntervalSkew, BSkew)
    IntervalMean <- rbind(IntervalMean, BMean)
    IntervalMedian <- rbind(IntervalMedian, BMedian)

    #Store the Betas
    #This is a pain
    BetaSave <- Betas
    no.r <- nrow(IntervalBetas)
    l.v <- length(BetaSave)
    difer <- no.r - l.v
    difers <- abs(difer)
    if (no.r < l.v) {
      IntervalBetas <- rbind(IntervalBetas, rep(NA, difers))
    } else {
      BetaSave <- rbind(BetaSave, rep(NA, difers))
    }

    IntervalBetas <- cbind(IntervalBetas, BetaSave)
  }
}

#That ends the loop within a loop for how sampling interval
#changes beta
head (IntervalBetas)
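
The intended output layout (one output column per index column, one row per counter value) could be sketched roughly as follows on simulated data. Everything named toy_*, max_step and beta_of is hypothetical and not part of the script above. Writing each slope into its own cell of a pre-sized matrix, rather than appending to one running vector, is only one possible design. Whether the repeated values in IntervalBetas come from Betas accumulating across columns is a guess that the post does not confirm.

```r
set.seed(1)
toy_data <- cumsum(rnorm(600))   # simulated series standing in for the real data
max_step <- 4                    # largest sampling interval to try (assumption)

# one column per interval: 1,2,1,2,... then 1,2,3,1,2,3,... and so on
toy_index <- sapply(2:max_step,
                    function(a) rep(1:a, length.out = length(toy_data)))

# slope of log(PSD) against log(frequency), sign flipped as in the loop above
beta_of <- function(x) {
  psd <- spectrum(x, detrend = TRUE, plot = FALSE)
  -unname(coef(lm(log(psd$spec) ~ log(psd$freq)))[2])
}

# rows = counter value j, columns = index column i; unused cells stay NA
toy_betas <- matrix(NA_real_, nrow = max_step, ncol = ncol(toy_index))
for (i in seq_len(ncol(toy_index))) {
  for (j in sort(unique(toy_index[, i]))) {
    toy_betas[j, i] <- beta_of(toy_data[toy_index[, i] == j])
  }
}
toy_betas
```

Because every cell is computed from its own subsample, row one of column one and row one of column two come from different observations, and the per-column mean, median and quartiles can be taken afterwards with apply(toy_betas, 2, ...) using na.rm = TRUE.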
