[R] trouble with looping for effect of sampling interval increase
White, William Patrick
white.232 at wright.edu
Tue Aug 7 16:41:06 CEST 2012
My apologies, here is a sample dataset generator:
# Running sum test data
Coin <- c(-1, 1)
flips <- sample(Coin, 1000, replace = TRUE)
Runningsum <- cumsum(flips)
# A deactivated plot
# plot(Runningsum)
Test <- cbind(Runningsum)
datasetORIGINAL <- cbind(Runningsum)
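
For anyone reproducing this, the core calculation applied to each subsample can be tried on the generated series directly. The sketch below assumes a fixed seed and wraps the spectrum() and lm() steps from the loop in a helper, estimate_beta(), a name introduced here for illustration only.

```r
# Sketch: estimate one spectral slope (beta) from the simulated running sum.
set.seed(1)                                  # assumption: fixed seed for reproducibility
flips <- sample(c(-1, 1), 1000, replace = TRUE)
Runningsum <- cumsum(flips)

# estimate_beta() is a hypothetical helper, not part of the original code
estimate_beta <- function(series) {
  psd <- spectrum(series, detrend = TRUE, plot = FALSE)
  fit <- lm(log(psd$spec) ~ log(psd$freq))
  unname(-coef(fit)[2])                      # negated slope of log PSD vs log frequency
}

beta_full <- estimate_beta(Runningsum)
beta_full
```

Running the helper on the full series first gives a baseline to compare the subsampled betas against.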
________________________________________
From: Jean V Adams [jvadams at usgs.gov]
Sent: Monday, August 06, 2012 1:33 PM
To: White, William Patrick
Cc: r-help at r-project.org
Subject: Re: [R] trouble with looping for effect of sampling interval increase
You would make it much easier for R-help readers to solve your problem if you provided a small example data set with your code, so that we could reproduce your results and troubleshoot the issues.
Jean
Naidraug <white.232 at wright.edu> wrote on 08/05/2012 09:08:25 AM:
>
> I've looked everywhere and tinkered for three days now, so I figure asking
> might be good.
>
> Here's a general rundown of what I am trying to get my code to do. I am
> giving you the whole rundown because I need a solution that retains certain
> ways of doing things, because they give me the information I need.
>
> I want to examine the effect of increasing my sampling interval on my data.
> Example: what if instead of sampling every hour I sampled every two hours, or
> every three, etc., ad nauseam? To do this I take the data I have now and add
> an index to it that contains counters. Those counters look something like
> 1,2,1,2,... for the first one and 1,2,3,1,2,3,... for the next one. I have a
> lot of them, say a thousand.
>
> For each column in the index, my loops should start in the first column,
> run only the ones, store that, then run the twos and store that in the same
> column of output in a different row. Then move to the next column, run the
> ones, store in the next column of output, run the twos, store in the next
> row of that column, run the threes, and so on until there are no more.
>
> I want to use this index for a number of reasons. The first is that after this
> I will be going back through and using a different method for sub-sampling
> while keeping all else the same, so all I have to do there is change the way I
> generate the index. The second is that it allows me to run many subsamples
> and see their range.
>
> The code I have made generates my index and does the heavy lifting correctly,
> as well as my averages and quartiles, but a look at the head() of my key
> output (IntervalBetas) shows that something has gone amiss. You have to look
> closely to catch it. The values generated for each row of output are
> identical. This should not be the case: row one of the first output column
> should be generated from all values indexed by a one in the first index
> column, whereas in column two there are different values indexed by the
> number one. I've checked about everything I can think of, done print() on my
> loop variables (those little i and j) and wiggled about everything. I am
> flummoxed. I think the bit that is messing up is in here:
> #Here is the loop for betas from sampling interval increase
> c <- WHOLESIZE[2]-1
> for (i in 1:c)
> {
> x <- length(unique(index[,i]))
>
> for (j in 1:x)
> {
>
> data <- WHOLE [WHOLE[,x]==j,1]
>
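
The subsetting step in this excerpt can be checked in isolation on a toy series. The sketch below uses the values 1..10 as stand-in data (an assumption, not the real dataset) and builds just two counter columns. As far as I can tell, WHOLE[, x] and WHOLE[, i + 1] pick the same column in the posted code, since index column i holds i + 1 distinct counters once the first column is dropped. Writing i + 1 makes that dependence explicit.

```r
# Toy illustration of the counter index (values 1..10 stand in for real data)
dataset <- 1:10
n <- length(dataset)

# one counter column per interval: 1,2,1,2,... then 1,2,3,1,2,3,...
index <- sapply(2:3, function(k) rep(1:k, length.out = n))
WHOLE <- cbind(dataset, index)

# index column i sits in column i + 1 of WHOLE, because column 1 holds the data
i <- 1
every2_offset1 <- WHOLE[WHOLE[, i + 1] == 1, 1]   # 1 3 5 7 9
every2_offset2 <- WHOLE[WHOLE[, i + 1] == 2, 1]   # 2 4 6 8 10
i <- 2
every3_offset1 <- WHOLE[WHOLE[, i + 1] == 1, 1]   # 1 4 7 10
```

If the subsets come out as expected here, the selection line itself is probably not where the identical rows come from.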
> But also here is the whole code in case I am wrong that that is the problem
> area:
>
> #loop for making index
>
>
> #clean dataset of empty cells
> dataset <- na.omit (datasetORIGINAL)
> #how messed up was the data?
> holeyDATA <- datasetORIGINAL - dataset
>
> D <- dim(dataset)
>
> #what is the smallest sample?
> tinysample <- 100
>
>
>
>
> #how long is the dataset?
> datalength <- length (dataset)
>
>
> #MD <- how many divisions
>
> MD <- datalength/tinysample
>
> #clear things up for the index loop
> WHOLE <- NULL
> index <- NULL
> #do the index loop
>
> for (a in 1:MD)
> {
> index <- cbind (index, rep (1:a, length = D[1]))
> }
> index <- subset(index, select = -c(1) )
>
> #merge dataset and index loop
> WHOLE <- cbind (dataset, index)
>
> WHOLESIZE <- dim (WHOLE)
>
> #Housekeeping before loops
> IntervalBetas <- NULL
>
>
> IntervalBetas <- c(NA,NA)
> IntervalBetas <- as.data.frame (IntervalBetas)
> IntervalLowerQ <- NULL
> IntervalUpperQ <- NULL
> IntervalMean <- NULL
> IntervalMedian <- NULL
>
> #Here is the loop for betas from sampling interval increase
> c <- WHOLESIZE[2]-1
> for (i in 1:c)
> {
> x <- length(unique(index[,i]))
>
> for (j in 1:x)
> {
>
> data <- WHOLE [WHOLE[,x]==j,1]
>
>
>
>
> #get power spectral density
>
> PSDPLOT <- spectrum (data, detrend = TRUE, plot = FALSE)
> frequency <- PSDPLOT$freq
> PSD <- PSDPLOT$spec
> #log transform the power spectral density
> Logfrequency <- log(frequency)
> LogPSD<- log(PSD)
> #fit my line to the data
> Line <- lm (LogPSD ~ Logfrequency)
> #store the slope of the line
> Betas <- rbind (Betas, -coef(Line)[2])
>
> #Get values on the curve shape
> BSkew <- skew (Betas)
> BMean <- mean (Betas)
> BMedian <- median (Betas)
> Q <- quantile (Betas)
>
>
> #store curve shape values
> IntervalLowerQ <- rbind (IntervalLowerQ , Q[2])
> IntervalUpperQ <- rbind (IntervalUpperQ , Q[4])
> IntervalSkew <- rbind (IntervalSkew , BSkew)
> IntervalMean <- rbind (IntervalMean , BMean)
> IntervalMedian <- rbind (IntervalMedian , BMedian)
>
> #Store the Betas
> #This is a pain
>
>
> BetaSave <- Betas
> no.r <- nrow(IntervalBetas)
> l.v <- length(BetaSave)
> difer <- no.r - l.v
> difers <- abs(difer)
> if (no.r < l.v){
> IntervalBetas <- rbind(IntervalBetas,rep(NA,difers))
> }
> else {
> (BetaSave <- rbind(BetaSave,rep(NA,difers)))
> }
>
> IntervalBetas <- cbind (IntervalBetas, BetaSave)
>
>
> }
>
> }
>
> #That ends the loop within a loop for how sampling interval
> #changes beta
> head (IntervalBetas)
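
One possible reading of the symptom: in the loop above, Betas is never initialised or reset, so it keeps growing across index columns, and every saved column begins with the same betas from the first column. That would make the rows of IntervalBetas look identical. The sketch below is one way to restructure the loop under that assumption, not a verified fix. It reuses the sample data generator with a fixed seed, takes 10 as the largest interval (1000 / 100, as MD is computed above), leaves out the skew step, and introduces a hypothetical helper estimate_beta() for the spectrum() and lm() steps.

```r
# Sketch only: same subsampling idea, but Betas is reset for every interval.
set.seed(1)                                   # assumption: fixed seed
dataset <- cumsum(sample(c(-1, 1), 1000, replace = TRUE))
n <- length(dataset)
max_interval <- 10                            # assumption: 1000 / 100, as MD above

# hypothetical helper: negated slope of log PSD against log frequency
estimate_beta <- function(series) {
  psd <- spectrum(series, detrend = TRUE, plot = FALSE)
  unname(-coef(lm(log(psd$spec) ~ log(psd$freq)))[2])
}

# rows = offset within the interval, columns = sampling interval 2..max_interval
IntervalBetas <- matrix(NA_real_, nrow = max_interval, ncol = max_interval - 1,
                        dimnames = list(NULL, paste0("every_", 2:max_interval)))

for (k in 2:max_interval) {
  counter <- rep(1:k, length.out = n)         # 1,2,..,k,1,2,..,k,...
  Betas <- numeric(0)                         # reset here, once per interval
  for (j in 1:k) {
    Betas <- c(Betas, estimate_beta(dataset[counter == j]))
  }
  IntervalBetas[seq_len(k), k - 1] <- Betas   # rows below k stay NA
}

# summaries per interval, computed once after the loops
IntervalMean   <- colMeans(IntervalBetas, na.rm = TRUE)
IntervalMedian <- apply(IntervalBetas, 2, median, na.rm = TRUE)
IntervalLowerQ <- apply(IntervalBetas, 2, quantile, probs = 0.25, na.rm = TRUE)
IntervalUpperQ <- apply(IntervalBetas, 2, quantile, probs = 0.75, na.rm = TRUE)

head(IntervalBetas)
```

Preallocating the output matrix avoids the NA-padding bookkeeping, and swapping in a different sub-sampling method only requires changing how counter is generated.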