[R] Survey Design / Rake questions

Farley, Robert FarleyR at metro.net
Mon Aug 25 18:33:37 CEST 2008


Still no joy.  :-(


I see a number of things that bother me:
  1) str(ByEBNum$StnTraveld) says "int [1:12] 1 2 3 4 5 6 7 8 9 10 ...",
         even though I defined it with "StnTraveld  <- c(as.factor(1:12))".
  2) The output of ByEBOn$StnName[1:5] seems to imply that I have extra spaces in the data.  Where would they have come from?
  3) I'd like to verify that the level order of "EBSurvey$lineon" matches my definition in "StnName".
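
One way these three points might be checked, sketched on toy data rather than the survey file. The station labels come from the transcript, but the objects `codes`, `lab`, `padded` and `same_order` are hypothetical names introduced only for illustration. On point 1: in the R of 2008, c() applied to a factor returned the bare integer codes, which would explain the "int" result (R 4.1.0 and later keep the factor), so the sketch uses as.integer() to show the same loss of class in a version-independent way.

```r
# Toy stand-ins for the objects in the transcript (values are illustrative).
lineon  <- factor(c("Pierce College", "Warner Center", "De Soto"),
                  levels = c("Warner Center", "De Soto", "Pierce College"))
StnName <- c("Warner Center", "De Soto", "Pierce College")

# 1) Test the class directly instead of reading str() output.
StnTraveld <- factor(1:12)
codes      <- as.integer(StnTraveld)  # bare level codes, no longer a factor
is.factor(StnTraveld)
is.factor(codes)

# 2) Count characters: print() pads entries to a common width for display,
#    so compare each label with a copy that has trailing blanks removed.
lab    <- levels(lineon)
padded <- any(nchar(lab) != nchar(sub(" +$", "", lab)))

# 3) Compare the survey factor's level order with the hand-typed vector.
same_order <- identical(levels(lineon), StnName)
```

If `padded` is FALSE, the blanks seen in the printed output are only column alignment and are not stored in the data.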
  

Thanks for helping...


***************************************************************************
***************************************************************************
> library(survey)
> SurveyData <- read.spss("C:/Data/R/orange_delivery.sav", use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
> #===============================================================================
> temp <- sub(' +$', '', SurveyData$direction_) 
> SurveyData$direction_ <- temp
> #===============================================================================
> SurveyData$NumStn=abs(as.numeric(SurveyData$lineon)-as.numeric(SurveyData$lineoff))
> mean(SurveyData$NumStn)
[1] 6.785276
> ### Kludge
> SurveyData$NumStn <- pmax(1,SurveyData$NumStn)
> mean(SurveyData$NumStn)
[1] 6.789877
> SurveyData$NumStn <- as.factor(SurveyData$NumStn)
> ###
> EBSurvey <- subset(SurveyData, direction_ == "EASTBOUND" )
> XTTable <- xtabs(~direction_ , EBSurvey)
> XTTable
direction_
EASTBOUND 
      345 
> WBSurvey <- subset(SurveyData, direction_ == "WESTBOUND" )
> XTTable <- xtabs(~direction_ , WBSurvey)
> XTTable
direction_
WESTBOUND 
      307 
> #
> EBDesign <- svydesign(id=~sampn, weights=~expwgt, data=EBSurvey)
> #   svytable(~lineon+lineoff, EBDesign)
> StnName     <- c( "Warner Center", "De Soto", "Pierce College", "Tampa", "Reseda", "Balboa", "Woodley", "Sepulveda", "Van Nuys", "Woodman", "Valley College", "Laurel Canyon", "North Hollywood")
> EBOnNewTots <- c(            1000,       600,             1200,     500,     1000,      500,       200,         250,       1000,       300,              100,          123.65,                0 )
> StnTraveld  <- c(as.factor(1:12))
> EBNumStn    <- c(673.65,     800, 1000, 1000,  800,  700,  600, 500, 400, 200,  50, 50 )
> ByEBOn  <- data.frame(StnName,   Freq=EBOnNewTots)
> ByEBNum <- data.frame(StnTraveld, Freq=EBNumStn)
> RakedEBSurvey <- rake(EBDesign, list(~lineon, ~NumStn), list(ByEBOn, ByEBNum) )
Error in postStratify.survey.design(design, strata[[i]], population.margins[[i]],  : 
  Stratifying variables don't match
> 
> str(EBSurvey$lineon)
 Factor w/ 13 levels "Warner Center",..: 3 1 1 1 2 13 1 5 1 5 ...
> EBSurvey$lineon[1:5]
[1] Pierce College Warner Center  Warner Center  Warner Center  De Soto       
13 Levels: Warner Center De Soto Pierce College Tampa Reseda Balboa ... North Hollywood
> str(ByEBOn$StnName)
 Factor w/ 13 levels "Balboa","De Soto",..: 11 2 5 8 6 1 12 7 10 13 ...
> ByEBOn$StnName[1:5]
[1] Warner Center  De Soto        Pierce College Tampa          Reseda        
13 Levels: Balboa De Soto Laurel Canyon North Hollywood ... Woodman
> 
> str(EBSurvey$NumStn)
 Factor w/ 12 levels "1","2","3","4",..: 10 12 4 12 8 1 8 8 12 4 ...
> EBSurvey$NumStn[1:5]
[1] 10 12 4  12 8 
Levels: 1 2 3 4 5 6 7 8 9 10 11 12
> str(ByEBNum$StnTraveld)
 int [1:12] 1 2 3 4 5 6 7 8 9 10 ...
> ByEBNum$StnTraveld[1:5]
[1] 1 2 3 4 5
>
***************************************************************************

Robert Farley
Metro
www.Metro.net 


-----Original Message-----
From: Thomas Lumley [mailto:tlumley at u.washington.edu] 
Sent: Saturday, August 23, 2008 09:38
To: Farley, Robert
Cc: r-help at r-project.org
Subject: Re: [R] Survey Design / Rake questions

On Fri, 22 Aug 2008, Farley, Robert wrote:

> I *think* I'm making progress, but I'm still failing at the same step.  My rake call fails with:
> Error in postStratify.survey.design(design, strata[[i]], population.margins[[i]],  :
>  Stratifying variables don't match
>
> To my naïve eyes, it seems that my factors are "in the wrong order".  If so,
> how do I "assert" an ordering in my survey dataframe, or copy an "image" from
> the survey dataframe to my marginals dataframes?  I'd prefer to "pull" the
> original marginals dataframe(s) from the survey dataframe so that I can
> automate that in production.

It looks like a problem with the NumStn factor. One copy has been converted to character and then factor, giving levels in alphabetical order; the other copy has been converted directly to factor, giving levels in numerical order.

If you use as.factor(1:12) rather than as.character(1:12) it should work.

      -thomas
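
The level-ordering point above can be illustrated with a short sketch in base R. It assumes a toy `NumStn` factor in place of the survey column; the frequencies are the EBNumStn values from the transcript, and `alpha_levels` and `num_levels` are hypothetical names. Naming the margin column after the survey variable follows the layout used in the survey package's documented examples, but whether that alone resolves the rake() error here is an assumption, not something the thread confirms.

```r
# Converting to character first sorts the level labels as strings.
alpha_levels <- levels(factor(as.character(1:12)))

# Converting the integers directly keeps numeric order.
num_levels <- levels(factor(1:12))

# Toy stand-in for the survey variable (illustrative values only).
NumStn <- factor(c(10, 12, 4, 12, 8, 1), levels = 1:12)

# Build the margin from the survey factor's own levels, so that labels
# and order agree by construction.
ByEBNum <- data.frame(
  NumStn = factor(levels(NumStn), levels = levels(NumStn)),
  Freq   = c(673.65, 800, 1000, 1000, 800, 700, 600, 500, 400, 200, 50, 50)
)

# TRUE when the margin and the survey variable share identical levels.
identical(levels(ByEBNum$NumStn), levels(NumStn))
```

Pulling the levels from the survey data frame in this way also avoids retyping them by hand, which is closer to the automated workflow asked about in the quoted message.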



> If that's not my problem, where might I look for enlightenment?  Neither "?why" nor ?whatamimissing return citations.  :-)
>


