[R] How do I read a text (.csv) file to match a matrix/cross tab? (Object confusion??)

Kellie Wills kellie at revolution-computing.com
Thu Nov 6 01:41:20 CET 2008


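read.table() doesn't know that the first column should be the row names,
so it reads tripid_nu as an ordinary data column (that is why
str(NewTargetX1Sums) shows 14 columns and NULL row names).  Tell it which
column holds the row names:

NewTargetData <- read.table("C:/Data/R/NewTarget.csv", header=TRUE, sep=",",
                            na.strings="NA", dec=".", row.names="tripid_nu")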
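
To check the result, here is a minimal sketch that uses a cut-down inline copy of the data instead of C:/Data/R/NewTarget.csv. The three-station csv_text below is an assumption for illustration, and read.table(text=...) stands in for the file path. With row.names set, tripid_nu becomes the row names rather than a data column, so it is no longer added into the row sums:

```r
# Cut-down stand-in for NewTarget.csv (assumed sample, three stations only)
csv_text <- "tripid_nu,Warner Center,De Soto,Tampa
9011880,5,2,2
9011890,1,1,1
9011960,2,2,1"

# Same call as above, reading from the inline text instead of the file
NewTargetData <- read.table(text = csv_text, header = TRUE, sep = ",",
                            na.strings = "NA", dec = ".",
                            row.names = "tripid_nu")

NewTargetX1Sums <- as.matrix(NewTargetData)

# Row totals per trip; tripid_nu is a row name now, not a summed column
NewTargetX2Sums <- rowSums(NewTargetX1Sums)

str(NewTargetX1Sums)        # dimnames now carry the trip ids
rownames(NewTargetX1Sums)
NewTargetX2Sums
```

rowSums() gives the same totals as apply(NewTargetX1Sums, 1, sum).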

Kellie Wills
Engineering Service Manager
REvolution Computing
kellie at revolution-computing.com


On Nov 5, 2008, at 3:51 PM, Farley, Robert wrote:

> I'm having a problem reading data to set control totals for a dataframe.
> I want to adjust a dataframe based on a 2-d table of values, which I get
> by using:
>
>> CurrentX1Sums <- as.matrix(xtabs(~tripid_nu+lineon, data=SurveyData))
>
>
>> CurrentX2Sums <- apply(CurrentX1Sums, 1, sum)
>
>
>
> I've created a .csv file with new (target) sums that looks like this:
>
> tripid_nu  Warner Center  De Soto  Pierce College  Tampa  Reseda  Balboa  Woodley  Sepulveda  Van Nuys  Woodman  Valley College  Laurel Canyon  North Hollywood
> 9011880                5        2               2      2       2       2        2          2         2        2               6              4                1
> 9011890                1        1               1      1       1       1        2          1         1        1               1              2                1
> 9011960                2        2               2      1       2       2        1          2         3        2               2              1                1
> 9011970                1        1               1      1       2       1        1          2         6        2               2              2               24
> 9012040                2        2               2      3       2       7        2          2         2        1               1              1                1
> 9012050                1        1               1      1       1       1        1          2         2        2               1              1                5
>
>  ...{More}...
>
>
>
> I'm trying to read/process it like this:
>
>> NewTargetData  <- read.table("C:/Data/R/NewTarget.csv", header=TRUE, sep=",", na.strings="NA", dec=".")
>> NewTargetX1Sums <- as.matrix(NewTargetData)
>> NewTargetX2Sums <- apply(NewTargetX1Sums, 1, sum)
>
>
>
>
>
> The structures of CurrentX1Sums and NewTargetX1Sums are different:
>
>
>
>> str(CurrentX1Sums)
> xtabs [1:55, 1:13] 1 0 1 0 1 0 0 0 0 1 ...
> - attr(*, "dimnames")=List of 2
>  ..$ tripid_nu: chr [1:55] "9011880" "9011890" "9011960" "9011970" ...
>  ..$ lineon   : chr [1:13] "Warner Center" "De Soto" "Pierce College" "Tampa" ...
> - attr(*, "class")= chr [1:2] "xtabs" "table"
> - attr(*, "call")= language xtabs(formula = ~tripid_nu + lineon, data = SurveyData)
>
>> str(NewTargetX1Sums)
> int [1:55, 1:14] 9011880 9011890 9011960 9011970 9012040 9012050 9012130 9012280 9012290 9012720 ...
> - attr(*, "dimnames")=List of 2
>  ..$ : NULL
>  ..$ : chr [1:14] "tripid_nu" "Warner.Center" "De.Soto" "Pierce.College" ...
>
>>
>
>
>
>
>
>
>
>
>
> Question 1) The structures of CurrentX1Sums and NewTargetX1Sums are
> different.  One difference is in the row names: instead of row
> numbers, I want tripid_nu.  How do I do that?  What's the
> appropriate "structure" for both?
>
>
>
>
>
> Question 2) Why do the labels in NewTargetData have dots in place of
> spaces?  Will that be a problem later when I try to match them with
> SurveyData?
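
On Question 2: by default read.table() uses check.names=TRUE, which passes the header through make.names(), and that turns spaces into dots so the names are syntactically valid R names. The xtabs dimnames keep the spaces, so the two sets of labels would not match as-is. One option, sketched here on an assumed two-station sample rather than the real file, is check.names=FALSE:

```r
# make.names() is what read.table() applies to the header when check.names = TRUE
make.names(c("Warner Center", "De Soto"))

# Assumed two-station sample standing in for NewTarget.csv
csv_text <- "tripid_nu,Warner Center,De Soto
9011880,5,2
9011890,1,1"

# Default behaviour: spaces in the header become dots
with_dots <- read.table(text = csv_text, header = TRUE, sep = ",",
                        row.names = "tripid_nu")

# check.names = FALSE keeps the header exactly as written in the file
with_spaces <- read.table(text = csv_text, header = TRUE, sep = ",",
                          row.names = "tripid_nu", check.names = FALSE)

colnames(with_dots)
colnames(with_spaces)
```

With the spaces preserved, the column names can be compared directly with levels(SurveyData$lineon).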
>
>
>
>
>
> Question 3) Ultimately, I want to create a variable in the original
> dataframe like:
>
>           SurveyData$NewX1 = NewTargetX1Sums / CurrentX1Sums   {for each tripid_nu, lineon combination}
>
> Am I on the right track to do so?  Any hints on what THAT syntax will
> look like?
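
On Question 3, one possible approach (a sketch under assumptions, not the only way): once both tables are plain matrices with the same row and column names, divide them cell by cell and look up each survey row's ratio by its (tripid_nu, lineon) pair. The miniature SurveyData and target matrix below are made up for illustration, and unclass() is my own choice to get a plain matrix out of xtabs():

```r
# Made-up miniature of SurveyData: one row per surveyed rider
SurveyData <- data.frame(
  tripid_nu = c(9011880, 9011880, 9011880, 9011890, 9011890),
  lineon    = factor(c("Warner Center", "Tampa", "Tampa", "Tampa", "De Soto"),
                     levels = c("Warner Center", "De Soto", "Tampa"))
)

# Made-up target totals with trip ids as row names and station names as column names
NewTargetX1Sums <- matrix(c(5, 1,    # Warner Center
                            2, 1,    # De Soto
                            2, 1),   # Tampa
                          nrow = 2,
                          dimnames = list(c("9011880", "9011890"),
                                          c("Warner Center", "De Soto", "Tampa")))

# Current counts per (trip, station) cell as a plain matrix
CurrentX1Sums <- unclass(xtabs(~ tripid_nu + lineon, data = SurveyData))

# Put the target in the same row/column order as the current counts
Target <- NewTargetX1Sums[rownames(CurrentX1Sums), colnames(CurrentX1Sums)]

# Cell-by-cell ratio; cells with no survey rows give Inf but are never looked up
Ratio <- Target / CurrentX1Sums

# Two-column character matrix: one (row name, column name) pair per survey row
idx <- cbind(as.character(SurveyData$tripid_nu),
             as.character(SurveyData$lineon))
SurveyData$NewX1 <- Ratio[idx]

SurveyData
```

Summing NewX1 within each (tripid_nu, lineon) cell should then reproduce the target total for every cell that has at least one survey row.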
>
>
>
>
>
>
>
>
>
> Thanks in advance,
>
>
>
>
>
>
>
>
>
>
>
> ##############################################################################################################
>
> #My work to date:
>
>> SurveyData <- read.spss("C:/Data/R/orange_delivery.sav", use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
>> NewTargetData  <- read.table("C:/Data/R/NewTarget.csv", header=TRUE, sep=",", na.strings="NA", dec=".")
>> #-------------------------------------------------------------------------------
>> temp <- sub(' +$', '', SurveyData$direction_)       # Remove trailing spaces from the values
>> SurveyData$direction_ <- temp
>> #-------------------------------------------------------------------------------
>> SurveyData$StnNum=as.numeric(SurveyData$lineon)
>> CurrentX1Sums <- as.matrix(xtabs(~tripid_nu+lineon, data=SurveyData))
>> CurrentX2Sums <- apply(CurrentX1Sums, 1, sum)
>> NewTargetX1Sums <- as.matrix(NewTargetData)
>> NewTargetX2Sums <- apply(NewTargetX1Sums, 1, sum)
>
>>
>
>
>>
>
>
>
>> CurrentX1Sums
>          lineon
> tripid_nu  Warner Center  De Soto  Pierce College  Tampa  Reseda  Balboa  Woodley  Sepulveda  Van Nuys  Woodman  Valley College  Laurel Canyon  North Hollywood
> 9011880                1        0               2      1       0       2        1          0         0        0               1              0                0
> 9011890                0        0               0      0       0       0        1          0         0        0               0              1                0
> 9011960                1        1               2      0       1       1        0          1         3        2               1              0                0
> 9011970                0        0               0      0       1       0        0          1         6        1               1              1               14
> ...{More}...
>
>> NewTargetX1Sums
>       tripid_nu  Warner.Center  De.Soto  Pierce.College  Tampa  Reseda  Balboa  Woodley  Sepulveda  Van.Nuys  Woodman  Valley.College  Laurel.Canyon  North.Hollywood
> [1,]    9011880              5        2               2      2       2       2        2          2         2        2               6              4                1
> [2,]    9011890              1        1               1      1       1       1        2          1         1        1               1              2                1
> [3,]    9011960              2        2               2      1       2       2        1          2         3        2               2              1                1
> [4,]    9011970              1        1               1      1       2       1        1          2         6        2               2              2               24
> ...{More}...
>
>
>
>
>
>
>
> ##############################################################################################################
>
>
>
>
>
>
>
>
>
> Robert Farley
>
> Metro
>
> 1 Gateway Plaza
>
> Mail Stop 99-23-7
>
> Los Angeles, CA 90012-2952
>
> Voice: (213)922-2532
>
> Fax:    (213)922-2868
>
> www.Metro.net
>
>
>
>
>
>


