[R] How do I read a text (.csv) file to match a matrix/cross tab? (Object confusion??)
Kellie Wills
kellie at revolution-computing.com
Thu Nov 6 01:41:20 CET 2008
read.table doesn't realize the first column should be row names. Try
read.table("C:/Data/R/NewTarget.csv", header=TRUE, sep=",",
na.strings="NA", dec=".", row.names="tripid_nu")
Kellie Wills
Engineering Service Manager
REvolution Computing
kellie at revolution-computing.com
On Nov 5, 2008, at 3:51 PM, Farley, Robert wrote:
> I'm having a problem reading data to set control totals for a
> dataframe.
> I want to adjust a dataframe based on a 2-d table of values, which I
> get
> by using :
>
>> CurrentX1Sums <- as.matrix(xtabs(~tripid_nu+lineon, data=SurveyData))
>
>
>> CurrentX2Sums <- apply(CurrentX1Sums, 1, sum)
>
>
>
> I've created a .csv file with new (target) sums that looks like this:
>
> tripid_nu Warner Center De Soto Pierce College Tampa
> Reseda Balboa Woodley Sepulveda Van Nuys
> Woodman
> Valley College Laurel Canyon North Hollywood
>
> 9011880 5 2 2 2 2 2 2
> 2 2 2 6 4 1
>
> 9011890 1 1 1 1 1 1 2
> 1 1 1 1 2 1
>
> 9011960 2 2 2 1 2 2 1
> 2 3 2 2 1 1
>
> 9011970 1 1 1 1 2 1 1
> 2 6 2 2 2 24
>
> 9012040 2 2 2 3 2 7 2
> 2 2 1 1 1 1
>
> 9012050 1 1 1 1 1 1 1
> 2 2 2 1 1 5
>
> ...{More}...
>
>
>
> I'm trying to read/process it like this:
>
>> NewTargetData <- read.table("C:/Data/R/NewTarget.csv", header=TRUE,
> sep=",", na.strings="NA", dec=".")
>
>> NewTargetX1Sums <- as.matrix(NewTargetData)
>
>> NewTargetX2Sums <- apply(NewTargetX1Sums, 1, sum)
>
>
>
>
>
> The structures of CurrentX1Sums and NewTargetX1Sums are different:
>
>
>
>> str(CurrentX1Sums)
>
> xtabs [1:55, 1:13] 1 0 1 0 1 0 0 0 0 1 ...
>
> - attr(*, "dimnames")=List of 2
>
> ..$ tripid_nu: chr [1:55] "9011880" "9011890" "9011960" "9011970" ...
>
> ..$ lineon : chr [1:13] "Warner Center" "De Soto" "Pierce College"
> "Tampa" ...
>
> - attr(*, "class")= chr [1:2] "xtabs" "table"
>
> - attr(*, "call")= language xtabs(formula = ~tripid_nu + lineon,
> data =
> SurveyData)
>
>
>
>> str(NewTargetX1Sums)
>
> int [1:55, 1:14] 9011880 9011890 9011960 9011970 9012040 9012050
> 9012130 9012280 9012290 9012720 ...
>
> - attr(*, "dimnames")=List of 2
>
> ..$ : NULL
>
> ..$ : chr [1:14] "tripid_nu" "Warner.Center" "De.Soto"
> "Pierce.College" ...
>
>>
>
>
>
>
>
>
>
>
>
> Question 1) The structures (CurrentX1Sums , NewTargetX1Sums) are
> different. One way is in the dimension of the rownames. Instead of
> line numbers, I want tripid_nu. How do I do that? What's the
> appropriate "structure" for both?
>
>
>
>
>
> Question 2) Why do the labels in NewTargetData have dots in place of
> spaces? Will that be a problem later when I try to match them with
> SurveyData?
>
>
>
>
>
> Question 3) Ultimately, I want to create a variable in the original
> dataframe like:
>
> SurveyData$NewX1 = TargetX1Sums/ CurrentX1Sums { for each
> tripid_nu, lineon combination}
>
> Am I on the right track to do so? Any hints on what THAT syntax will
> look like?
>
>
>
>
>
>
>
>
>
> Thanks in advance,
>
>
>
>
>
>
>
>
>
>
>
> ########################################################################
> ######################################
>
> #My work to date:
>
>> SurveyData <- read.spss("C:/Data/R/orange_delivery.sav",
> use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
>
>> NewTargetData <- read.table("C:/Data/R/NewTarget.csv", header=TRUE,
> sep=",", na.strings="NA", dec=".")
>
>>
> #-----------------------------------------------------------------------
> --------
>
>> temp <- sub(' +$', '', SurveyData$direction_) # Remove spaces
> from variable names
>
>> SurveyData$direction_ <- temp
>
>>
> #-----------------------------------------------------------------------
> --------
>
>> SurveyData$StnNum=as.numeric(SurveyData$lineon)
>
>> CurrentX1Sums <- as.matrix(xtabs(~tripid_nu+lineon, data=SurveyData))
>
>
>> CurrentX2Sums <- apply(CurrentX1Sums, 1, sum)
>
>> NewTargetX1Sums <- as.matrix(NewTargetData)
>
>> NewTargetX2Sums <- apply(NewTargetX1Sums, 1, sum)
>
>>
>
>> str(CurrentX1Sums)
>
> xtabs [1:55, 1:13] 1 0 1 0 1 0 0 0 0 1 ...
>
> - attr(*, "dimnames")=List of 2
>
> ..$ tripid_nu: chr [1:55] "9011880" "9011890" "9011960" "9011970" ...
>
> ..$ lineon : chr [1:13] "Warner Center" "De Soto" "Pierce College"
> "Tampa" ...
>
> - attr(*, "class")= chr [1:2] "xtabs" "table"
>
> - attr(*, "call")= language xtabs(formula = ~tripid_nu + lineon,
> data =
> SurveyData)
>
>> str(NewTargetX1Sums)
>
> int [1:55, 1:14] 9011880 9011890 9011960 9011970 9012040 9012050
> 9012130 9012280 9012290 9012720 ...
>
> - attr(*, "dimnames")=List of 2
>
> ..$ : NULL
>
> ..$ : chr [1:14] "tripid_nu" "Warner.Center" "De.Soto"
> "Pierce.College" ...
>
>>
>
>
>
>> CurrentX1Sums
>
> lineon
>
> tripid_nu Warner Center De Soto Pierce College Tampa Reseda Balboa
> Woodley Sepulveda Van Nuys Woodman Valley College Laurel Canyon North
> Hollywood
>
> 9011880 1 0 2 1 0 2
> 1 0 0 0 1 0
> 0
>
> 9011890 0 0 0 0 0 0
> 1 0 0 0 0 1
> 0
>
> 9011960 1 1 2 0 1 1
> 0 1 3 2 1 0
> 0
>
> 9011970 0 0 0 0 1 0
> 0 1 6 1 1 1
> 14
>
> ...{More}...
>
>> NewTargetX1Sums
>
> tripid_nu Warner.Center De.Soto Pierce.College Tampa Reseda
> Balboa
> Woodley Sepulveda Van.Nuys Woodman Valley.College Laurel.Canyon
> North.Hollywood
>
> [1,] 9011880 5 2 2 2
> 2 2
> 2 2 2 2 6 4
> 1
>
> [2,] 9011890 1 1 1 1
> 1 1
> 2 1 1 1 1 2
> 1
>
> [3,] 9011960 2 2 2 1
> 2 2
> 1 2 3 2 2 1
> 1
>
> [4,] 9011970 1 1 1 1
> 2 1
> 1 2 6 2 2 2
> 24
>
> ...{More}...
>
>
>
>
>
>
>
> ########################################################################
> ######################################
>
>
>
>
>
>
>
>
>
> Robert Farley
>
> Metro
>
> 1 Gateway Plaza
>
> Mail Stop 99-23-7
>
> Los Angeles, CA 90012-2952
>
> Voice: (213)922-2532
>
> Fax: (213)922-2868
>
> www.Metro.net
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list