[R] Building factors across two columns, is this possible?
David Winsemius
dwinsemius at comcast.net
Sat Nov 24 18:35:50 CET 2012
On Nov 23, 2012, at 8:42 PM, Brian Feeny wrote:
>
> I am trying to make it so two columns with similar data use the same
> internal numbers for same factors, here is the example:
>
>> read.csv("test.csv", header = FALSE, sep = ",")
>      V1    V2       V3
> 1   sun  moon    stars
> 2 stars  moon      sun
> 3   cat   dog   catdog
> 4   dog  moon      sun
> 5  bird plane superman
> 6  1000   dog     2000
>> data <- read.csv("test.csv",header =FALSE,sep=",")
>> str(data)
> 'data.frame': 6 obs. of 3 variables:
> $ V1: Factor w/ 6 levels "1000","bird",..: 6 5 3 4 2 1
> $ V2: Factor w/ 3 levels "dog","moon","plane": 2 2 1 2 3 1
> $ V3: Factor w/ 5 levels "2000","catdog",..: 3 4 2 4 5 1
>
>> as.numeric(data$V1)
> [1] 6 5 3 4 2 1
>> as.numeric(data$V2)
> [1] 2 2 1 2 3 1
>> as.factor(data$V1)
> [1] sun stars cat dog bird 1000
> Levels: 1000 bird cat dog stars sun
>> as.factor(data$V2)
> [1] moon moon dog moon plane dog
> Levels: dog moon plane
>
>
> So notice "dog" is 4 in V1, yet it is 1 in V2. Is there a way, either
> on import or afterwards, to have factors computed for both columns and
> assigned the same internal values?
> dat[] <- lapply(dat, function(x) factor(as.character(x),
                                          levels = levels(unlist(dat))))
> dat
     V1    V2       V3
1   sun  moon    stars
2 stars  moon      sun
3   cat   dog   catdog
4   dog  moon      sun
5  bird plane superman
6  1000   dog     2000
> levels(dat[[1]])
[1] "1000" "bird" "cat" "dog" "stars" "sun"
[7] "moon" "plane" "2000" "catdog" "superman"
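For anyone who wants to reproduce this without the CSV file, here is a minimal sketch. It assumes the data are entered inline through read.csv(text = ...) and that the columns are read as factors (stringsAsFactors = TRUE, which was the default when this thread was written but is not in R 4.0.0 and later). The names txt and all_levels and the check at the end are illustrative additions, not part of the original session.

```r
# Inline copy of the example data (stands in for test.csv)
txt <- "sun,moon,stars
stars,moon,sun
cat,dog,catdog
dog,moon,sun
bird,plane,superman
1000,dog,2000"

# stringsAsFactors = TRUE reproduces the factor columns shown by str() above
dat <- read.csv(text = txt, header = FALSE, stringsAsFactors = TRUE)

# unlist() on a list of factors returns one factor whose levels are
# the union of the individual level sets
all_levels <- levels(unlist(dat))

# Rebuild every column against the shared level set
dat[] <- lapply(dat, function(x) factor(as.character(x), levels = all_levels))

# Integer codes for "dog" in each column; they should now agree
as.integer(dat$V1)[dat$V1 == "dog"]
as.integer(dat$V2)[dat$V2 == "dog"]
```

If the columns had been read as character vectors instead, levels(unlist(dat)) would be NULL, so under that assumption the shared set would have to be built some other way, for example with unique(unlist(dat)).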
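I see your "clarification". Reordering the representation can be done by
rebuilding each column with the levels in the order you want:
factor(x, levels = <character vector>)
Assigning with levels(x) <- <character vector> renames the existing levels
by position rather than reordering them.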
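A minimal sketch of reordering the shared levels, assuming a small stand-in data frame (the values and the names shared, new_order, reordered and relabelled are illustrative, not from the thread). It rebuilds each column with factor(), and contrasts that with an assignment to levels() on a single column, which renames levels by position.

```r
# Small stand-in data frame with factor columns (hypothetical values)
dat <- data.frame(V1 = c("sun", "dog"), V2 = c("moon", "dog"),
                  stringsAsFactors = TRUE)

# Give both columns the same level set: "dog" "sun" "moon"
shared <- levels(unlist(dat))
dat[] <- lapply(dat, function(x) factor(as.character(x), levels = shared))

# Desired new order of the levels (here simply alphabetical)
new_order <- sort(shared)

# Reorder: rebuild each factor so values keep their labels
# while the integer codes follow new_order
reordered <- dat
reordered[] <- lapply(dat, function(x) factor(as.character(x), levels = new_order))

# By contrast, assigning to levels() renames by position,
# so the values themselves change
relabelled <- dat$V1
levels(relabelled) <- new_order

as.character(reordered$V1)   # labels unchanged
as.character(relabelled)     # labels changed
```

Rebuilding through as.character() keeps each observation's label fixed and only changes which integer stands for it, which seems to be what the original question is after.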
--
David Winsemius, MD
Alameda, CA, USA