[R] question: data.frame data conversion
Rui Barradas
ruipbarradas at sapo.pt
Sun Aug 4 21:47:43 CEST 2013
Hello,
You're insisting on as.data.frame(cbind(...)). Don't do that. Just see
the difference:
z = data.frame(x, y)
str(z)
'data.frame': 8 obs. of 2 variables:
$ x: Factor w/ 3 levels "a","b","c": 1 1 1 2 2 2 3 3
$ y: num 1 1.2 1.1 1.01 1.03 1 2 3
z2 = as.data.frame(cbind(x,y))
str(z2)
'data.frame': 8 obs. of 2 variables:
$ x: Factor w/ 3 levels "a","b","c": 1 1 1 2 2 2 3 3
$ y: Factor w/ 7 levels "1","1.01","1.03",..: 1 5 4 2 3 1 6 7
What happens is that cbind creates a matrix from x and y. A matrix can
hold only one type, so everything is converted to character and y is no
longer numeric. Then as.data.frame converts the strings to factors,
which is the default behavior (stringsAsFactors = TRUE).
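
To see the coercion step by step, here is a small sketch using the same
x and y as above. The names m, z_ok and f are only for illustration.

```r
x <- c("a", "a", "a", "b", "b", "b", "c", "c")
y <- c(1.0, 1.2, 1.1, 1.01, 1.03, 1.0, 2.0, 3.0)

# cbind() on two atomic vectors builds a matrix, and a matrix has a
# single type, so the numbers are coerced to character strings.
m <- cbind(x, y)
is.character(m)      # TRUE

# data.frame() keeps each column's own type.
z_ok <- data.frame(x, y)
is.numeric(z_ok$y)   # TRUE

# If y has already become a factor, convert through the labels.
# as.numeric() on the factor itself returns the integer level codes.
f <- factor(m[, "y"])
as.numeric(as.character(f))   # the original values
as.numeric(f)                 # 1 5 4 2 3 1 6 7, the codes, not the data
```

The last line gives the same codes that str(z2) shows for y, which is
why arithmetic on such a column silently produces wrong results.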
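As for your question, change fun() to the following (reshape2 must be
loaded, as before).
fun <- function(z){
    # split on the column of z, not on a global x
    zs <- split(z, z$x)
    # number of rows in each group; the groups may differ in size
    m <- sapply(zs, nrow)
    # row index within each group (assumes the rows of z are ordered by x)
    id <- unlist(lapply(m, seq_len))
    zz <- cbind(id, z)
    # one column per value of x, NA where a group is shorter; drop id
    dcast(zz, id ~ x, value.var = "y")[-1]
}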
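
If you would rather not depend on reshape2, the same wide layout can be
sketched in base R. This is only an alternative under the assumption
that the columns are named x and y as in your example; fun_base is a
hypothetical helper name, and lengths() needs R 3.2.0 or later.

```r
x <- c("a", "a", "a", "b", "b", "b", "c", "c")
y <- c(1.0, 1.2, 1.1, 1.01, 1.03, 1.0, 2.0, 3.0)
z <- data.frame(x, y)

# Hypothetical helper, not part of reshape2.
fun_base <- function(z){
    ys <- split(z$y, z$x)              # one numeric vector per group
    n <- max(lengths(ys))              # size of the largest group
    ys <- lapply(ys, `length<-`, n)    # pad shorter groups with NA
    as.data.frame(ys)
}

res <- fun_base(z)
res
```

Here the column for "c" ends with NA because that group has only two
values. Unlike the version above, this does not need the rows of z to
be ordered by x.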
Hope this helps,
Rui Barradas
On 04-08-2013 20:34, Brijesh Gulati wrote:
> Hello Rui: Thanks for the solution. It does work to the specification. Just
> one follow-up. I get an error if the number of repeating values is
> different. For instance, in the following example, "c" is repeated only 2
> times, whereas "a" and "b" are repeated three times. I am fine with the
> output showing NA for the missing values. Any help would be greatly
> appreciated.
>
> x = c("a","a", "a", "b","b","b", "c", "c")
> y = c(1.0, 1.2, 1.1, 1.01, 1.03, 1.0, 2.0, 3.0)
> z = as.data.frame(cbind(x,y))
>
>   x    y
> 1 a    1
> 2 a  1.2
> 3 a  1.1
> 4 b 1.01
> 5 b 1.03
> 6 b    1
> 7 c    2
> 8 c    3
>
> -----Original Message-----
> From: Rui Barradas [mailto:ruipbarradas at sapo.pt]
> Sent: Sunday, August 04, 2013 1:57 PM
> To: Brijesh Gulati
> Cc: r-help at r-project.org
> Subject: Re: [R] question: data.frame data conversion
>
> Hello,
>
> First of all, do _not_ create a data frame with
>
> as.data.frame(cbind(...))
>
> Instead, use
>
> z = data.frame(x, y)
>
> As for your question, try the following.
>
>
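> library(reshape2)
> fun <- function(z){
>     # split on the column of z, not on a global x
>     zs <- split(z, z$x)
>     n <- length(zs)
>     # assumes every group has as many rows as the first one
>     m <- nrow(zs[[1]])
>     zz <- cbind(id = rep(1:m, n), z)
>     # one column per value of x; drop the id column
>     dcast(zz, id ~ x, value.var = "y")[-1]
> }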
>
> fun(z)
>
>
> Hope this helps,
>
> Rui Barradas
>
>
> On 04-08-2013 13:49, Brijesh Gulati wrote:
>> Hello, I have a data.frame with repeating rows and corresponding
>> value. For instance, "z" will be an example of that.
>>
>> x = c("a","a", "a", "b","b","b")
>> y = c(1.0, 1.2, 1.1, 1.01, 1.03, 1.0)
>> z = as.data.frame(cbind(x,y))
>>
>> > z
>>   x    y
>> 1 a    1
>> 2 a  1.2
>> 3 a  1.1
>> 4 b 1.01
>> 5 b 1.03
>> 6 b    1
>>
>> So, you see that "a" and "b" are repeated 3 times and have three
>> different values. I would like to convert this data into something like
>> the following.
>>
>>
>>
>> a b
>> 1.0 1.01
>> 1.2 1.03
>> 1.1 1.00
>>
>>
>>
>> In the above, repeating rows (a,b) become columns and their values
>> show up in their respective column.
>>
>> Finally, to clarify a few things. The number of rows of each repeating
>> item (a or b) would be the same and hence, the number of rows expected
>> in the output shall be the same.