[R] Does SQL group by have a heavy duty equivalent in R
Farrel Buchinsky
fjbuch at gmail.com
Sun Dec 31 22:16:55 CET 2006
I converted the whole data frame to character with as.matrix, then
followed a posting that explained how to restore the names, which are
lost in the conversion to a matrix.
melt still insisted on including every column I did not list as an id
among the measured variables. In other words, it would not let me drop
any columns, despite this call:
melted<-melt(BigDF, id=c("SAMPLE_ID","ASSAY_ID"),
measured=c("GENOTYPE_ID","DESCRIPTION"))
unique(melted$variable)
[1] CUSTOMER PROJECT PLATE EXPERIMENT CHIP WELL_POSITION GENOTYPE_ID DESCRIPTION ENTRY_OPERATOR
[10] INTERACT PLATEc
Levels: CUSTOMER PROJECT PLATE EXPERIMENT CHIP WELL_POSITION GENOTYPE_ID DESCRIPTION ENTRY_OPERATOR INTERACT PLATEc
I should have got only GENOTYPE_ID and DESCRIPTION.
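
One way to sidestep the problem, sketched here in base R only, is to drop the unwanted columns before reshaping, so that only the id and measured columns are left. The toy BigDF below is invented for illustration (only the column names come from the output above), and the long format is built by hand rather than with melt, so the sketch makes no assumption about how melt treats its arguments.

```r
# Toy stand-in for BigDF; the values are invented for illustration.
BigDF <- data.frame(
  SAMPLE_ID   = c("s1", "s1", "s2"),
  ASSAY_ID    = c("a1", "a2", "a1"),
  GENOTYPE_ID = c("AA", "AG", "GG"),
  DESCRIPTION = c("ok", "ok", "low signal"),
  PLATE       = c("p1", "p1", "p2"),
  stringsAsFactors = FALSE
)

id_cols       <- c("SAMPLE_ID", "ASSAY_ID")
measured_cols <- c("GENOTYPE_ID", "DESCRIPTION")

# Keep only the id and measured columns; everything else is dropped here.
SmallDF <- BigDF[, c(id_cols, measured_cols)]

# Hand-built long format: one block of rows per measured column,
# each block carrying the id columns plus a variable name and its value.
long <- do.call(rbind, lapply(measured_cols, function(v) {
  data.frame(SmallDF[id_cols], variable = v, value = SmallDF[[v]],
             stringsAsFactors = FALSE)
}))

unique(long$variable)
```

With the extra columns removed up front, a reshaping function that treats every non-id column as measured has nothing left to include by mistake.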
"hadley wickham" <h.wickham at gmail.com> wrote in message
news:f8e6ff050612310758p11f96c0dl256ac5b15d11dc2c at mail.gmail.com...
>> nr.attempts <- aggregate(RawSeq$GENOTYPE_ID, list(sample=RawSeq$SAMPLE_ID, assay=RawSeq$ASSAY_ID), length)
>> This was simply to figure out how many times the same piece of
>> information had been obtained. I ran out of patience. It took beyond
>> forever and tapply did not perform much better. The reshape package
>> did not help - it implied one was out of luck if the data was not
>> numeric. All of my data is character or factor.
>
> The reshape package will work if all your data is numeric, or all of
> it is character - it doesn't work with a mix. I will try and make
> this more clear in the documentation.
> However, depending on the size and structure of your data it may not
> be any faster than tapply or aggregate.
>
> Hadley
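
For the counting task quoted above (how many rows exist per sample and assay, the analogue of SQL's GROUP BY with COUNT), a minimal base-R sketch uses table() instead of aggregate(). The toy RawSeq is invented for illustration, and no speed claim is made: whether this beats aggregate or tapply on the real data would need to be timed.

```r
# Toy stand-in for RawSeq; the values are invented for illustration.
RawSeq <- data.frame(
  SAMPLE_ID   = c("s1", "s1", "s1", "s2"),
  ASSAY_ID    = c("a1", "a1", "a2", "a1"),
  GENOTYPE_ID = c("AA", "AA", "AG", "GG"),
  stringsAsFactors = FALSE
)

# Cross-tabulate the two grouping columns. table() counts rows and
# works directly on character or factor columns.
counts <- as.data.frame(
  table(sample = RawSeq$SAMPLE_ID, assay = RawSeq$ASSAY_ID),
  responseName = "attempts",
  stringsAsFactors = FALSE
)

# The cross-tabulation has a cell for every sample/assay pair,
# so keep only the combinations that actually occur.
nr.attempts <- counts[counts$attempts > 0, ]
nr.attempts
```

Two caveats: table() drops rows with NA in a grouping column by default, and the full cross-tabulation holds one cell per possible sample/assay pair, which can get large when both columns have many distinct values.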