[R] splitting dataframe, assign to new dataframe, add new rows to new dataframe

Ista Zahn istazahn at gmail.com
Tue Oct 13 13:57:51 CEST 2009


I'm sure there's a really cool way to do this with plyr, although I
don't know that my plyr version is much better than Charlie's base R
approach quoted below. Anyway, here it is:

cmbine <- read.csv(textConnection('names, mass, classes
apple,0.50,1
tiger,100.00,2
pencil,0.01,3
chicken,1.00,2
banana,0.15,1
pear,0.30,1'))

library(plyr)

dfl <- list()

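# Loop up to the size of the largest class, not the largest class label
for(i in seq_len(max(table(cmbine$classes)))) {
  # i-th row of each class; classes with fewer than i rows yield an NA row
  ithRows <- ddply(cmbine, .(classes), function(x) {x[i,]})
  # Drop those all-NA rows
  dfl[[i]] <- ithRows[!is.na(ithRows$names), ]
}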

dfl
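
For comparison, here is a minimal base R sketch of the same idea without
the loop. It assumes a reasonably recent R (for the text argument of
read.csv); "occurrence" and "dfs" are illustrative names, not anything
from the thread. Rows stay in their original order rather than being
sorted by class, and no NA rows are created in the first place:

```r
# Same example data as above, with the header written without spaces.
cmbine <- read.csv(text = "names,mass,classes
apple,0.50,1
tiger,100.00,2
pencil,0.01,3
chicken,1.00,2
banana,0.15,1
pear,0.30,1")

# occurrence[k] is 1 for the first row of its class, 2 for the second,
# and so on (ave() applies seq_along within each class).
occurrence <- ave(seq_len(nrow(cmbine)), cmbine$classes, FUN = seq_along)

# One data frame per occurrence number.
dfs <- split(cmbine, occurrence)

# Name the pieces df1, df2, ... as in the original question.
names(dfs) <- paste0("df", names(dfs))

dfs
```

Individual pieces can then be pulled out as dfs$df1, dfs$df2, and so on,
which is usually easier to work with than separate top-level variables.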

Hope it helps,
Ista

On Mon, Oct 12, 2009 at 10:02 PM, cls59 <chuck at sharpsteen.net> wrote:
>
>
>
> wk yeo wrote:
>>
>>
>> Hi, all,
>>
>> My objective is to split a dataframe named "cmbine" according to the value
>> of "classes". After the split, I will take the first instance from each
>> class and bin them into a new dataframe, "df1". In the 2nd iteration, I
>> will take the 2nd available instance and bin them into another new
>> dataframe, "df2".
>>
>>
>
> My apologies, I did not read the first lines of your question carefully. Say
> we split the data frame by class using by():
>
> byClass <- by( cmbine, cmbine[['classes']], function( df ){ return(df) } )
>
>
> We could then determine the maximum number of rows in all the returned data
> frames:
>
> maxRows <- max(sapply( byClass, nrow ))
>
>
> Then, I usually resort to a gratuitous application of lapply() and
> do.call():
>
> # Loop over each value between 1 and the maximum number of rows, return
> # results as a list.
> lapply( 1:maxRows, function(i){
>
>        # Loop over each data frame, extract the ith rows and rbind the
>        # results together.
>        ithRows <- do.call(rbind,lapply(byClass,function(df){
>
>          return( df[i,] )
>
>        }))
>
>        # Remove all NA rows
>        ithRows <- ithRows[ !is.na(ithRows[,1]), ]
>
>        return(ithRows)
>
> })
>
>
> [[1]]
>   names  mass classes
> 1  apple 5e-01       1
> 2  tiger 1e+02       2
> 3 pencil 1e-02       3
>
> [[2]]
>    names mass classes
> 1  banana 0.15       1
> 2 chicken 1.00       2
>
> [[3]]
>  names mass classes
> 1  pear  0.3       1
>
>
> There's definitely a more elegant way to do this, perhaps using some
> routines in the plyr package.
>
> Good luck!
>
> -Charlie
>
> -----
> Charlie Sharpsteen
> Undergraduate
> Environmental Resources Engineering
> Humboldt State University
>



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org



