[R] ddply function nesting problems

baptiste auguie baptiste.auguie at googlemail.com
Thu Nov 19 16:24:29 CET 2009


Hi,

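I think the problem is the calculation inside ".(  )" in your ddply
call. The expression is not evaluated where you write it but later,
inside ddply, where variables local to your function (min_range,
range_tmp, ...) are presumably not visible. Are you sure you need to do
this? Performing the cut before calling ddply seems to work fine: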

library(plyr)     # ddply() and .()
library(ggplot2)  # qplot(), facet_grid(); fullseq() is in the scales package in newer versions

determine_counts<-function()
{
        min_range<-1
        max_range<-30
        bin_range_size<-5
        Me_df<-data.frame(Data = c(1:15), Person = "Me")
        You_df<-data.frame(Data = c(10:20), Person = "You")
        Them_df<-data.frame(Data = c(15:25), Person = "Them")
        Group_df_tmp<-rbind(Me_df,You_df)
        Group_df<-rbind(Group_df_tmp,Them_df)
        Group_df$Person <- factor(Group_df$Person,
                                  levels = c("Them", "You", "Me"))

        # Compute the bins first, as an ordinary column of the data frame
        Group_df <- transform(Group_df,
                              cut=cut(Data,
                                      breaks=fullseq(range(c(Data, min_range, max_range)),
                                                     bin_range_size)))

        # ddply now only has to group by existing columns
        counts <- ddply(Group_df, .(cut, Person), nrow)

        names(counts) <- c("Bin", "Person", "Frequency")
        qplot(Person, Frequency, data = counts,
              fill = Person, geom="bar", stat="identity", width = 0.9,
              xlab="Person") +
                facet_grid(. ~ Bin)
}

function_nesting()   # the wrapper from the original post; it calls determine_counts()
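
The same idea (compute the bin as an ordinary column first, then count by existing columns) can also be sketched in base R alone, which makes it easy to check the counts without plyr or ggplot2. This is only an illustration under stated assumptions: count_by_bin is a made-up helper name, and the seq() call is a stand-in for fullseq(), assumed here to round the range outward to multiples of the bin size.

```r
# Hypothetical helper (not from the original post): bin a numeric column
# and count rows per bin and person using base R only.
count_by_bin <- function(df, min_range = 1, max_range = 30, bin_size = 5) {
  # Extend the data range so it always covers [min_range, max_range].
  rng <- range(c(df$Data, min_range, max_range))

  # Breaks on multiples of bin_size that cover rng
  # (an assumed approximation of what fullseq() returns).
  breaks <- seq(floor(rng[1] / bin_size) * bin_size,
                ceiling(rng[2] / bin_size) * bin_size,
                by = bin_size)

  # Step 1: compute the grouping variable as a plain column.
  df$Bin <- cut(df$Data, breaks = breaks)

  # Step 2: count every Bin x Person combination, then drop empty cells.
  counts <- as.data.frame(table(Bin = df$Bin, Person = df$Person),
                          responseName = "Frequency")
  counts[counts$Frequency > 0, ]
}

# Same toy data as in the thread.
Group_df <- rbind(
  data.frame(Data = 1:15,  Person = "Me"),
  data.frame(Data = 10:20, Person = "You"),
  data.frame(Data = 15:25, Person = "Them")
)

counts <- count_by_bin(Group_df)
print(counts)
```

Because the binning happens on a named column before any grouping call, nothing inside the grouping step has to look up min_range, max_range or the bin size, so it does not matter how deeply the function is nested.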

HTH,

baptiste


2009/11/19 Jason Rupert <jasonkrupert at yahoo.com>:
> While putting my R code into functions, I've encountered a ddply function nesting issue and need a bit of advice on the proper way to fix it.  I've tried several approaches, but none of them worked, and I need to be able to use the "cut", "range", and "fullseq" functions within ddply.  (For some of the background, refer to http://finzi.psych.upenn.edu/Rhelp08/2009-February/187331.html)
>
> Thus, to preserve that functionality and still put my code within functions, I need an architecture similar to the following, where you end up running:
> function_nesting()
>
> Unfortunately this produces errors within ddply: it does not appear to recognize variables or functions used inside the ".( )" expression when the call is made from within a function.
>
> Thank you for any advice about how to proceed.
>
>
> determine_counts<-function()
> {
>
>         min_range<-1
>         max_range<-30
>         bin_range_size<-5
>         Me_df<-data.frame(Data = c(1:15), Person = "Me")
>         You_df<-data.frame(Data = c(10:20), Person = "You")
>         Them_df<-data.frame(Data = c(15:25), Person = "Them")
>         Group_df_tmp<-rbind(Me_df,You_df)
>         Group_df<-rbind(Group_df_tmp,Them_df)
>         Group_df$Person <- factor(Group_df$Person, levels = c("Them", "You", "Me"))
>         #counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range(Data), 5)), Person), nrow)
>
>          # Approach 1
>         counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range(c(Group_df$Data, min_range, max_range)), bin_range_size)), Person), nrow)
>
>         # Approach 2
>         range_tmp<-range(c(Group_df$Data, min_range, max_range))
>         counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range_tmp, bin_range_size)), Person), nrow)
>
>
>         names(counts) <- c("Bin", "Person", "Frequency")
>         qplot(Person, Frequency, data = counts, fill = Person, geom="bar", stat="identity", width = 0.9, xlab="Person") +  facet_grid(. ~ Bin)
> }
>
>
>
>
> function_nesting<-function()
> {
>         determine_counts()
> }
>
>
>
> However, if the code is just run straight through without being nested it works fine:
>
>         min_range<-1
>         max_range<-30
>         bin_range_size<-5
>         Me_df<-data.frame(Data = c(1:15), Person = "Me")
>         You_df<-data.frame(Data = c(10:20), Person = "You")
>         Them_df<-data.frame(Data = c(15:25), Person = "Them")
>         Group_df_tmp<-rbind(Me_df,You_df)
>         Group_df<-rbind(Group_df_tmp,Them_df)
>         Group_df$Person <- factor(Group_df$Person, levels = c("Them", "You", "Me"))
>         #counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range(Data), 5)), Person), nrow)
>         counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range(c(Group_df$Data, min_range, max_range)), bin_range_size)), Person), nrow)
>
> Unfortunately this is not within a function, so thanks again for any advice on how to approach this issue.