[R] data frame manipulation with zero rows
arnaud Gaboury
arnaud.gaboury at gmail.com
Tue Jun 1 17:01:56 CEST 2010
It is indeed ddply() from package plyr.
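
For reference, here is a minimal base-R sketch of the point made in the quoted reply below about 1:nrow(data) versus seq_len(nrow(data)). It does not use plyr. The data frame `empty` and the variable names are made up for illustration, and the tapply() call is only an assumption about how the error quoted below could arise, not the actual plyr source.

```r
# Hypothetical zero-row data frame with columns like those of 'futures'
empty <- data.frame(DESCRIPTION = character(0),
                    QUANTITY    = numeric(0),
                    SETTLEMENT  = character(0),
                    stringsAsFactors = FALSE)

n <- nrow(empty)            # zero rows

idx_colon <- 1:n            # 1:0 counts down, giving the two values 1 and 0
idx_seq   <- seq_len(n)     # an empty integer vector

length(idx_colon)           # 2
length(idx_seq)             # 0

# A loop over 1:n runs twice on an empty data frame,
# while a loop over seq_len(n) does not run at all.
runs_colon <- 0
for (i in 1:n) runs_colon <- runs_colon + 1
runs_seq <- 0
for (i in seq_len(n)) runs_seq <- runs_seq + 1

# tapply() requires the data and the grouping vector to have the same length.
# Here the data has length 2 and the grouping vector length 0, so it fails.
res <- tryCatch(
  tapply(1:n, list(empty$DESCRIPTION), length),
  error = function(e) conditionMessage(e)
)
res
```

So on a zero-row data frame 1:nrow(data) silently produces an index of length 2, whereas seq_len(nrow(data)) produces an index of length 0, which matches the number of rows.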
> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> Sent: Tuesday, June 01, 2010 12:24 PM
> To: Peter Ehlers
> Cc: arnaud Gaboury; r-help at r-project.org
> Subject: Re: [R] data frame manipulation with zero rows
>
> On Tue, 1 Jun 2010, Peter Ehlers wrote:
>
> > On 2010-06-01 1:53, arnaud Gaboury wrote:
> >> Brian,
> >>
> >> If I understand correctly, I must use something other than ddply()
> >> in my function if I want to avoid an error each time my df has zero
> >> rows. Am I correct?
> >>
> >
> > You could define a function to handle the zero-rows case:
> >
> > f <- function(x){
> >   if(nrow(x) < 1) out <- x[, c(1,3,2)] # or whatever
> >   else
> >     out <- ddply(x, c("DESCRIPTION","SETTLEMENT"), summarise,
> >                  POSITION=sum(QUANTITY))[,c(1,3,2)]
> >   out
> > }
> > f(futures)
>
> Or simply fix ddply. We don't know what that is or what it should do
> for the case of zero rows: it may or may not be the one in package
> plyr.
>
> >
> > -Peter Ehlers
> >
> >>
> >>
> >>> -----Original Message-----
> >>> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> >>> Sent: Tuesday, June 01, 2010 9:47 AM
> >>> To: arnaud Gaboury
> >>> Subject: Re: [R] data frame manipulation with zero rows
> >>>
> >>> On Tue, 1 Jun 2010, arnaud Gaboury wrote:
> >>>
> >>>> Dear group,
> >>>>
> >>>> Here is the kind of data.frame I obtain every day with my function:
> >>>>
> >>>> futures<-
> >>>> structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10",
> >>>> "CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10",
> >>>> "LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10",
> >>>> "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10"
> >>>> ), CREATED.DATE = structure(c(18403, 18406, 18406, 18406, 18406,
> >>>> 18407, 18408, 18406, 18407, 18407, 18407, 18407), class = "Date"),
> >>>> QUANTITY = c(1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1), SETTLEMENT =
> >>>> c("373.2500",
> >>>> "373.2500", "373.2500", "373.2500", "373.2500", "90.7750",
> >>>> "90.7750", "14.9200", "14.9200", "14.9200", "14.9200", "14.9200"
> >>>> )), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY",
> >>>> "SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame")
> >>>>
> >>>> I then need to apply the following line of code to the df:
> >>>>
> >>>>> PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
> >>>> POSITION=sum(QUANTITY))[,c(1,3,2)]
> >>>>
> >>>> It works perfectly in most cases, BUT I have a new problem: it can
> >>>> sometimes happen that my df "futures" is empty, with zero rows.
> >>>>
> >>>>
> >>>> futures<-
> >>>> structure(list(DESCRIPTION = character(0), CREATED.DATE =
> >>>> structure(numeric(0), class = "Date"),
> >>>> QUANTITY = numeric(0), SETTLEMENT = character(0)), .Names =
> >>>> c("DESCRIPTION",
> >>>> "CREATED.DATE", "QUANTITY", "SETTLEMENT"), row.names = integer(0),
> >>>> class = "data.frame")
> >>>>
> >>>> It is not the usual case, but it can happen. With this df, when I
> >>>> run the above-mentioned line, I get an error:
> >>>>
> >>>>> PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
> >>>> POSITION=sum(QUANTITY))[,c(1,3,2)]
> >>>> Error in tapply(1:nrow(data), splitv, list) :
> >>>> arguments must have same length
> >>>>
> >>>>
> >>>> How can I avoid this when my df is empty?
> >>>
> >>> Ask the author of the (missing) function ddply() to correct the error
> >>> of using 1:nrow(data) by replacing it by seq_len(nrow(data)).
> >>>
> >>> It's helpful to give example code, but much more helpful if you test
> >>> it: yours cannot work without the function ddply() -- this is what
> >>> 'self-contained' means in the footer here.
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595