[R] Mystery Error in midnightStandard
Yohan Chalabi
chalabi at phys.ethz.ch
Wed Jan 28 16:28:28 CET 2009
>>>> "TB" == Ted Byers <r.ted.byers at gmail.com>
>>>> on Wed, 28 Jan 2009 09:30:58 -0500
TB> It is certain that all entries have the same format, but I'm starting
TB> to think that the error message is something of a red herring.
TB> Consider this:
TB>
TB> > year = 2009
TB> > week = 0
TB> > day = 3
TB> > datestr = sprintf("%i-%i-%i",year,week,day);datestr
TB> [1] "2009-0-3"
TB> > date1 = timeDate(datestr, format = "%Y-%U-%w");
TB> > date1
TB> GMT
TB> [1] [NA]
TB> > day = 4
TB> > datestr = sprintf("%i-%i-%i",year,week,day);datestr
TB> [1] "2009-0-4"
TB> > date1 = timeDate(datestr, format = "%Y-%U-%w");
TB> > date1
TB> GMT
TB> [1] [2009-01-01]
TB> >
TB> > datestr = sprintf("%i-%i-%i",year,week,3);datestr
TB> [1] "2009-0-3"
TB> > date2 = timeDate(datestr, format = "%Y-%U-%w");date2
TB> GMT
TB> [1] [NA]
TB> > difftimeDate(date2,date1, units = "weeks")
TB> Error in midnightStandard(charvec, format) :
TB> 'charvec' has non-NA entries of different number of characters
TB> In addition: Warning messages:
TB> 1: In min(x) : no non-missing arguments to min; returning Inf
TB> 2: In max(x) : no non-missing arguments to max; returning -Inf
TB>
TB>
TB>
TB> The first values for year, week and day are the values on which my
TB> loop dies. It returns 'NA' here. It seems clear that it is returning
TB> NA because the date that data corresponds to is 2008-12-31.
TB>
TB> The error is being produced by difftimeDate rather than timeDate (as
TB> shown by the above session). But that represents a flaw in the
TB> function design.

This is not a flaw in 'timeDate'. It behaves the same way as
'as.POSIXct'. Compare:

strptime(datestr, format = "%Y-%U-%w")

Instead of claiming that there is a flaw in the function, you could have
suggested an 'is.na' method for 'timeDate'.
I will add an 'is.na' method in the dev version of 'timeDate'.
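
Until such a method is available, one possible workaround is to parse
the string with base R, which already has an 'is.na' method for
'POSIXct', and substitute a default when parsing fails. This is only a
sketch: the fallback to the first day of the year is an assumption made
for illustration, not something 'timeDate' does, and the variable names
are hypothetical.

```r
# Same year/week/day values and format string as in the quoted session.
year <- 2009L
datestr <- sprintf("%i-%i-%i", year, 0L, 3L)

# Parse with base R; an unparseable or impossible date yields NA.
parsed <- as.POSIXct(strptime(datestr, format = "%Y-%U-%w", tz = "GMT"))

# Assumed default (hypothetical choice): the first day of the same year.
fallback <- as.POSIXct(sprintf("%i-01-01", year), tz = "GMT")

# Replace a missing date before it reaches any date arithmetic.
if (is.na(parsed)) {
  parsed <- fallback
}
```

Whatever the parser returns, 'parsed' is a non-missing 'POSIXct' value
after the check, so a later difference in weeks cannot fail on an NA.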
regards,
Yohan
TB> It should fail when taking the elapsed time between a null and the
TB> present, but if I wrote such a function, I'd have it return null
TB> (perhaps with a warning) rather than just die.
TB>
TB> A bigger issue is that timeDate ought never give null here (which is
TB> what I assume 'NA' means), since all the data comes from transaction
TB> data with real dates, so the elapsed time, measured in weeks, ought
TB> to always be a valid, non-negative real number. I have not yet come
TB> to any conclusions as to how it ought to behave (whether to return
TB> New Year's Day, along with a warning, or to return the date
TB> requested by reinvoking itself with the year and week adjusted so a
TB> valid date is returned).
TB>
TB> On a practical side, how would I test date2 to see if it is null, so
TB> I can give it a sensible default value?
TB>
TB> A more troubling thought is that with this handling of dates in this
TB> combination of SQL (my group by clause uses
TB> YEAR(transaction_date),WEEK(transaction_date)) to get the data and R
TB> to process it, the week containing New Year's Day will ALWAYS be
TB> split in two at the first second of the new year. I'm going to have
TB> to either figure out a way to correct this, or ignore it (as it
TB> doesn't actually make things wrong, but rather it splits a sample
TB> into two unequal parts).
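
One way to avoid the split at the year boundary, sketched here under
assumptions, is to group by the date on which each week starts instead
of by a (year, week) pair. The transaction dates below are made up for
illustration, and weeks are assumed to start on Sunday, matching the
'%U'/'%w' convention used above.

```r
# Hypothetical transaction dates on both sides of the year boundary.
dates <- as.Date(c("2008-12-29", "2008-12-31", "2009-01-01", "2009-01-03"))

# Day of the week as an integer: 0 = Sunday, ..., 6 = Saturday.
dow <- as.integer(format(dates, "%w"))

# Step back to the Sunday that starts each date's week.
week_start <- dates - dow

# Elapsed whole weeks relative to the earliest week in the data.
weeks_elapsed <- as.numeric(difftime(week_start, min(week_start),
                                     units = "weeks"))
```

Because 'week_start' is an ordinary 'Date', all four dates fall in the
same bin (the week starting 2008-12-28) and the elapsed time in weeks is
never missing.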
--
PhD student
Swiss Federal Institute of Technology
Zurich
www.ethz.ch