[Rd] [RFC] readtable enhancement
Kurt Van Dijck
dev.kurt using vandijck-laurijssen.be
Thu Mar 28 06:33:18 CET 2019
Hey,
In the meantime, I submitted a bug. Thanks for the assistance on that.
> and I'm not convinced that
> coercion failures should fall back gracefully to the default.
The graceful fallback:
- makes the code more complex
+ keeps colConvert implementations limited
+ requires the user to only implement what changed from the default
+ seemed to me the smallest overall effort
In my opinion, graceful fallback makes the thing better,
but even without it the colConvert parameter remains useful: it would
still fill a gap.
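To make the fallback semantics concrete, here is a minimal sketch. It is not the code from the patch: convert_with_fallback and dmy_hm are hypothetical names, and I assume colConvert receives one character column and returns either the converted vector or NULL to decline.

```r
# Hypothetical helper, NOT the code from the patch: apply a user-supplied
# converter to one character column and fall back to the built-in type
# detection when the converter declines (returns NULL) or fails.
convert_with_fallback <- function(x, colConvert = NULL) {
  if (is.function(colConvert)) {
    res <- tryCatch(colConvert(x), error = function(e) NULL)
    if (!is.null(res)) return(res)
  }
  type.convert(x, as.is = TRUE)  # built-in default
}

# Example converter (hypothetical): handles only DD/MM/YYYY HH:MM and
# declines every other column by returning NULL.
dmy_hm <- function(x) {
  pattern <- "^[0-9]{2}/[0-9]{2}/[0-9]{4} [0-9]{2}:[0-9]{2}$"
  ok <- is.na(x) | grepl(pattern, x)
  if (all(is.na(x)) || !all(ok)) return(NULL)
  as.POSIXct(x, format = "%d/%m/%Y %H:%M", tz = "UTC")
}

stamps <- convert_with_fallback(c("28/03/2019 06:33", "27/03/2019 14:28"), dmy_hm)
counts <- convert_with_fallback(c("1", "2", "3"), dmy_hm)
```

Here stamps comes from the converter, while counts is left to the default detection because dmy_hm declined. Without the fallback, dmy_hm would have to handle every column type itself.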
> The implementation needs work though,
Other than removing the graceful fallback?
Kind regards,
Kurt
On Wed, 27 Mar 2019 14:28:25 -0700, Michael Lawrence wrote:
> This has some nice properties:
> 1) It self-documents the input expectations in a similar manner to
> colClasses.
> 2) The implementation could eventually "push down" the coercion, e.g.,
> calling it on each chunk of an iterative read operation.
> The implementation needs work though, and I'm not convinced that
> coercion failures should fall back gracefully to the default.
> Feature requests fall under a "bug" in bugzilla terminology, so please
> submit this there. I think I've made you an account.
> Thanks,
> Michael
>
> On Wed, Mar 27, 2019 at 1:19 PM Kurt Van Dijck
> <dev.kurt using vandijck-laurijssen.be> wrote:
>
> Thank you for your answers.
> I'd rather not file a new bug, since what I coded isn't really a
> bug.
> The problem I (and my colleagues) have today is very stupid:
> We read .csv files with a lot of columns, most of which contain
> date-time stamps coded as DD/MM/YYYY HH:MM.
> This is not exotic, but the base library's read.table (and its
> derivatives) only accepts date-times in a limited number of formats
> (which I understand very well).
> We could specify a format for each column individually, but that
> syntax is rather complicated and difficult to maintain.
> My solution to this specific problem became a trivial, yet generic,
> extension to read.table.
> Rather than relying on the built-in type detection, I added a
> parameter that takes a function, which is called for each
> to-be-type-probed column so I can overrule the limited built-in
> default.
> If the function returns nothing, the built-in default is still used.
> This way, I could construct a type-probing function that is
> straightforward, not hard to code, and makes reading my .csv files
> acceptable in terms of code (read.table parameters).
> I'm sure I'm not the only one with such needs, especially since
> date-time formats exist in enormous numbers, but I want to stress
> that my approach is agnostic to my specific problem.
> For those asking to 'show me the code', I refer to my 2nd patch,
> where the tests have been extended with my specific problem.
> What are your opinions about this?
> Kind regards,
> Kurt
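
For reference, one way to handle the DD/MM/YYYY HH:MM columns with base R as it stands is to register a coercion and name it per column in colClasses. This is only a sketch of that kind of per-column workaround, and I am assuming it is roughly what the quoted message means by specifying a format for each column; the class name dmyHM and the sample data are made up for illustration.

```r
library(methods)

# "dmyHM" is a made-up virtual class, used only as a colClasses label.
setClass("dmyHM")
setAs("character", "dmyHM", function(from)
  as.POSIXct(from, format = "%d/%m/%Y %H:%M", tz = "UTC"))

# Made-up sample data with two date-time columns.
csv <- "id,start,end
1,28/03/2019 06:33,28/03/2019 07:00
2,27/03/2019 14:28,27/03/2019 15:10"

# Every date-time column has to be listed explicitly, which is the
# part that becomes hard to maintain with many columns.
df <- read.csv(text = csv, colClasses = c("integer", "dmyHM", "dmyHM"))
```

With many such columns the colClasses vector has to track the file layout exactly, which is the maintenance burden a per-column converter function would avoid.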