[Rd] [RFC] readtable enhancement

Ben Bolker
Wed Mar 27 22:33:55 CET 2019


   Just to clarify/amplify: on the bug tracking system there's a
drop-down menu to specify severity, and "enhancement" is one of the
choices, so you don't have to worry that you're misrepresenting your
patch as fixing a bug.

  The fact that an R-core member (Michael Lawrence) thinks this is
worth looking at is very encouraging (and somewhat unusual for
feature/enhancement suggestions)!

  Ben Bolker

On Wed, Mar 27, 2019 at 5:29 PM Michael Lawrence via R-devel
<r-devel using r-project.org> wrote:
>
> This has some nice properties:
>
> 1) It self-documents the input expectations in a similar manner to
> colClasses.
> 2) The implementation could eventually "push down" the coercion, e.g.,
> calling it on each chunk of an iterative read operation.
>
> The implementation needs work though, and I'm not convinced that coercion
> failures should fall back gracefully to the default.
>
> Feature requests fall under a "bug" in bugzilla terminology, so please
> submit this there. I think I've made you an account.
>
> Thanks,
> Michael
>
> On Wed, Mar 27, 2019 at 1:19 PM Kurt Van Dijck <
> dev.kurt using vandijck-laurijssen.be> wrote:
>
> > Thank you for your answers.
> > I would rather not file a new bug, since what I coded isn't really a bug.
> >
> > The problem my colleagues and I have today is very stupid:
> > We read .csv files with a lot of columns, most of which contain
> > date-time stamps, coded as DD/MM/YYYY HH:MM.
> > This is not exotic, but the base library's read.table (and its
> > derivatives) only accepts date-times in a limited number of formats
> > (which I understand very well).
> >
> > We could specify a format for each column individually, in a rather
> > complicated way, but this syntax is rather difficult to maintain.
> >
> > My solution to this specific problem became a trivial, yet generic,
> > extension to read.table.
> > Rather than relying on the built-in type detection, I added a parameter
> > that takes a function, which is called for each column to be
> > type-probed, so I can override the limited built-in default.
> > If the function returns nothing, the built-in default is still
> > used.
> >
> > This way, I could construct a type-probing function that is
> > straightforward, not hard to code, and keeps the code for reading my
> > .csv files (the read.table parameters) acceptable.
> >
> > I'm sure I'm not the only one dealing with such needs, especially
> > since date-time formats exist in enormous numbers, but I want to stress
> > that my approach is not tied to my specific problem.
> >
> > For those asking to 'show me the code', I refer you to my 2nd patch,
> > where the tests have been extended with my specific problem.
> >
> > What are your opinions about this?
> >
> > Kind regards,
> > Kurt
> >
> > ______________________________________________
> > R-devel using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>


