[R] Reading large files with R

Martin Møller Skarbiniks Pedersen tr@xp|@yer @end|ng |rom gm@||@com
Sun Sep 1 23:53:47 CEST 2019


On Sun, 1 Sep 2019 at 21:53, Duncan Murdoch <murdoch.duncan using gmail.com>
wrote:

> On 01/09/2019 3:06 p.m., Martin Møller Skarbiniks Pedersen wrote:
> > Hi,
> >
> >    I am trying to read a YAML file which is not so large (7 GB) and I have
> > plenty of memory.
>
>

> Individual elements in character vectors have a size limit of 2^31-1.
> The read_yaml() function is putting the whole file into one element, and
> that's failing.
>
>
Oh, I didn't know that. But OK, why would anyone create
a single character vector so big ...
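
For a quick check of whether a file could even fit into one string, here is
a minimal sketch. The helper name fits_in_one_string is made up for
illustration; it only compares the file size in bytes with the 2^31-1 limit
mentioned above, which R exposes as .Machine$integer.max.

```r
# Hypothetical helper (not part of any package): TRUE if the file is
# small enough to be stored in a single element of a character vector.
fits_in_one_string <- function(path) {
  # file.size() returns a double, so sizes above 2^31 - 1 are reported fine.
  file.size(path) <= .Machine$integer.max  # 2^31 - 1 bytes
}
```

A 7 GB file is about 7 * 1024^3 bytes, well above that limit, so this
check would return FALSE for my file.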

> You probably have a couple of choices:
>
>   - Rewrite read_yaml() so it doesn't try to do that.  This is likely
> hard, because most of the work is being done by a C routine, but it's
> conceivable you could use the stringi::stri_read_raw function to do the
> reading, and convince the C routine to handle the raw value instead of a
> character value.
>

I actually might do that in the future.

>   - Find a way to split up your file into smaller pieces.
>

Yes, that will be my first solution. Most YAML is easier to parse without
pasting all lines together (crazy!)
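
Roughly what I have in mind, as a sketch only. It assumes the file is a
stream of several YAML documents separated by lines containing just "---"
(I have not checked that this holds for every file), and the function name
for_each_yaml_doc and its arguments are made up for illustration. It reads
the file in chunks of lines and hands each document to a callback, so the
whole file never ends up in one string.

```r
# Hypothetical helper: walk through a multi-document YAML file and call
# `handle` with the lines (a character vector) of each document.
# Assumption: documents are separated by lines that contain only "---".
for_each_yaml_doc <- function(path, handle, chunk_lines = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  doc <- character(0)
  repeat {
    # Read the next chunk; an empty result means end of file.
    lines <- readLines(con, n = chunk_lines, warn = FALSE)
    if (length(lines) == 0L) break
    for (line in lines) {
      if (grepl("^---[[:space:]]*$", line)) {
        # Separator line: pass on the finished document, start a new one.
        if (length(doc) > 0L) handle(doc)
        doc <- character(0)
      } else {
        doc <- c(doc, line)
      }
    }
  }
  # The last document has no trailing separator.
  if (length(doc) > 0L) handle(doc)
  invisible(NULL)
}
```

Each piece could then be parsed on its own, for example with
yaml::yaml.load(paste(doc, collapse = "\n")) inside the callback. Growing
doc with c() is slow for very long documents, so this is a starting point
and not a tuned solution.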


> Duncan Murdoch
>

Thanks for pointing me in the right direction.

/Martin
