[R] diff-ing .rds files

Will Hopper wjhopper510 at gmail.com
Mon Nov 23 21:18:57 CET 2015


Thanks for the input, Jeff and Burt. I could have been clearer about
what I was looking for.

You are of course right that there is nothing preventing me from using
plain text files; I initially went with compressed binary files because
they were, of course, smaller.

What I was curious about is whether there is any existing
program/script/function/tool that converts an .rds file to plain text, so
that I could point git at it when creating the change set for a commit. For
example, there are programs that convert PDF documents to plain text, and
you can configure git to use these programs to convert the PDFs to plain
text before comparing the current version with the version in the last
commit, and then only add the things that have changed to the new commit.
This allows you to track changes to a PDF file the same way you would a
text file, and your deltas don't blow up by committing the entire PDF file
each time you change part of the file.
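
For reference, the PDF setup looks roughly like the sketch below. It
assumes pdftotext (from poppler-utils or Xpdf) is installed, builds a
scratch repository so nothing real is touched, and uses "pdf" as an
arbitrary driver name. One caveat: the textconv setting controls how diffs
are displayed, and git still stores the binary file itself.

```shell
# Scratch repository for trying this out; in a real project, run only the
# last two commands from the top level of the existing repository.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Route every PDF through a diff driver named "pdf" (the name is arbitrary).
printf '%s\n' '*.pdf diff=pdf' >> .gitattributes

# git appends the file path to the textconv command. pdftotext needs "-"
# after the path to write to stdout, hence the small sh wrapper.
git config diff.pdf.textconv "sh -c 'pdftotext \"\$0\" -'"
```

With this in place, git diff on a changed PDF compares the converted text
instead of reporting that the binary files differ.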

It would be simple enough to write something that does the same thing for
.rds files with some combination of a shell script and an R script, but I
wondered whether there was something out there already that I was
overlooking.
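
A minimal sketch of what that could look like, assuming Rscript is on the
PATH. The driver name "rds" is arbitrary, and dput() is only one possible
choice of text representation; it will not be very informative for objects
such as environments or external pointers.

```shell
# Scratch repository for trying this out; in a real project, run only the
# last two commands from the top level of the existing repository.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Route every .rds file through a diff driver named "rds".
printf '%s\n' '*.rds diff=rds' >> .gitattributes

# git appends the file path to the textconv command; Rscript exposes it via
# commandArgs(TRUE), and dput() writes a text form of the object to stdout.
git config diff.rds.textconv 'Rscript -e "dput(readRDS(commandArgs(TRUE)[1]))"'
```

After this, git diff on a changed .rds file should show a line-based diff
of the dput() output. This only changes how diffs are displayed; the
compressed .rds file is still what gets committed.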

I do see the point that I may be trying to compensate for a flawed premise;
perhaps I should not use a .rds file here if I'm concerned about the repo
getting too big as time goes on. I was just hoping to have my cake and eat
it too.

- Will

On Mon, Nov 23, 2015 at 1:33 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
wrote:

> You are sending contradictory signals... do you or do you not want plain
> text files? You have not indicated what advantage you are getting from
> having binary files in your repository. By far the best answer I see to
> your dilemma is to save in ASCII format instead of the default binary
> format.
>
> On November 23, 2015 9:36:21 AM PST, Will Hopper <wjhopper510 at gmail.com>
> wrote:
>
>> Hi all,
>>
>> I'm posting to see if anyone knows of any existing resources that
>> auto-magically convert R objects saved in .rds files to a plain text
>> representation, suitable for diffing.
>>
>> I often save the results of long-running calculations as .rds files, and
>> since I use git for source control, it would be nice if there were a way to
>> convert .rds files to a plain text representation for diffing, so I could
>> avoid having large commits full of binary data. Git allows you to specify a
>> program for binary --> text conversion for any file type, effectively
>> teaching git how to diff binary files. If there is something out there
>> developed to do this with .rds files, I would really like to know about it!
>>
>> I realize I could save the rds file with ascii=TRUE and compress=FALSE, but
>> that kind of defeats the point of saving as .rds in the first place.
>>
>> If there is no tool out there that anyone knows of, I don't think it would
>> be too hard for me to write something with bash + Rscript to get the job
>> done, but I'd like to avoid re-inventing the wheel if possible.
>>
>> Thanks for the help!
>>
>> ------------------------------
>>
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>