[R] [OT] (slightly) - OpenOffice Calc and text files
marc_schwartz at me.com
Wed Oct 13 19:50:59 CEST 2010
On Oct 13, 2010, at 12:13 PM, Schwab,Wilhelm K wrote:
> Hello all,
> I had a very strange looking problem that turned out to be due to unexpected (by me at least) format changes to one of my data files. We have a small lab study in which each run is represented by a row in a tab-delimited file; each row identifies a repetition of the experiment and associates it with some subjective measurements and times from our notes that get used to index another file with lots of automatically collected data. In short, nothing shocking.
> In a moment of weakness, I opened the file in OpenOffice Calc (version 3.2, I think) to edit something that I had mangled when I first entered it, saved it (apparently the mistake), and reran my analysis code. The results were goofy, and the problem was in my code that runs before R ever sees the data. That code was confused by things that I would like to ensure don't happen again, and I suspect that some of you might have thoughts on it.
> The problems specifically:
> (1) OO seems to be a little stingy about producing tab-delimited text; there is stuff online about using the csv and editing the filter and folks (presumably like us) saying that it deserves to be a separate option.
> (2) Dates that I had formatted as YYYY got chopped to YY (did we not learn anything last time?<g>) and times that I had formatted as 24-hour ended up as AM/PM.
> Have any of you found a nice (or at least predictable) way to use OO Calc to edit files like this? If it insists on thinking for me, I wish it would think in 24 hour time and 4 digit years :) I work on Linux, so Excel is off the table, but another spreadsheet or text editor would be a viable option, as would configuration changes to Calc.
I don't use OpenOffice (soon to be LibreOffice) much these days, but one thing you can try: when you save the file as a CSV and edit the filter, there is an option there, "Save cell content as shown". If that is checked, then any cell formatting that has been applied, either by default or by your actions, will be retained in the exported data. If it is unchecked, then the 'raw' data is exported to the file. I just tried it here (on OSX): with the option checked, the years were exported with the default two digits; with the box unchecked, they were exported with four digits.
Unfortunately, I had no joy with a time field. The AM/PM formatting was retained with the box checked or unchecked.
From what I can tell from a quick search, these default formats are determined by the language/locale settings.
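If the AM/PM times cannot be avoided at export, one workaround is to convert them back to 24-hour text before the analysis code reads the file. The following is only a minimal sketch in Python: the function name to_24_hour and the two accepted layouts (with and without seconds) are assumptions for illustration, since the exact layout Calc writes depends on the locale.

```python
from datetime import datetime

# Assumed layouts: (12-hour input format, 24-hour output format).
# These are illustrative guesses; adjust them to what the file really contains.
TIME_LAYOUTS = (
    ("%I:%M:%S %p", "%H:%M:%S"),
    ("%I:%M %p", "%H:%M"),
)

def to_24_hour(value):
    """Return a 24-hour version of an AM/PM time string.

    Text that does not match one of the assumed AM/PM layouts
    (for example a time that is already 24-hour) is returned unchanged.
    """
    text = value.strip()
    for in_format, out_format in TIME_LAYOUTS:
        try:
            parsed = datetime.strptime(text, in_format)
        except ValueError:
            continue  # not this layout, try the next one
        return parsed.strftime(out_format)
    return value
```

For example, to_24_hour("01:30:00 PM") gives "13:30:00", while "14:20:00" passes through untouched.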
On Linux, a spreadsheet based alternative would be Gnumeric (http://projects.gnome.org/gnumeric/) and of course, there is always Emacs, which I have now used on Windows, Linux and OSX.
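Whichever editor is used, a small check run before the analysis could flag this kind of silent reformatting. Here is a rough sketch in Python, not a definitive tool: the function name find_format_problems, the date layout with separators such as 10/13/10, and the sample column names are all assumptions, because the original file's layout was not shown.

```python
import csv
import re

# Assumed patterns (hypothetical, adjust to the real file):
# a time followed by AM or PM, and a date whose last part has only two digits.
AM_PM_TIME = re.compile(r"\b\d{1,2}:\d{2}(:\d{2})?\s*[AP]M\b", re.IGNORECASE)
SHORT_YEAR_DATE = re.compile(r"^\d{1,2}[/.-]\d{1,2}[/.-]\d{2}$")

def find_format_problems(lines):
    """Scan tab-delimited lines and return a list of (row_number, message)."""
    problems = []
    expected_fields = None
    for row_number, row in enumerate(csv.reader(lines, delimiter="\t"), start=1):
        # The first row sets the expected number of fields; a different
        # count later suggests the delimiter was changed on save.
        if expected_fields is None:
            expected_fields = len(row)
        elif len(row) != expected_fields:
            problems.append((row_number, "unexpected field count: %d" % len(row)))
        for field in row:
            if AM_PM_TIME.search(field):
                problems.append((row_number, "AM/PM time: " + field))
            if SHORT_YEAR_DATE.match(field.strip()):
                problems.append((row_number, "two-digit year: " + field))
    return problems

# Made-up sample rows: the second data row shows both problems.
sample = [
    "run\tdate\tstart\n",
    "1\t10/13/2010\t14:05:00\n",
    "2\t10/13/10\t02:05:00 PM\n",
]
for row_number, message in find_format_problems(sample):
    print(row_number, message)
```

An empty result means none of the assumed patterns were found; it does not prove that the file is otherwise intact.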