[R] R and Matlab
Henrik Bengtsson
hb at stat.berkeley.edu
Sat Oct 30 02:10:49 CEST 2010
Hi.
On Fri, Oct 29, 2010 at 3:31 AM, Claudia Beleites <cbeleites at units.it> wrote:
> Dear Henrik,
>
> sorry for bothering you with a report hastily pasted together and not
> particularly nice for you as I used my toy data flu from a non-standard
> package. I should rather have used e.g. the iris data.
>
> I'm aware that writeMat doesn't deal with S4 objects. In fact, even if I
> overlooked the error message, there is a second chance to notice the
> problem: the file size is 0 B.
Yes, it is an unfortunate side effect that if there is an error while
writing the MAT file, it gives a corrupt *.mat file. I'll put it on
the "to do in the future" list to write to a temporary file (e.g.
*.mat.tmp) which is only renamed to *.mat when writeMat() returns
successfully. I use that "trick" elsewhere and it has saved us a few
times.
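
A minimal sketch of that temporary-file "trick", for illustration only.
The helper writeAtomically() and its 'writer' argument are hypothetical
names and not part of R.matlab; the sketch assumes the writer takes the
output pathname as its only argument.

```r
# Hypothetical helper (not part of R.matlab): write to <pathname>.tmp and
# rename it to <pathname> only after the writer returned without error.
writeAtomically <- function(pathname, writer) {
  pathnameT <- paste(pathname, ".tmp", sep = "")
  # Clean up the temporary file if we exit before it has been renamed
  on.exit(if (file.exists(pathnameT)) file.remove(pathnameT))
  writer(pathnameT)  # may throw an error
  if (!file.rename(pathnameT, pathname)) {
    stop("Failed to rename temporary file: ", pathnameT)
  }
  invisible(pathname)
}

# A toy writer that succeeds; a call to writeMat() could be wrapped the same way
pathname <- file.path(tempdir(), "foo.mat")
writeAtomically(pathname, function(f) writeLines("dummy", con = f))
ok <- file.exists(pathname)

# A writer that fails halfway leaves neither bar.mat nor bar.mat.tmp behind
pathname2 <- file.path(tempdir(), "bar.mat")
res <- try(writeAtomically(pathname2, function(f) {
  writeLines("partial", con = f)
  stop("simulated write error")
}), silent = TRUE)
```

With this pattern, a file named *.mat either does not exist or was
written completely.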
> In fact the attempt to save flu directly was a classical "autopilot" error,
> that's why I tried to save the x afterwards.
>
> So the problem here was the unnamed storing of x.
>
>> I intentionally do not try to "infer" the name "x" from
>> writeMat("flu.mat", x), basically because I think using substitute()
>> should be avoided as far as possible, but also because it is unclear
>> what the name should be in cases such as writeMat("flu.mat", 1:10).
>
> I was just going to suggest a patch that assigns the names of type Vnumber
> to the unnamed objects - but when I wanted to get the source I realized your
> version with the warning is already out.
>
> I think, however, you may have forgotten an nchar?: any (nchar (names) == 0)
Thanks for spotting that; a merge of (nchar (names) == 0) and (names
== "") incorrectly became (names == 0). Corrected in the next release.
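
To illustrate why the merged test never triggered, here is a small
example; the vector 'nms' is made up for the illustration.

```r
nms <- c("x", "", "y")

# The buggy test: 0 is coerced to the string "0", so an empty name
# never matches
buggy <- (nms == 0)

# The two equivalent, correct tests for empty names
byNchar <- (nchar(nms) == 0L)
byEmpty <- (nms == "")

# With no names at all, names() returns NULL and any() on the empty
# result is FALSE, which is why a separate is.null() test is needed
noNames <- any(nchar(NULL) == 0L)
```

Here 'buggy' is FALSE for all three elements, whereas 'byNchar' and
'byEmpty' both flag the second element.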
>
> So here's my suggestion for l. 775-777 of writeMat.R:
>
> if (is.null(names) || any (nchar (names) == 0L)) {
>   names [nchar (names) == 0L] <- paste ("V", which (nchar (names) == 0L),
>                                         sep = "")
>   names (args) <- names
>   warning("All objects written have to be named, e.g. use ",
>           "writeMat(..., x=a, y=y) and not writeMat(..., x=a, y): ",
>           deparse(sys.call()), "\nDummy names have been assigned.");
> }
I did think about that, however, it may introduce other ambiguities.
For instance, consider
  writeMat("foo.mat", V2=1, 2);
Then there will be two "V2" names. The analogue to read.table() or
data.frame() is to add ".1" etc when there is a clash, e.g. "V2" and
"V2.1". However, "V2.1" is not a valid name in Matlab. What should
then be done? Of course, you can try to make sure you generate valid
Matlab names.
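
The clash can be reproduced without writeMat(). The sketch below mimics
the proposed dummy naming on a plain list; using make.unique() with
sep = "_" is only one possible way to avoid the "." and is an
assumption of this example, not something writeMat() does.

```r
args <- list(V2 = 1, 2)
nms <- names(args)                    # "V2" and ""

# Assign positional dummy names to the unnamed elements
empty <- (nchar(nms) == 0L)
nms[empty] <- paste("V", which(empty), sep = "")
clash <- nms                          # "V2" "V2"

# The data.frame()-style fix introduces a "."
dotted <- make.unique(nms)            # "V2" "V2.1"

# A different separator avoids the "."
underscored <- make.unique(nms, sep = "_")   # "V2" "V2_1"
```

Even then, the user ends up with a variable name that was never
written in the call.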
On a related matter, today you can do writeMat("foo.mat", V2=1,
"V2.1"=2) and writeMat() gives no warning/error, and the file is
read correctly by readMat(). However, in Matlab you get
  >> load('foo.mat')
  ??? Error using ==> load
  Invalid field name: 'V2.1'.
If anyone knows a regular expression for testing the validity of names
such that they are valid Matlab variable/field names, please let me
know and I can add additional sanity checks in writeMat().
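
As a starting point, here is a rough sketch of such a test. It assumes
that a valid Matlab name starts with a letter, continues with letters,
digits or underscores, and is at most 63 characters long (the usual
value of namelengthmax). The function name isValidMatlabName() is
hypothetical, and reserved Matlab keywords are not checked.

```r
# Hypothetical helper (not part of R.matlab)
isValidMatlabName <- function(x, maxLength = 63L) {
  # A letter first, then only letters, digits and underscores
  grepl("^[A-Za-z][A-Za-z0-9_]*$", x) & (nchar(x) <= maxLength)
}

valid <- isValidMatlabName(c("V2", "V2.1", "x_1", "_x", "2x", ""))
```

In this example only "V2" and "x_1" pass the test.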
Also, as your initial example indicates, it could be surprising that
writeMat("foo.mat", x) would become writeMat("foo.mat", V1=x) and
not writeMat("foo.mat", x=x).
After further investigation, I actually think that although Matlab
can indeed read non-named objects using data=load('foo.mat'), they are
not accessible afterwards. So I was wrong. Because of this, I have
bumped the warning up to an error, which prevents non-named objects
from being written. This will be the case in the next release of the
package.
I will postpone adding any bells and whistles trying to make
writeMat() smart such as adding names. As soon as you do that you
introduce other issues and expectations and have to worry about
backward compatibilities if it turns out to be a bad idea. My
strategy for now is to have writeMat() assert that only valid MAT
files are written, and give errors otherwise.
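
A sketch of what such an assertion could look like; assertNamed() is a
hypothetical stand-alone function for illustration and not the actual
code in writeMat().

```r
# Hypothetical stand-alone check (not the actual writeMat() code):
# give an error unless every object passed via '...' is named.
assertNamed <- function(...) {
  args <- list(...)
  nms <- names(args)
  if (length(args) > 0L && (is.null(nms) || any(nchar(nms) == 0L))) {
    stop("All objects written have to be named, e.g. use ",
         "writeMat(..., x=a, y=y) and not writeMat(..., x=a, y).")
  }
  invisible(args)
}

a <- 1:3
allNamed <- try(assertNamed(x = a, y = 2), silent = TRUE)   # passes
oneUnnamed <- try(assertNamed(x = a, 2), silent = TRUE)     # error
noneNamed <- try(assertNamed(a), silent = TRUE)             # error
```

No names are invented; the caller has to decide what the objects are
called.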
/Henrik
>
>
> After all, e.g. data.frame () will also rather create dummy names for
> unnamed columns. And, I think, a warning should make the user aware that
> he's doing something that _may_ not work out as intended. But here I think
> it is _most likely_ not working as intended.
>
>
>> MISCELLANEOUS:
>> Note that writeMat() cannot write compressed MAT files. It is
>> documented in help("readMat"), and will be so in help("writeMat") in
>> the next release. Package Rcompression, loaded or not, has no effect
>> on writeMat(). It is only readMat() that can read them, if
>> Rcompression is installed. You do not have to load it
>> explicitly/yourself - if readMat() detects a compressed MAT file, it
>> will automatically try to load Rcompression.
>
> OK, good to know.
>
> Thanks a lot for your explanation in spite of my bad report.
>
> Claudia
>
>
> --
> Claudia Beleites
> Dipartimento dei Materiali e delle Risorse Naturali
> Università degli Studi di Trieste
> Via Alfonso Valerio 6/a
> I-34127 Trieste
>
> phone: +39 0 40 5 58-37 68
> email: cbeleites at units.it
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>