[Rd] unsubscribe

Sun Feb 23 01:43:02 2003

From: r-devel-admin@stat.math.ethz.ch on 02/22/2003 12:00 PM CET

Sent by: r-devel-admin@stat.math.ethz.ch

Please respond to r-devel@stat.math.ethz.ch

To:   r-devel@stat.math.ethz.ch
cc:   (bcc: Asheka Rahman/arahma1/LSU)

Subject:    R-devel digest, Vol 1 #101 - 10 msgs

Send R-devel mailing list submissions to
r-devel@stat.math.ethz.ch

To subscribe or unsubscribe via the World Wide Web, visit
http://www.stat.math.ethz.ch/mailman/listinfo/r-devel
or, via email, send a message with subject or body 'help' to
r-devel-request@stat.math.ethz.ch

You can reach the person managing the list at
r-devel-admin@stat.math.ethz.ch

When replying, please edit your Subject line so it is more specific
than "Re: Contents of R-devel digest..."

Today's Topics:

1. ks.test gets stuck (PR#2571) (dirk@santafe.edu)
2. Re: [R] Who to decide what a generic function should look like? (Duncan
Murdoch)
3. require vs library (ripley@stats.ox.ac.uk)
4. Perl question (Duncan Murdoch)
5. Re: Perl question (Dirk Eddelbuettel)
6. Re: Perl question (David Brahm)
7. Re: Perl question (Roger Peng)
8. Re: Perl question (Michael Na Li)
9. Re: POSIX problem in New Zealand (PR#2570) (Arni Magnusson)
10. RE: Re: [R] Who to decide what a generic function should look like?
(Henrik Bengtsson)

--__--__--

Message: 1
Date: Fri, 21 Feb 2003 14:51:39 +0100 (MET)
From: dirk@santafe.edu
To: r-devel@stat.math.ethz.ch
CC: R-bugs@biostat.ku.dk
Subject: [Rd] ks.test gets stuck (PR#2571)

Full_Name: Michael Lachmann
Version: 1.6.1
OS: linux
Submission from: (NULL) (194.95.185.57)

ks.test enters an endless loop with repeated data. (The test is not
designed for such data, but it shouldn't get stuck...)
example:
ks.test(rep(1,3),rep(1,1))
never stops.

--__--__--

Message: 2
From: Duncan Murdoch <murdoch@stats.uwo.ca>
To: "Henrik Bengtsson" <hb@maths.lth.se>
Cc: <r-devel@stat.math.ethz.ch>
Date: Fri, 21 Feb 2003 11:15:14 -0500
Subject: [Rd] Re: [R] Who to decide what a generic function should look
like?

On Thu, 20 Feb 2003 13:05:44 +1100, you wrote in message
<000d01c2d884$98690fa0$7341a8c0@alpha.wehi.edu.au>:

>I am not sure if what I am asking below should be discussed under r-help
>or r-devel, so please feel free to move over to r-devel.

I've done that, I think it's a more r-devel kind of topic.

>For me a generic function should be fully generic in the sense that
>there are no requirements of arguments agreement (and therefore it
>should not be documented as a reply to Smyth's thread).

I don't agree.  A generic function has a meaning.  Often that meaning
is expressed in terms of certain arguments.  If a user of an unknown
object knows that it supports the generic, they have a right to expect
it to behave according to the standard meaning of the generic.
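
A minimal sketch of the point about a generic's meaning being carried by
its arguments, assuming S3 dispatch. The class name "intensities" and the
`centre` argument are hypothetical and only serve as illustration:

```r
# Generic: the argument list (x, ...) is the contract that methods share.
normalize <- function(x, ...) UseMethod("normalize")

# Default method: centre and scale a numeric vector.
normalize.default <- function(x, ...) (x - mean(x)) / sd(x)

# Method for a hypothetical class "intensities"; it keeps (x, ...) and
# adds its own optional argument after x.
normalize.intensities <- function(x, centre = median, ...) {
  v <- unclass(x)
  structure(v - centre(v), class = "intensities")
}

a <- normalize(c(1, 2, 3))
b <- normalize(structure(c(2, 4, 9), class = "intensities"))
```

Both calls go through the same generic, so a user who only knows that an
object supports normalize() can call it the same way in either case.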

>My concern is that enforcing methods to match the argument signature of
>the generic function will make packages incompatible with each other. I
>can not create a generic function called "normalize" for my microarray
>package and expect it to work together with other packages defining a
>generic function with the same name. Some short-term and long-term
>outcomes from this are:

That's only a short term problem.  As namespaces arrive, it will go
away.  Your normalize will not trample on anyone else's normalize,
because your names will live in a different namespace.  Hopefully the
default behaviour will be reasonable (i.e. if I say "normalize", and
only one version is around, I'll get it; if there are two, there'll be
either a clear way to choose, or a warning or error about the
ambiguity).
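
As a rough illustration of "a clear way to choose", the sketch below uses
the `pkg::name` syntax that later versions of R provide; whether the
namespace work discussed here ends up looking exactly like this is an
assumption. The local `sd` is a made-up stand-in for a second package's
function of the same name:

```r
# A local function that happens to share its name with the one in stats.
sd <- function(x) "my own sd"

mine   <- sd(c(1, 2, 3))          # the local definition is found first
theirs <- stats::sd(c(1, 2, 3))   # explicitly ask for the one in stats
```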

>  * who is the person to decide what a generic function should look
>like, and
>  * who owns the right to the method name "normalize"?

The author of the package makes the decisions and owns the names in
that package.

Duncan Murdoch

--__--__--

Message: 3
From: ripley@stats.ox.ac.uk
Date: Fri, 21 Feb 2003 18:06:41 +0000 (GMT)
To: R-devel@stat.math.ethz.ch
Subject: [Rd] require vs library

There seems to be a widespread assumption that the way for package foo to
require package bar is via `require(bar)'.  It isn't!

That returns a logical which is in the vast majority of cases unchecked.
So if the package is really required, the code will fail without a warning
if the package is unavailable.  You may as well call library() and let it
do the checking.  (In a few cases you can safely assume that the package
is present, e.g. nnet can require(MASS) since they are installed
together.)
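
A small sketch of the difference, assuming a package that is not
installed. The name "notInstalledPkg" is hypothetical:

```r
# Hypothetical package name, assumed NOT to be installed.
pkg <- "notInstalledPkg"

# require() returns FALSE when the package cannot be loaded, so the
# result has to be checked by hand or the failure goes unnoticed.
ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))
if (!ok) message("package '", pkg, "' is not available")

# library() signals an error instead; it is caught here only so that
# the sketch can run to completion.
res <- tryCatch(library(pkg, character.only = TRUE),
                error = function(e) conditionMessage(e))
```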

I find it confusing if require(bar, quietly=TRUE) is used with no message.
If you are going to change the search path, please let the end user know
you have done so.  I've had nasty surprises more than once from this by
getting datasets from packages I did not ask to be there.

Another point: please do not call code with side effects like require,
library or options at the top level in foo/R/foo, but do so within
.First.lib.  This becomes important if the code is dumped or put in a
database.
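
A sketch of what that could look like for a hypothetical package foo that
needs a hypothetical package bar (both names are placeholders). Later
versions of R with namespaces use the .onLoad and .onAttach hooks for
this purpose instead of .First.lib:

```r
# Sketch of a file in foo/R/ that defines the attach hook.
# .First.lib(lib, pkg) is called when the package is attached.
.First.lib <- function(lib, pkg) {
  # library() stops with an error if "bar" cannot be found.
  library("bar", character.only = TRUE)
  # Tell the user that the search path has changed.
  cat("Package", pkg, "has attached package bar\n")
}
```

Nothing runs at the top level here; the side effects happen only when the
hook is called.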

--
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

--__--__--

Message: 4
From: Duncan Murdoch <murdoch@stats.uwo.ca>
To: r-devel@stat.math.ethz.ch
Date: Fri, 21 Feb 2003 14:43:21 -0500
Subject: [Rd] Perl question

I'm working on the installer code, but I don't know Perl well enough.
What is a good Perl equivalent of the R statement

if ( s %in% c('string1', 'string2', 'string3') )  ...

I can do it with

if (s == "string1" || s == "string2" || s == "string3")  ...

but I've got a feeling there's a better way to do it using associative
arrays.

The real example will have about a dozen fixed strings that I want to
search a few thousand times.

Duncan Murdoch

--__--__--

Message: 5
From: "Dirk Eddelbuettel" <edd@debian.org>
To: Duncan Murdoch <murdoch@stats.uwo.ca>, r-devel@stat.math.ethz.ch
Subject: Re: [Rd] Perl question
Date: Fri, 21 Feb 2003 14:01:41 -0600

> I'm working on the installer code, but I don't know Perl well enough.
> What is a good Perl equivalent of the R statement
>
>  if ( s %in% c('string1', 'string2', 'string3') )  ...
>
> I can do it with
>
>  if (s == "string1" || s == "string2" || s == "string3")  ...

You probably wanted 'eq' instead of '=='.  The reg.exp. version is

if (s =~ m/(string1|string2|string3)/o) {  ...

where the trailing o makes it supposedly less expensive (regexp
only compiled once, see man perlop).

> but I've got a feeling there's a better way to do it using associative
> arrays.

You could use grep and map on assoc. arrays, but IMHO the above is easier.

> The real example will have about a dozen fixed strings that I want to
> search a few thousand times.

That may be worth profiling / timing.

Dirk

--
According to the latest figures, 43% of all signatures are totally
worthless.

--__--__--

Message: 6
From: David Brahm  <brahm@alum.mit.edu>
Date: Fri, 21 Feb 2003 15:10:54 -0500
To: murdoch@stats.uwo.ca
Subject: Re: [Rd] Perl question
Cc: r-devel@stat.math.ethz.ch
Reply-To: brahm@alum.mit.edu

Duncan Murdoch wrote:
> What is a good Perl equivalent of the R statement
>  if ( s %in% c('string1', 'string2', 'string3') )  ...

Dirk Eddelbuettel <edd@debian.org> replied:
if (s =~ m/(string1|string2|string3)/o) {  ...
(I think Duncan and Dirk both meant "$s" instead of "s".)

I'd also suggest:
@mylist = ("string1","string2","string3");
if (grep /^\Q$s\E$/, @mylist) {print "yes\n"}

(after all, There's More Than One Way To Do It.)
--
-- David Brahm (brahm@alum.mit.edu)

--__--__--

Message: 7
Date: Fri, 21 Feb 2003 12:15:44 -0800 (PST)
From: Roger Peng <rpeng@stat.ucla.edu>
To: Duncan Murdoch <murdoch@stats.uwo.ca>
cc: r-devel@stat.math.ethz.ch
Subject: Re: [Rd] Perl question

I sometimes hack something with a hash:

%keywords = ( "string1" => 1, "string2" => 1, "string3" => 1 );

and then do something like:

exists($keywords{$mystring})

which should be true if $mystring is one of "string1", "string2", or
"string3".

-roger
_______________________________
UCLA Department of Statistics
rpeng@stat.ucla.edu
http://www.stat.ucla.edu/~rpeng

On Fri, 21 Feb 2003, Duncan Murdoch wrote:

> I'm working on the installer code, but I don't know Perl well enough.
> What is a good Perl equivalent of the R statement
>
>  if ( s %in% c('string1', 'string2', 'string3') )  ...
>
> I can do it with
>
>  if (s == "string1" || s == "string2" || s == "string3")  ...
>
> but I've got a feeling there's a better way to do it using associative
> arrays.
>
> The real example will have about a dozen fixed strings that I want to
> search a few thousand times.
>
> Duncan Murdoch

--__--__--

Message: 8
To: r-devel@stat.math.ethz.ch
Subject: Re: [Rd] Perl question
From: Michael Na Li <lina@u.washington.edu>
Date: Fri, 21 Feb 2003 12:20:48 -0800

On Fri, 21 Feb 2003, Duncan Murdoch verbalised:

>  I'm working on the installer code, but I don't know Perl well enough.
>  What is a good Perl equivalent of the R statement
>
>   if ( s %in% c('string1', 'string2', 'string3') )  ...
>
>  I can do it with
>
>   if (s == "string1" || s == "string2" || s == "string3")  ...
>
>  but I've got a feeling there's a better way to do it using associative
>  arrays.
>

Using associative arrays,

my @strings = ( 'string1', 'string2', 'string3' );
my %stringhash = ();
# Hash slice: each string becomes a key (the value is just its index).
@stringhash{@strings} = ( 0..$#strings );
if (exists $stringhash{$s}) { ... }

I wonder if it might be more efficient to use a regular expression instead,

if ($s =~ /^string[1-3]$/) { ...

it will depend on how 'regular' the strings are, of course.

Michael

--__--__--

Message: 9
Date: Fri, 21 Feb 2003 21:06:33 -0800 (PST)
From: Arni Magnusson <arnima@u.washington.edu>
To: ripley@stats.ox.ac.uk
cc: r-devel@stat.math.ethz.ch, <R-bugs@biostat.ku.dk>
Subject: Re: [Rd] POSIX problem in New Zealand (PR#2570)

On Fri, 21 Feb 2003 ripley@stats.ox.ac.uk wrote:

> What exactly is the problem?  Those appear to be the same time, in
> different time zones.  Is the problem that you are in the N Hemisphere
> (your email address is) trying to use S Hemisphere times on an OS that
> does not support pre-1970 times?

Currently I'm in New Zealand working on a Windows XP machine that has
never left the country. I have not encountered problems with as.POSIXct
when working in other time zones.

> You seem to be confusing the time and how it is printed.  What do you
> want to do with these times?

I'm dealing with two kinds of problems: (1) what I see as a bug in
as.POSIXct, and (2) my own confusion with time zones. One problem at a
time:

(1) as.POSIXct

I would like the following to return similar times, except for the year,
but they don't:
> x <- as.POSIXct("1969-12-24")
> y <- as.POSIXct("1970-12-24")
> x
[1] "1969-12-23 23:00:00 New Zealand Standard Time"
> y
[1] "1970-12-24 New Zealand Standard Time"
> z <- seq(x, by="year", length=2)[2]
> difftime(y, z)
Time difference of 1 hours

From reading 'library/base/html/DateTimeClasses.html' I understand that R
queries Windows XP for the period 1970-2037 and uses its own C code
outside this range. To me it looks like the bug might be in the C code. I
think the problem goes deeper than print.POSIXct, because coercions are
independent of the print method as far as I understand:

> x <- as.POSIXlt("1969-12-24")
> y <- as.POSIXlt(as.POSIXct(x))
> x
[1] "1969-12-24"
> y
[1] "1969-12-23 23:00:00 New Zealand Standard Time"
> x==y
[1] FALSE

(2) Time zone confusion

This is a question of taste and implementation, but I would like POSIXct
objects in data frames and plots to display the times I entered
originally, regardless of where in the world I'm working at the moment.

(3) My solution

I've defined the environment variable tz=GMT in my R shortcut and I'll try
to remember to do so on other machines I work on. This way I avoid both
the as.POSIXct/New Zealand bug and the time zone confusion. I still think
the Kiwis would appreciate the bug being fixed :)
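
A sketch of the same comparison with the time zone given explicitly
through the `tz` argument rather than through the environment. The
choice of GMT mirrors the workaround above:

```r
# Fix the time zone in the call so the result does not depend on the
# machine's local settings.
x <- as.POSIXct("1969-12-24", tz = "GMT")
y <- as.POSIXct("1970-12-24", tz = "GMT")

# One year after x should be the same instant as y.
z <- seq(x, by = "year", length.out = 2)[2]
d <- difftime(y, z, units = "hours")

# Display the date as entered, regardless of where the session runs.
shown <- format(x, "%Y-%m-%d", tz = "GMT")
```

With the zone pinned to GMT the two dates are exactly one year apart, so
the one-hour discrepancy does not appear in this setup.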

> On Fri, 21 Feb 2003 arnima@u.washington.edu wrote:
>
> > Full_Name: Arni Magnusson
> > Version: 1.6.2
> > OS: Windows XP
> > Submission from: (NULL) (210.48.49.68)

--__--__--

Message: 10
From: "Henrik Bengtsson" <hb@maths.lth.se>
To: "'Duncan Murdoch'" <murdoch@stats.uwo.ca>
Cc: <r-devel@stat.math.ethz.ch>
Subject: RE: [Rd] Re: [R] Who to decide what a generic function should look
like?
Date: Sat, 22 Feb 2003 20:03:41 +1100

> -----Original Message-----
> From: r-devel-admin@stat.math.ethz.ch
> [mailto:r-devel-admin@stat.math.ethz.ch] On Behalf Of Duncan Murdoch
> Sent: 22 February 2003 03:15
> To: Henrik Bengtsson
> Cc: r-devel@stat.math.ethz.ch
> Subject: [Rd] Re: [R] Who to decide what a generic function
> should look like?
>
>
> On Thu, 20 Feb 2003 13:05:44 +1100, you wrote in message
> <000d01c2d884$98690fa0$7341a8c0@alpha.wehi.edu.au>:
>
> >I am not sure if what I am asking below should be discussed under
> >r-help or r-devel, so please feel free to move over to r-devel.
>
> I've done that, I think it's a more r-devel kind of topic.
>
> >For me a generic function should be fully generic in the sense that
> >there are no requirements of arguments agreement (and therefore it
> >should not be documented as a reply to Smyth's thread).
>
> I don't agree.  A generic function has a meaning.  Often that
> meaning is expressed in terms of certain arguments.  If a
> user of an unknown object knows that it supports the generic,
> they have a right to expect it to behave according to the
> standard meaning of the generic.

I understand this viewpoint too, but I tend to think about it as
follows. Consider a hierarchy of all possible classes. Excluding
multiple inheritance, they could in principle be placed in a tree
structure with a root class, which all classes directly or indirectly
inherit from (this is the idea in, for instance, Java). In R, such a root
class could have the methods print(), as.character() and a few other
methods that you expect all R objects to have. From there on you add new
classes. With this class hierarchy I think about generic functions (as
they work today) as methods that are placed in the root object. However,
why can they not be further down the tree, where they can be documented
specifically for that subtree and provide polymorphism from there on?
I think this should be just as intuitive (or even more so) for the end user.
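
A rough sketch of the tree picture using today's S3 dispatch. The generic
describe() and the classes "slide" and "spottedSlide" are hypothetical
names chosen only for illustration:

```r
# The generic plus its default method play the role of the "root".
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "some R object"

# A method placed further down the tree, at the class "slide".
describe.slide <- function(x, ...) "a microarray slide"

# A plain vector falls through to the root method.
plain <- describe(1:3)

# "spottedSlide" has no method of its own, so it inherits the one
# defined for its parent class "slide".
spotted <- describe(structure(list(), class = c("spottedSlide", "slide")))
```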

So not thinking about S3 or S4, but in a longer term, is there a reason
for not going the whole way and having a method dispatching mechanism
that is totally general like in other object-oriented languages? When
asking this, I might reveal that I am not fully comfortable with the
differences and pros & cons between scientific programming (with
vectors) and standard object-oriented programming.

> >My concern is that enforcing methods to match the argument signature
> >of the generic function will make packages incompatible with each
> >other. I can not create a generic function called "normalize" for my
> >microarray package and expect it to work together with other packages
> >defining a generic function with the same name. Some short-term and
> >long-term outcomes from this are:
>
> That's only a short term problem.  As namespaces arrive, it
> will go away.  Your normalize will not trample on anyone
> else's normalize, because your names will live in a different
> namespace.  Hopefully the default behaviour will be
> reasonable (i.e. if I say "normalize", and only one version
> is around, I'll get it; if there are two, there'll be either
> a clear way to choose, or a warning or error about the ambiguity).

This is promising and I am really glad to hear that the problem of
conflicts might be solved. The problem remains though if one decides to
have two different types of classes in one package both with a
normalize() method, meaning that you might have to split up your package
into a bundle of packages. However, this is a much smaller problem since
it can be controlled by the developer.

> >  * who is the person to decide what a generic function should look
> >like, and
> >  * who owns the right to the method name "normalize"?
>
> The author of the package makes the decisions and owns the
> names in that package.
>
>
> Duncan Murdoch

Duncan, thanks a lot for your reply!

Henrik

--__--__--

End of R-devel Digest