[Rd] Function some()

Gabor Grothendieck ggrothendieck at myway.com
Fri Sep 17 21:04:56 CEST 2004


John Fox <jfox <at> mcmaster.ca> writes:

: 
: > -----Original Message-----
: > From: r-devel-bounces <at> stat.math.ethz.ch 
: > [mailto:r-devel-bounces <at> stat.math.ethz.ch] On Behalf Of Gabor 
: > Grothendieck
: > Sent: Friday, September 17, 2004 12:52 PM
: > To: r-devel <at> stat.math.ethz.ch
: > Subject: Re: [Rd] Function some()
: > 
: > John Fox <jfox <at> mcmaster.ca> writes:
: . . .
: > 
: > 
: > It's cute but you could do it on vectors and data frames with
: > 2 function calls.  First get some test data:
: > 
: >    data(iris) 
: >    data(state)
: > 
: > # Now we have:
: > 
: >    head(sample(iris))                  # data frame
: >    head(sample(data.frame(state.x77))) # matrix 
: >    head(sample(letters))               # vector
: > 
: 
: A possible disadvantage of this approach is that it permutes the entire,
: potentially large, object before picking the presumably small sample.
: 
: > The only nuisance is that sample samples from the elements of 
: > matrices rather than from their rows thereby necessitating 
: > the conversion in the middle call to head(sample(...)).  
: > 
: > Perhaps an alternate suggestion would be to modify sample so 
: > it becomes an S3 generic with methods for matrices and data 
: > frames such that sample.matrix samples from the rows of a 
: > matrix and sample.data.frame samples from the rows of a 
: > data.frame.  Then (1) the above idiom becomes consistent 
: > across the above mentioned classes.  (2) This would also 
: > avoid burdening the base with an extra function and would (3) 
: > provide for the possibility of extending sample to other classes.
: > 
: 
: This occurred to me, too [as did providing a random argument to head()], but
: seemed a more radical proposal than introducing a simple new generic. 

I think I was wrong regarding my examples.  sample(iris) samples from
the columns of iris, not the rows, thus my examples do not work except
for the vector case.  One would have to do this:

   iris[sample(150,10),]
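
As a rough sketch, the same idiom can be written without hard-coding the
row count. The variable names n and some_rows are only illustrative:

```r
# sample() on a data frame permutes its columns, because a data frame
# is a list of columns; the number of rows is unchanged.
shuffled_cols <- sample(iris)

# To draw rows instead, sample row indices and subset with them.
n <- 10
some_rows <- iris[sample.int(nrow(iris), n), ]

# The same indexing works on a matrix; drop = FALSE keeps the result
# a matrix even when only one row is requested.
some_states <- state.x77[sample.int(nrow(state.x77), n), , drop = FALSE]
```

Only n indices are drawn here, so the whole object is not permuted first.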

some() is nice since it reduces the mental load of iris[sample(150,10),],
yet I wonder if it's worth the feature creep. If we are to make a change,
would it not be better to fix sample, as suggested? In that case

   head(sample(iris))

does work as would:

   sample(iris, 10)

If backward compatibility were an issue we could define a new name
for the new generic so that sample remains unchanged.  Optionally,
sample could be made defunct, over time.
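
A minimal sketch of what such a separately named generic might look like
follows. The name rsample and its methods are hypothetical and are not
part of R; this is only one way the idea could be laid out:

```r
# Hypothetical S3 generic; 'rsample' is an illustrative name only.
rsample <- function(x, size, ...) UseMethod("rsample")

# Default method: sample the elements of a vector. Indexing via
# sample.int() avoids sample()'s special case for a single number.
rsample.default <- function(x, size = length(x), ...) {
  x[sample.int(length(x), size, ...)]
}

# Data frame method: sample rows rather than columns.
rsample.data.frame <- function(x, size = nrow(x), ...) {
  x[sample.int(nrow(x), size, ...), , drop = FALSE]
}

# Matrix method: the same row indexing applies, so reuse it.
rsample.matrix <- rsample.data.frame

# Usage, mirroring the idiom discussed above:
head(rsample(iris))        # rows of a data frame
head(rsample(state.x77))   # rows of a matrix
rsample(letters, 5)        # elements of a vector
```

With this in place, sample itself is untouched, and further classes could
be supported by adding methods.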


