[Rd] scale(x, center=FALSE) (PR#14219)

Ben Bolker bolker at ufl.edu
Sat Mar 13 02:25:40 CET 2010


  Thanks Simon!

  How irritating/wrong would it be if I opened a new bug to submit my
suggested documentation patch?  As detailed below, I think the
documentation is somewhat confusing (it depends on a highly non-standard
definition of "standard deviation" ...)
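
  A minimal sketch of the discrepancy, for illustration only: the data
are made up, and the comparison assumes scale() behaves as described in
the patch below (dividing by sqrt(sum(x^2)/(n-1)) whether or not the
column was centered).

```r
# Illustrative data only; any numeric column with a non-zero mean will do.
x <- matrix(c(1, 2, 3, 4, 10), ncol = 1)
n <- sum(!is.na(x))

# Divisor used by scale(x, center = FALSE): the "root mean square"
# with an (n - 1) denominator.
rms <- sqrt(sum(x^2, na.rm = TRUE) / (n - 1))
s_uncentered <- scale(x, center = FALSE)

# Divisor used after centering: the ordinary standard deviation.
s_centered <- scale(x)
sdev <- sd(x[, 1], na.rm = TRUE)

# Scaling by the standard deviation without centering, as the patch suggests.
s_sd_only <- scale(x, center = FALSE, scale = apply(x, 2, sd, na.rm = TRUE))

print(c(rms = rms, sd = sdev))
print(attr(s_uncentered, "scaled:scale"))
print(attr(s_centered, "scaled:scale"))
```

  The two printed divisors should differ for this column, because its
mean is not zero; they coincide only when the column is centered (or
already has zero mean).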

  cheers
    Ben Bolker


Simon Urbanek wrote:
> On Mar 12, 2010, at 1:29 PM, Ben Bolker wrote:
> 
>>  I'm resending this after a week ... I really don't want to nag, but
>> I also would not like to see this sink below the waves.
>>
> 
> It has been closed as feature/FAQ with the note:
> "As documented on the help page!"
> 
> 
>>  Is there a preferred protocol for requesting comments without nagging
>> too much?  I would have added a comment to 14219 (and was curious to see
>> whether it had been rejected), but when I went to Bugzilla, bug 14219 no
>> longer seemed to exist, either as open or as closed.  Was it lost, or
>> thrown away, when the bug system migrated?
>>
> 
> Hmm ... there was apparently an error when importing the feature&FAQ
> box. Unfortunately, Jitterbug left some duplicate bugs in different
> categories, so the import was not as easy as it should have been. I'll
> double-check the IDs to see if any others are missing. I have now run
> the import for 14219 manually.
> 
> Thanks,
> Simon
> 
> 
>> [re: behavior of scale() when center=FALSE and scale=TRUE]
>>
>>>  Again, I agree with you that the behavior is not optimal, but it is
>>> very hard to make changes in R when the behavior is sub-optimal rather
>>> than actually wrong (by some definition).  R-core is very conservative
>>> about changes that break backward compatibility; I would like it if they
>>> chose to change the function to use standard deviation rather than
>>> root-mean-square, but I doubt it will happen (and it would break things
>>> for any users who are relying on the current definition).
>> [snip]
>>
>>> I have attached a patch
>>> file (and append the information below as well) that changes "standard
>>> deviation" back to "root mean square" and is much more explicit about
>>> this issue ... I hope R-core will jump in, critique it, and possibly use
>>> it in some form to improve (?) the documentation ...
>>>
>>>  [PS: I have written that the scaling is equivalent to sd() "if and
>>> only if" centering was done.  Technically it would also be equivalent if
>>> the column already had zero mean ...]
>>>
>> ===================================================================
>> --- scale.Rd	(revision 51180)
>> +++ scale.Rd	(working copy)
>> @@ -41,13 +41,18 @@
>>   equal to the number of columns of \code{x}, then each column of
>>   \code{x} is divided by the corresponding value from \code{scale}.  If
>>   \code{scale} is \code{TRUE} then scaling is done by dividing the
>> -  (centered) columns of \code{x} by their standard deviations, and if
>> +  (centered) columns of \code{x} by their root-mean-squares, and if
>>   \code{scale} is \code{FALSE}, no scaling is done.
>> -
>> -  The standard deviation for a column is obtained by computing the
>> -  square-root of the sum-of-squares of the non-missing values in the
>> -  column divided by the number of non-missing values minus one (whether
>> -  or not centering was done).
>> +
>> +  The root-mean-square for a (possibly centered)
>> +  column is defined as
>> +  \eqn{\sqrt{\sum(x^2)/(n-1)}}{sqrt(sum(x^2)/(n-1))},
>> +  where \eqn{x} is a vector of the non-missing values
>> +  and \eqn{n} is the number of non-missing values.
>> +  If (and only if) centering was done,
>> +  this is equivalent to \code{sd(x,na.rm=TRUE)}.
>> +  (To scale by the standard deviations without centering,
>> +  use \code{scale(x,center=FALSE,scale=apply(x,2,sd,na.rm=TRUE))}.)
>> }
>> \references{
>>   Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988)
>>
>> (Bump re: suggested update to scale.Rd .  Is this under
>> consideration? I'll stop pestering if it's considered
>> unacceptable, just don't want it to vanish without a trace ...)
>>
>>
>> -- 
>> Ben Bolker
>> Associate professor, Biology Dep't, Univ. of Florida
>> bolker at ufl.edu / people.biology.ufl.edu/bolker
>> GPG key: people.biology.ufl.edu/bolker/benbolker-publickey.asc
>>
> 


-- 
Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bolker at ufl.edu / people.biology.ufl.edu/bolker
GPG key: people.biology.ufl.edu/bolker/benbolker-publickey.asc
