[R] Alternative to Scale Function?
Gavin Simpson
gavin.simpson at ucl.ac.uk
Fri Sep 11 22:57:02 CEST 2009
On Fri, 2009-09-11 at 13:10 -0700, Noah Silverman wrote:
> Hi,
>
> Is there an alternative to the scale function where I can specify my own
> mean and standard deviation?
A couple of calls to sweep?
See ?sweep
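set.seed(123)
## example data: 10 observations on 10 variables
dat <- data.frame(matrix(runif(10*10), ncol = 10))
## column means and standard deviations to scale by
xbar <- colMeans(dat)
sigma <- apply(dat, 2, sd)
## centre each column on xbar, then divide by sigma
dat.std <- sweep(sweep(dat, 2, xbar, "-"), 2, sigma, "/")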
## compare
scale(dat)
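
For the training/testing case described in the question, the same idea
could be sketched roughly as below. This is only an illustration: train
and test are made-up matrices standing in for the real data, and the
names mu, s, test.std and test.std2 are hypothetical. It estimates the
mean and SD on the training set only and applies them to the testing
set, once with sweep and once by passing the vectors to the center and
scale arguments of scale.

```r
set.seed(42)
## made-up data standing in for the real training and testing sets
train <- matrix(rnorm(100 * 5), ncol = 5)
test  <- matrix(rnorm(20 * 5),  ncol = 5)

## parameters estimated from the training set only
mu <- colMeans(train)
s  <- apply(train, 2, sd)

## scale the testing set with the training parameters, via sweep ...
test.std <- sweep(sweep(test, 2, mu, "-"), 2, s, "/")

## ... or via scale(), which accepts numeric vectors for center and scale
test.std2 <- scale(test, center = mu, scale = s)

## check that the two agree, ignoring the attributes scale() attaches
all.equal(test.std, test.std2, check.attributes = FALSE)
```

Because the parameters come from the training set, the scaled testing
columns will generally not have mean 0 and SD 1 exactly; that is
expected.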
HTH
>
> I've come across an interesting issue where this would help.
>
> I'm training and testing on completely different sets of data. The
> testing set is smaller than the training set.
>
> Using the standard scale function of R seems to introduce some error.
> Since it scales data WITHIN the set, it may scale the same number to
> a different value, since the range in the training and testing sets
> may be different.
>
> My thought was to scale the larger training set of data, then use the
> mean and SD of the training data to scale the testing data according to
> the same parameters. That way a number will transform to the same
> result regardless of whether it is in the training or testing set.
>
> I can't be the first one to have looked at this. Does anyone know of a
> function in R or if there is a scale alternative where I can control the
> parameters?
>
> Thanks!
>
> --
> Noah
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography, [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%