[R] Sub_scribe and a question

LIQING ZHANG lzhang at uci.edu
Fri Feb 16 05:49:07 CET 2001


Dear all,

I am trying to get an estimate of the intercept for a linear model.  In
this case, I know the slope of the model.  Can anyone tell me how to
constrain the formula in lm() so that it estimates only the intercept,
not the slope?  Many thanks in advance,
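One possible approach (a sketch only; the data and the known slope `b1` below are invented placeholders) is to move the known-slope term into an offset(), so that lm() estimates just the intercept:

```r
# Sketch: fit only the intercept of y = b0 + b1*x when the slope b1 is known.
# 'x', 'y' and 'b1' are made-up example data, not from the original question.
set.seed(1)
b1  <- 2                               # the known slope
x   <- 1:20
y   <- 5 + b1 * x + rnorm(20)          # true intercept is 5
fit <- lm(y ~ 1 + offset(b1 * x))      # offset() excludes b1*x from estimation
b0  <- coef(fit)[["(Intercept)"]]
```

Equivalently, one can regress the adjusted response on an intercept alone: `lm(I(y - b1 * x) ~ 1)` gives the same estimate.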

Sincerely,

Liqing Zhang
Dept. of Eco. Evol. Biol.
Univ. of CA, Irvine
email: lzhang at uci.edu

From: owner-r-help at stat.math.ethz.ch
To: owner-r-help at stat.math.ethz.ch
Subject: BOUNCE r-help at stat.math.ethz.ch:     global taboo body match "/^Content-Type: *application\/octet-stream/i" at line 241  
Date: Sat, 28 Apr 2001 12:31:48 +0200 (MET DST)

From: "Philippe Grosjean" <phgrosje at ulb.ac.be>
To: "Prof Brian D Ripley" <ripley at stats.ox.ac.uk>
Cc: <r-help at stat.math.ethz.ch>
Subject: RE: [R] Benchmarking R, why sort() is so slow?
Date: Sat, 28 Apr 2001 12:32:12 +0200


Well, I understand and share your point of view about benchmarks, including
the one I am working on. Much more could be said as well: raw speed is not
everything, and perhaps does not tell much about usability and performance
in real use. Moreover, it is perhaps a little silly to compare packages like
R or Splus, intended mainly for statistics, with, for instance, Mathematica,
designed principally for symbolic mathematics. This is something I want to
think about before releasing this benchmark publicly.

A point I must clarify: among the 11 tested packages, my favorite is R,...
by far. But I try to stay objective. I am just a bit puzzled by this
outlier in the results, the sort() test in R. It is not at all
representative of the performance of R, which is very good, as attested
by the 14 other tests (see column F in the table below).

Now, one can also argue that all these tests are just random probes that
look for global performance, but also for homogeneity in the performance
of the various algorithms used. The question is not how useful it is to
sort 1e6 random numbers in a particular context. The point is that, from
this perspective, sort() is one of the basic functions in this kind of
package (though, I agree, there are many, many other, and perhaps more
important, functions), and as such it can be considered an interesting
benchmark test... among others. Now, if you look at the results, you will
notice immediately that some packages are more homogeneous than others (as
a reminder, the tests have been scaled so as to run in about 1 sec +/- 10%
in the reference software - one can criticize this choice, but... - that
is, Matlab 6.0 R12).

The point is also not to say that 21 sec is too long, or long enough, to
sort 1e6 random numbers. The point is that the various packages are
_compared_ on the same level (i.e., the same task on the same computer).
With 11 packages, which represent perhaps 60-80% of all those available for
high-level matrix-language calculation, it is possible to get a fairly good
idea of overall performance. Again, R is an outlier for the sort test (see
row 3 of the table). Now, I decided to use trimmed means to get rid of such
outliers, so the overall R results are not penalized by this test. However,
I am not the kind of guy to throw away outliers without asking whether they
have something to tell us. That is what I am doing here.

I hope nobody feels attacked by my approach, because that is not my goal at
all. My current point of view is that the sort() test in R can really be
considered an outlier (I mean, choosing other tests would certainly reveal
this kind of underperformance in other functions of other packages). I
think, however, that it is useful to signal it to the R core team, and that
it would perhaps also be useful to reconsider the sorting algorithm used in
R in a future version, in light of these results. I am not competent enough
for that, or I would gladly have contributed to working on this problem
myself. OK, there are certainly many priorities on the "to do" list, but...
what about using the same algorithm as Splus? Or reusing Scilab code, which
is GPL too? If there are good reasons to keep the current sort() algorithm
in R (better in other situations - I don't know, I haven't tested that
extensively), at least a second alternative could be offered, which would
be welcome in situations like this (a tenfold improvement is not to be
neglected, is it?).

Software released under a GPL license, like R, must be encouraged, and I
don't want this benchmark to have a bad influence in any way. One must
admit that Mathworks did an amazing job, both on performance (compare
column I, Matlab 5.3 R11, with column C, Matlab 6.0 R12) and on the user
interface in the recently released Matlab 6.0. Matlab is partly based on
public domain libraries... and Matlab is much too expensive, from my point
of view. The same could be said of Splus. By contrast, R would probably
take first place in a quality/price contest, in my opinion. Freedom (in
this case, to use, share, and contribute to a piece of software) is a major
quality... unfortunately not evaluated in such a raw benchmark. I hope I
can make a small positive contribution to R by signaling this weakness in
sort() performance.

Now, I would gladly receive any other (good or bad, but constructive,
please) criticism of this benchmark. I attach to this mail the script I
used to test R. If someone would like the scripts for the other packages,
I can provide them too. They will be available soon anyway, when my work
is done.


Temporary version of a benchmark:
---------------------------------
Computer = Intel Celeron 500 MHz with 256 MB RAM under Windows 2000 SP1 (all
tests run 3 times)

Tested software: A = O-Matrix 5.1 (native mode), B = O-Matrix 5.1 (Matlab
mode), C = Matlab 6.0 R12, D = Ox 2.20, E = Mathematica 4.01, F = R 1.2.2, G
= Octave 2.1.31, H = Splus 2000 pro R3, I = Matlab 5.3 R11, J = Rlab 2.1, K
= Scilab 2.6, L = Axum 6.0 R2 (Splus mode)

Tests: 3 series (I. matrix calculation / II. preprogrammed matrix functions
/ III. programming) of 5 tests each. A trimmed mean is computed for each
series (best and worst scores eliminated). The tests are scaled to run in
approx. 1 sec on the reference computer with the reference software,
Matlab 6.0 R12 (column C).
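For reference, the trimming described above (drop the best and the worst of the 5 runs) is just a 20% trimmed mean. A minimal sketch, using software A's Series I scores from the table below:

```r
# Sketch: trimmed mean over 5 scores, dropping the best and the worst.
# These are software A's Series I times from the table (tests 1-5).
scores <- c(1.03, 2.60, 1.09, 0.85, 0.72)
tm <- mean(sort(scores)[2:4])     # keep the middle three values
tm                                # 0.99, matching the "tr. mean" row
# mean() can do the same via its 'trim' argument:
tm2 <- mean(scores, trim = 0.2)
```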

Series I: Matrix calculation
1 = Creation, transposition, deformation of a 1200x1200 matrix, 2 =
1250x1250 normally distributed random matrix ^1000, 3 = Sorting of 1,100,000
random values, 4 = 550x550 cross-product matrix (b = a'  * a), 5 = Linear
regression over a 700x700 matrix (b = a \ b')

Series II: Preprogrammed matrix functions
6 = FFT over 900,000 random values, 7 = Eigenvalues of a 220x220 random
matrix, 8 = Determinant of a 750x750 random matrix, 9 = Cholesky
decomposition of a 1000x1000 matrix, 10 = Inverse of a 500x500 random matrix

Series III: Programming
11 = 225,000 Fibonacci numbers calculation (vector calc.), 12 = Creation of
a 1500x1500 Hilbert matrix (matrix calc.), 13 = Greatest common divisors of
35,000 pairs (recursion), 14 = Creation of a 220x220 Toeplitz matrix
(loops), 15 = Escoufier's method on a 22x22 matrix (mixed)

Times are in sec.



Software   A     B     C     D     E     F(R)  G     H     I     J     K      L

Series I.
---------
1         1.03  1.35  0.95  1.23  2.19  2.79  4.24  2.91  1.00  1.92  2.95   2.67
2         2.60  2.48  0.99  2.91  0.76  2.88  2.76  1.41  2.61  2.38  2.32   1.42
3         1.09  1.10  1.04  3.37  3.13 21.71  6.33  1.84  7.91  4.00  2.72   1.65
4         0.85  0.86  0.98  1.84  4.04  3.88  7.71  1.74  4.57  8.92 14.84   1.61
5         0.72  0.96  1.07  5.47  2.89  5.75  4.32 10.84  6.91  6.31  7.67   9.70
---------------------------------------------------------------------------------
tr. mean  0.99  1.14  1.01  2.70  2.74  4.17  4.96  2.16  4.70  4.23  4.45   1.98

Series II.
----------
6         2.21  2.16  0.99  3.83  1.94  2.67  2.70  4.81  4.04  2.23  4.06   4.17
7         0.42  0.42  0.90  1.11  0.95  1.49  2.46  1.68  1.41  3.09  2.94   2.19
8         0.88  0.92  1.06  3.31  5.90  5.98  5.19  2.04  7.96  7.50  9.14 117.0
9         0.96  0.96  1.29  3.74  4.30  3.65  6.54 11.38  4.59  6.64  7.81   8.98
10        0.75  0.76  1.10  2.94  5.24  5.83  3.92  2.48  6.18  7.65  7.61  11.04
---------------------------------------------------------------------------------
tr. mean  0.75  0.88  1.05  3.33  3.83  4.05  3.94  3.48  4.93  5.74  6.49   8.06

Series III.
-----------
11        0.52  0.39  1.04  0.52  0.41  0.55  1.02  0.96  0.97  0.49  0.60   0.72
12        1.79  2.09  0.98  1.03  3.14  1.30  0.79  2.02  2.03  1.21  1.92   1.86
13        0.38  0.43  1.03  0.76   ?    1.33  0.87  0.71  0.92  6.03  1.23   0.93
14        0.20  0.26  0.92  0.13  1.50  2.56  6.89 14.49  0.78  0.63  4.52  13.46
15        0.22  0.24  1.00  0.13   ?    0.48  1.66 12.12  1.06  0.73  1.32  10.88
---------------------------------------------------------------------------------
tr. mean  0.79  0.36  1.00  0.47  1.68  1.06  1.19  5.03  0.99  0.86  1.49   4.56
---------------------------------------------------------------------------------
total    14.62 15.39 15.34 32.33 36.38 62.84 57.40 71.43 52.95 59.73 71.65 188.3
---------------------------------------------------------------------------------
ovr. mean 0.84  0.79  1.02  2.17  2.75  3.09  3.36  3.44  3.54  3.61  4.14   4.87

Sorry for such a long message.

Philippe


-----Original Message-----
From: ripley at auk.stats [mailto:ripley at auk.stats] On behalf of Prof Brian
D Ripley
Sent: Saturday 28 April 2001 10:27
To: Philippe Grosjean
Cc: Peter Dalgaard BSA; r-help at stat.math.ethz.ch
Subject: RE: [R] Benchmarking R, why sort() is so slow?


On Fri, 27 Apr 2001, Philippe Grosjean wrote:

> I suspect it should be a question of algorithm choice. Maybe sort() is
> important enough, including for large datasets, to place some improvement
> in the "to do" list for a further version...?

I find it hard to believe that doing anything useful with a vector of
length 1.1 million will not take very much longer than completely sorting
it, which is a rare problem in statistics (not needed for quantiles, for
example). That's the problem with these benchmarks: they don't test
realistic problems, because many of the packages concerned would need days
of coding to run one.
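As an illustration of the quantile point, R's sort() already exposes this through its `partial` argument, which guarantees only that the requested order statistics land in place (a sketch with invented data, not part of the original mail):

```r
# Sketch: a quantile needs only its order statistic, not a complete sort.
set.seed(42)
x  <- rnorm(1e5)
k  <- length(x) %/% 2             # position of (roughly) the median
qk <- sort(x, partial = k)[k]     # only position k is guaranteed correct
```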

FWIW, there are O(n log n) worst-case sort algorithms that are fairly
competitive on average too.  But testing random vectors is not
representative of real problems, and for many such vectors (and even for
rnorm(1e6)) one can do even better by, e.g., radix sorting.  So if you know
something about the input, you can make your package look better. Again,
a problem with benchmarking.
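Much later versions of R (3.3.0 and up) expose exactly this choice through sort()'s `method` argument; a minimal sketch, assuming such a version:

```r
# Sketch: default sort vs. radix sort on random doubles
# (requires R >= 3.3.0, where method = "radix" became available for numerics).
x  <- rnorm(1e5)
s1 <- sort(x)                      # default method
s2 <- sort(x, method = "radix")    # radix sort, often faster on known input
```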

>> Peter Dalgaard

> The internal algorithm is a shellsort, which is supposedly of
> complexity O(n^1.25) and has decent worst-case behaviour. Other
> algorithms like quicksort have typical performance of O(n log n) but
> extreme cases of O(n^2).
>
> For large vectors the O(n^.25/log(n)) relative complexity is going to
> make a difference, although the observed differences seem larger.

There are several shellsort variants.  I would need to look up the
known results for precisely this one.
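To put the two bounds quoted above side by side at the benchmark's problem size (a back-of-the-envelope sketch, not from the original thread):

```r
# Sketch: predicted ratio of shellsort O(n^1.25) to O(n log n) at n = 1.1e6.
n     <- 1.1e6
ratio <- n^1.25 / (n * log(n))   # simplifies to n^0.25 / log(n)
round(ratio, 2)                  # about 2.3, smaller than the ~10x observed
```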

> There are very likely better choices of sort algorithm...


--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Attachment: R.R (the benchmark script)

# R Benchmark (3 April 2001)
# Author : Philippe Grosjean
# eMail  : phgrosjean at sciviews.org
# Web    : http://www.sciviews.org
# License: GPL 2 or above at your convenience (see: http://www.gnu.org)
#
# Several tests are adapted from the Splus Benchmark Test V. 2
# by Stephan Steinhaus (stst at informatik.uni-frankfurt.de)
#
# Reference for Escoufier's equivalent vectors (test III.5):
# Escoufier Y., 1970. Echantillonnage dans une population de variables
# aleatoires réelles. Publ. Inst. Statis. Univ. Paris 19 Fasc 4, 1-47.
#
# Note that you need to install package 'Matrix' for this test.
# Use also R >= 1.2; otherwise you will have to change the -nsize and -vsize
# parameters at startup (R versions prior to 1.2 have a fixed amount of
# memory which is too low by default for some tests).
#
# Type source("c:/<dir>/R.R") to start the test.

runs <- 3			# Number of times the tests are executed
times <- rep(0, 15); dim(times) <- c(5,3)
require(Matrix)
options(object.size=100000000)

cat("   R Benchmark\n")
cat("   ===========\n")
cat(c("Number of times each test is run__________________________: ", runs))
cat("\n\n")


cat("   I. Matrix calculation\n")
cat("   ---------------------\n")
flush.console()

# (1)
cumulate <- 0; a <- 0; b <- 0
for (i in 1:runs) {
  timing <- system.time({
    a <- abs(matrix(rnorm(1200*1200, sd=0.1), ncol=1200, nrow=1200));
    b <- t(a);
    dim(b) <- c(600, 2400);
    a <- t(b)
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[1, 1] <- timing
cat(c("Creation, transp., deformation of a 1200x1200 matrix (sec): ", timing, "\n"))
remove("a", "b")
flush.console()

# (2)
cumulate <- 0; b <- 0
for (i in 1:runs) {
  a <- abs(matrix(rnorm(1250*1250, sd=0.5), ncol=1250, nrow=1250));
  timing <- system.time({
    b <- a^1000
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[2, 1] <- timing
cat(c("1250x1250 normal distributed random matrix ^1000____ (sec): ", timing, "\n"))
remove("a", "b")
flush.console()

# (3)
cumulate <- 0; b <- 0
for (i in 1:runs) {
  a <- rnorm(1100000)
  timing <- system.time({
    b <- sort(a)
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[3, 1] <- timing
cat(c("Sorting of 1,100,000 random values__________________ (sec): ", timing, "\n"))
remove("a", "b")
flush.console()

# (4)
cumulate <- 0; b <- 0
for (i in 1:runs) {
  a <- rnorm(550*550); dim(a) <- c(550, 550)
  timing <- system.time({
    b <- crossprod(a, a)		# equivalent to: b <- t(a) %*% a
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[4, 1] <- timing
cat(c("550x550 cross-product matrix (b = a' * a)___________ (sec): ", timing, "\n"))
remove("a", "b")
flush.console()

# (5)
cumulate <- 0; c <- 0; qra <- 0
for (i in 1:runs) {
  a <- rnorm(700*700); dim(a) <- c(700,700)
  b <- 1:700
  timing <- system.time({
    qra <- qr(a, tol = 1e-10);
    c <- qr.coef(qra, b)
    # Rem: a little faster than c <- lsfit(a, b, inter=F)$coefficients
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[5, 1] <- timing
cat(c("Linear regression over a 700x700 matrix (c = a \\ b') (sec): ", timing, "\n"))
remove("a", "b", "c", "qra")
flush.console()

times[ , 1] <- sort(times[ , 1])
cat("                      --------------------------------------------\n")
cat(c("                      Trimmed mean (2 extremes eliminated): ", mean(times[2:4, 1]), "\n\n"))

cat("   II. Matrix functions\n")
cat("   --------------------\n")
flush.console()

# (1)
cumulate <- 0; b <- 0
for (i in 1:runs) {
  a <- rnorm(900000)
  timing <- system.time({
    b <- fft(a)
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[1, 2] <- timing
cat(c("FFT over 900,000 random values______________________ (sec): ", timing, "\n"))
remove("a", "b")
flush.console()

# (2)
cumulate <- 0; b <- 0
for (i in 1:runs) {
  a <- rnorm(220*220); dim(a) <- c(220, 220)
  Matrix.class(a)
  timing <- system.time({
    b <- eigen.Matrix(a, vectors = F)$Value
    # Rem: b <- eigen.default(a, vectors = F)$Value is slower!
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[2, 2] <- timing
cat(c("Eigenvalues of a 220x220 random matrix______________ (sec): ", timing, "\n"))
remove("a", "b")
flush.console()

# (3)
cumulate <- 0; b <- 0
for (i in 1:runs) {
  a <- rnorm(750*750); dim(a) <- c(750, 750)
  Matrix.class(a)
  timing <- system.time({
    b <- det.Matrix(a, logarithm=F)
    # Rem: b <- det.default(a) is slower!
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[3, 2] <- timing
cat(c("Determinant of a 750x750 random matrix______________ (sec): ", timing, "\n"))
remove("a", "b")
flush.console()

# (4)
cumulate <- 0; b <- 0
for (i in 1:runs) {
  a <- rnorm(1000*1000); dim(a) <- c(1000, 1000)
  a <- crossprod(a, a)
  timing <- system.time({
    b <- chol(a)
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[4, 2] <- timing
cat(c("Cholesky decomposition of a 1000x1000 matrix________ (sec): ", timing, "\n"))
remove("a", "b")
flush.console()

# (5)
cumulate <- 0; b <- 0
for (i in 1:runs) {
  a <- rnorm(500*500); dim(a) <- c(500, 500)
  timing <- system.time({
    b <- qr.solve(a)
    # Rem: a little faster than b <- solve(a)
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[5, 2] <- timing
cat(c("Inverse of a 500x500 random matrix__________________ (sec): ", timing, "\n"))
remove("a", "b")
flush.console()

times[ , 2] <- sort(times[ , 2])
cat("                      --------------------------------------------\n")
cat(c("                      Trimmed mean (2 extremes eliminated): ", mean(times[2:4, 2]), "\n\n"))

cat("   III. Programming\n")
cat("   ----------------\n")
flush.console()

# (1)
cumulate <- 0; a <- 0; b <- 0; phi <- 1.6180339887498949
for (i in 1:runs) {
  a <- floor(runif(225000, max=1000))
  timing <- system.time({
    b <- (phi^a - (-phi)^(-a))/sqrt(5)
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[1, 3] <- timing
cat(c("225,000 Fibonacci numbers calculation (vector calc)_ (sec): ", timing, "\n"))
remove("a", "b", "phi")
flush.console()

# (2)
cumulate <- 0; a <- 1500; b <- 0
for (i in 1:runs) {
  timing <- system.time({
    b <- rep(1:a, a); dim(b) <- c(a, a);
    b <- 1 / (t(b) + 0:(a-1))
    # Rem: this is twice as fast as the following code proposed by R programmers
    # a <- 1:a; b <- 1 / outer(a - 1, a, "+")
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[2, 3] <- timing
cat(c("Creation of a 1500x1500 Hilbert matrix (matrix calc) (sec): ", timing, "\n"))
remove("a", "b")
flush.console()

# (3)
cumulate <- 0; c <- 0
gcd2 <- function(x, y) {if (sum(y > 1.0E-4) == 0) x else {y[y == 0] <- x[y == 0]; gcd2(y, x %% y)}}
for (i in 1:runs) {
  a <- ceiling(runif(35000, max=1000))
  b <- ceiling(runif(35000, max=1000))
  timing <- system.time({
    c <- gcd2(a, b)                            # gcd2 is a recursive function
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[3, 3] <- timing
cat(c("Greatest common divisors of 35,000 pairs (recursion) (sec): ", timing, "\n"))
remove("a", "b", "c", "gcd2")
flush.console()

# (4)
cumulate <- 0; b <- 0
for (i in 1:runs) {
  b <- rep(0, 220*220); dim(b) <- c(220, 220)
  timing <- system.time({
  	# Rem: there are faster ways to do this
  	# but here we want to time loops (220*220 'for' loops)!
    for (j in 1:220) {
      for (k in 1:220) {
        b[k,j] <- abs(j - k) + 1
      }
    }
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[4, 3] <- timing
cat(c("Creation of a 220x220 Toeplitz matrix (loops)_______ (sec): ", timing, "\n"))
remove("b", "j", "k")
flush.console()

# (5)
cumulate <- 0; p <- 0; vt <- 0; vr <- 0; vrt <- 0; rvt <- 0; RV <- 0; j <- 0; k <- 0;
x2 <- 0; R <- 0; Rxx <- 0; Ryy <- 0; Rxy <- 0; Ryx <- 0; Rvmax <- 0
# Calculate the trace of a matrix (sum of its diagonal elements)
Trace <- function(y) {sum(c(y)[1 + 0:(min(dim(y)) - 1) * (dim(y)[1] + 1)], na.rm=FALSE)}
for (i in 1:runs) {
  x <- abs(rnorm(22*22)); dim(x) <- c(22, 22)
  timing <- system.time({
    # Calculation of Escoufier's equivalent vectors
    p <- ncol(x)
    vt <- 1:p                                  # Variables to test
    vr <- NULL                                 # Result: ordered variables
    RV <- 1:p                                  # Result: correlations
    vrt <- NULL
    for (j in 1:p) {                           # loop on the variable number
      Rvmax <- 0
      for (k in 1:(p-j+1)) {                   # loop on the variables
        x2 <- cbind(x, x[,vr], x[,vt[k]])
        R <- cor(x2)                           # Correlations table
        Ryy <- R[1:p, 1:p]
        Rxx <- R[(p+1):(p+j), (p+1):(p+j)]
        Rxy <- R[(p+1):(p+j), 1:p]
        Ryx <- t(Rxy)
        rvt <- Trace(Ryx %*% Rxy) / sqrt(Trace(Ryy %*% Ryy) * Trace(Rxx %*% Rxx)) # RV calculation
        if (rvt > Rvmax) {
          Rvmax <- rvt                         # test of RV
          vrt <- vt[k]                         # temporarily held variable
        }
      }
      vr[j] <- vrt                             # Result: variable
      RV[j] <- Rvmax                           # Result: correlation
      vt <- vt[vt != vr[j]]                    # reidentify variables to test
    }
  })[3]
  cumulate <- cumulate + timing
}
timing <- cumulate/runs
times[5, 3] <- timing
cat(c("Escoufier's method on a 22x22 matrix (mixed)________ (sec): ", timing, "\n"))
remove("x", "p", "vt", "vr", "vrt", "rvt", "RV", "j", "k")
remove("x2", "R", "Rxx", "Ryy", "Rxy", "Ryx", "Rvmax", "Trace")
flush.console()

times[ , 3] <- sort(times[ , 3])
cat("                      --------------------------------------------\n")
cat(c("                      Trimmed mean (2 extremes eliminated): ", mean(times[2:4, 3]), "\n\n\n"))

cat(c("Total time for all 15 tests_________________________ (sec): ", sum(times), "\n"))
cat(c("Overall mean (sum of I, II and III trimmed means/3)_ (sec): ", mean(times[2:4, ]), "\n"))
remove("cumulate", "timing", "times", "runs", "i")
cat("                      --- End of test ---\n\n")


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
