[R] Automating the job?

ben@zoo.ufl.edu ben at zoo.ufl.edu
Thu Mar 1 15:06:05 CET 2001


  [Forwarding this to the help list so that it will be documented there]

On Wed, 28 Feb 2001, Youngser Park wrote:

> on 2/28/01 5:44 PM, ben at zoo.ufl.edu at ben at zoo.ufl.edu wrote:
>
> > library(mva)
> > maxit <- 20
> > nclust <- 5
> > nvars <- 5
> > npts <- 100
> > ndata <- 300
> > ## cluster identity of each point, center of each cluster, size of each
> > ## cluster
> > result.len <- npts+nclust*nvars+nclust
> > results <- matrix(ncol=result.len,nrow=ndata)
> > for (i in 1:ndata) {
> > mydata <- read.table(paste("data",i,sep=""))
> > results[i,] <- unlist(kmeans(mydata,nclust,maxit))
> > }


>
> Ben,
>
> I just tried it, and got this message;
>
> > source("run2")
> Error in "[<-"(*tmp*, i, , value = unlist(kmeans(mydata, nclust, maxit))) :
>     number of items to replace is not a multiple of replacement length
>
> I have no idea what this means.

  It probably means that the matrix has the wrong number of columns.
When the replacement value is shorter than the target, R recycles
(repeats) it to fill the target, but only if the target length is a whole
multiple of the replacement length, e.g.:

z <- matrix(ncol=4,nrow=5)
z[1,] <- 1:2   ## repeats "1:2" twice to fill the 4-element row
z[1,] <- 1:3   ## gives an error because the row length (4) is not a
               ## whole multiple of the length of the assignment data (3).
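
  One way to avoid guessing the number of columns is to run a single fit
and measure the length of the unlisted result. This is only a sketch: the
rnorm() matrix is a made-up stand-in for one of your data files, and the
set of components kmeans() returns depends on the R version (in current R
it lives in the stats package, so library(mva) is not needed).

```r
set.seed(1)
npts <- 100; nvars <- 5; nclust <- 5
## simulated stand-in for read.table("data1") -- an assumption, not real data
mydata <- matrix(rnorm(npts * nvars), ncol = nvars)
one <- unlist(kmeans(mydata, nclust, iter.max = 20))
result.len <- length(one)          # measured rather than computed by hand
results <- matrix(NA_real_, nrow = 3, ncol = result.len)
results[1, ] <- one                # lengths match, so nothing is recycled
```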

  The values of nclust, nvars, npts, and ndata above were arbitrary; you
need to fill them in with your own values (nclust=# of clusters; nvars=# of
variables/dimensions in your data set; npts=# of points in your data set;
ndata=# of data files).  I was also assuming that all your data sets were
the same size; if not, the results vectors will have different lengths and
it would be better to put your results in a list rather than a matrix as
I have done above.
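
  A minimal sketch of the list approach, assuming data sets with different
numbers of points. The fake.data list is hypothetical and stands in for the
read.table() calls; each list element keeps the whole kmeans() result, so
the lengths do not have to agree.

```r
set.seed(2)
nclust <- 3
ndata <- 4
## hypothetical stand-ins for read.table(paste("data",i,sep=""))
fake.data <- lapply(c(30, 40, 50, 60),
                    function(n) matrix(rnorm(n * 2), ncol = 2))
results <- vector("list", ndata)   # pre-allocate one slot per data set
for (i in 1:ndata) {
  results[[i]] <- kmeans(fake.data[[i]], nclust, iter.max = 20)
}
## number of points clustered in each data set
npts.each <- sapply(results, function(r) length(r$cluster))
```

Individual pieces are then available as, e.g., results[[2]]$centers.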

  However, given that you want to save the results of every analysis to a
separate file, you don't need to worry about this.
  You're right that using paste() is the way to go to save the output to
separate files.  What you do will depend a bit on what format you want the
output to be in, but you could do something like:

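for (i in 1:ndata) {
   mydata <- read.table(paste("data",i,sep=""))
   cl <- unlist(kmeans(mydata,nclust,maxit))
   ## divert printed output to "out001", "out002", ...
   sink(paste("out",formatC(i,width=3,flag="0"),sep=""))
   print(cl)
   sink()   ## close this file before opening the next one
}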
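
  To check the file names before running the whole loop, something like
the following may help. It also shows capture.output() as one alternative
to sink(): it opens and closes the file in a single call. The scratch
directory and the simulated data are assumptions made so that the sketch
runs on its own.

```r
set.seed(3)
outdir <- tempdir()                      # assumption: write to a scratch directory
mydata <- matrix(rnorm(60), ncol = 2)    # simulated stand-in for one data file
cl <- unlist(kmeans(mydata, 3, iter.max = 20))
## zero-padded names: out001, out002, out003
fnames <- file.path(outdir,
                    paste("out", formatC(1:3, width = 3, flag = "0"), sep = ""))
basename(fnames)
## write the printed result to the first file
capture.output(print(cl), file = fnames[1])
```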

> Also, could you please tell me how to save this output into a file in a
> similar way, e.g. "out001", "out002", and etc.? I guess I can use "paste"
> function.
>

  Just one more silly question: are you really going to look at 300
separate output files?  Or are you going to read them into something else
for analysis?  Wouldn't it be better to save the results in one big
structure in R so you could run analyses comparing them?
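
  As a rough sketch of what that could look like: keep every fit in one
list, pull out the quantities you want to compare, and save the list to a
single file. The simulated data, the object names, and the file name
"kmeans-fits.RData" are all made up for illustration.

```r
set.seed(4)
nclust <- 3
## hypothetical stand-ins for the data files (40 points, 2 variables each)
fake.data <- lapply(1:5, function(i) matrix(rnorm(80), ncol = 2))
fits <- lapply(fake.data, kmeans, centers = nclust, iter.max = 20)
## one row per data set, cluster sizes sorted within each row
sizes <- t(sapply(fits, function(f) sort(f$size)))
## total within-cluster sum of squares for each data set
wss <- sapply(fits, function(f) sum(f$withinss))
## a single file holds every fit for later sessions
outfile <- file.path(tempdir(), "kmeans-fits.RData")
save(fits, file = outfile)
```

A later session can get everything back with load(outfile).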

  Ben



