[R] Pointer to covariates?
Göran Broström
gb at stat.umu.se
Thu Feb 21 09:37:09 CET 2002
On Wed, 20 Feb 2002, Gabor Grothendieck wrote:
> In the first line, use the dist function, found in library mva,
> to get the distance between each pair of rows. From this
> calculate an incidence matrix for which element i,j is true if
> row i in dat equals row j in dat (and false elsewhere).
>
> In the second line, for each row calculate the indices of
> the matching rows and take the minimum of those as the key.
>
> incid <- as.matrix(dist(dat[, -1], method = "max")) == 0
> keys <- apply(incid, 1, function(r) min(which(r)))
Thank you very much! This is very fast, much faster than my attempts
so far, but it has two drawbacks:
1. It gives pointers to first occurrences in the _original_ data frame,
not the 'unique' version.
2. The first step results in a _huge_ matrix 'incid', too huge for my
applications.
However, this is a promising first attempt, and I will try to refine
the idea. Again, thanks!
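
One possible refinement, offered only as a sketch: collapse each covariate row into a single string and look the strings up with match(). This avoids the n-by-n matrix entirely, and the resulting keys index the rows of the 'unique' version rather than the original data frame. The names 'x' and 'id' and the "\r" separator are my own choices for illustration, and the sketch assumes that pasting the covariates to strings does not merge values that are actually different (for example, numbers that differ only beyond about 15 significant digits).

```r
# Toy data from the example in the original question
dat <- data.frame(y = c(1, 2, 3), x1 = c(1, 0, 1), x2 = c(0, 1, 0))

x     <- dat[, -1, drop = FALSE]   # covariates only
covar <- unique(x)                 # one row per distinct combination
y     <- dat[, 1]

# Collapse each covariate row into one string. "\r" is an arbitrary
# separator, assumed not to occur in the data.
id <- do.call(paste, c(x, sep = "\r"))

# Position of each row's string among the distinct strings. unique() keeps
# first-occurrence order, which is also the row order of 'covar'.
keys <- match(id, unique(id))

# Sanity check: looking up 'covar' by 'keys' should reproduce the covariates.
stopifnot(all(as.matrix(covar[keys, ]) == as.matrix(x)))
```

Memory use here grows with the number of rows, not its square, since only one string per row is stored.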
Göran
>
> --- Göran Broström <gb at stat.umu.se> wrote:
> >I have a dataframe 'dat' with one response and some covariates. Many
> >observations (rows), but only a few unique combinations of
> >the covariates. Let's say that the response is in column 1, and
> >the covariates in columns 2:k.
> >
> >I want to do
> >
> >> covar <- unique.data.frame(dat[, 2:k])
> >> y <- dat[, 1]
> >> keys <- ??????
> >
> >where 'keys' should be a vector of length length(y) and contain the
> >row numbers in 'covar', where the response will find its covariates.
> >
> >Example:
> >
> >> dat
> >  y x1 x2
> >1 1  1  0
> >2 2  0  1
> >3 3  1  0
> >
> >> unique.data.frame(dat[, 2:3])
> >  x1 x2
> >1  1  0
> >2  0  1
> >
> >> keys
> >1 1
> >2 2
> >3 1
> >
> >But how do I get 'keys'?
> >--
> > Göran Broström tel: +46 90 786 5223
> > professor fax: +46 90 786 6614
> > Department of Statistics http://www.stat.umu.se/egna/gb/
> > Umeå University
> > SE-90187 Umeå, Sweden e-mail: gb at stat.umu.se
> >
> >-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> >r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> >Send "info", "help", or "[un]subscribe"
> >(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
> >_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
--
Göran Broström tel: +46 90 786 5223
professor fax: +46 90 786 6614
Department of Statistics http://www.stat.umu.se/egna/gb/
Umeå University
SE-90187 Umeå, Sweden e-mail: gb at stat.umu.se