[R] vectorisation suggestion

Wiener, Matthew matthew_wiener at merck.com
Mon Jun 20 22:38:10 CEST 2005


Federico -

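"match" will give you the index of the first occurrence of each element of
its first argument in its second argument.  So  match(vector.1, vector.2)
tells you where each element of vector.1 first appears in vector.2.  If you
index vector.2 with that result and use "table" on it, you'll see how many
times each element of vector.2 appears in vector.1.  (Note that here
vector.1 is the long vector and vector.2 holds the terms to count, which is
the reverse of your naming.)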
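
As a small illustration of what "match" returns, here is a sketch on toy
vectors (x and lookup are made-up names and values, not taken from your
data):

```r
# Hypothetical toy vectors, chosen only to illustrate match().
x      <- c("b", "a", "b", "z")
lookup <- c("a", "b", "c")

# Position in lookup of each element of x; NA where there is no match.
pos <- match(x, lookup)
pos
# 2 1 2 NA

# Indexing lookup by pos recovers the matched values (NA stays NA).
lookup[pos]
```

Tabulating lookup[pos] then counts how often each matched value occurs in x.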

Something like:

first.occur <- match(vector.1, vector.2)
table(factor(vector.2[first.occur], levels = sort(unique(vector.2))))

(changing it into a factor means you won't lose values of vector.2 that
never appear in vector.1.)

An example:

> vec1 <- sample(1:10, 500, replace = TRUE)
> table(vec1)
vec1
 1  2  3  4  5  6  7  8  9 10 
61 37 43 47 49 59 51 48 53 52 
> vec2 <- 0:11
> vec3 <- match(vec1, vec2)
> table(factor(vec2[vec3], levels = sort(unique(vec2))))
 0  1  2  3  4  5  6  7  8  9 10 11 
 0 61 37 43 47 49 59 51 48 53 52  0 
> 

This also works if one of the members of vec1 is not in vec2 -- "match"
returns NA for that member, and "table" drops NAs by default, so it simply
gets ignored. (As you can see if you add, say, a "20" at the end of vec1.)
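
To make the "20" case concrete, here is a minimal sketch that wraps the
recipe in a function and runs it on a small fixed vector instead of a random
sample. The name count_occurrences and its argument names are my own
invention for illustration, not anything in base R:

```r
# count_occurrences() is a hypothetical helper, not part of base R.
#   values: the long vector whose elements get counted (vector.1 above)
#   terms:  the terms to count (vector.2 above)
count_occurrences <- function(values, terms) {
  first.occur <- match(values, terms)   # NA for values not found in terms
  # The factor keeps a level for every term, so terms that never occur
  # get a count of 0; table() drops the NAs by default.
  table(factor(terms[first.occur], levels = sort(unique(terms))))
}

vec1 <- c(1, 2, 2, 3, 3, 3, 20)   # 20 does not appear in vec2
vec2 <- 0:4
counts <- count_occurrences(vec1, vec2)
counts
# counts for 0, 1, 2, 3, 4 are 0, 1, 2, 3, 0; the 20 is ignored
```

With your naming you would call it as count_occurrences(vector2, vector1).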

Hope this helps,

Matt Wiener

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Federico Calboli
Sent: Monday, June 20, 2005 4:16 PM
To: r-help
Subject: [R] vectorisation suggestion


Hi All,

I am counting the number of occurrences of the terms listed in one  
vector in another vector.

My code runs:

for( i in 1:length(vector3)){
   vector3[i]  = sum(1*is.element(vector2,  vector1[i]))
}

where

vector1 = vector containing the terms whose occurrences I want to count
vector2 = made up of a number of repetitions of all the elements of  
vector1
vector3 = a vector of NAs that is meant to get the result of the  
counting

My problem is that vector1 has about 60000 terms, and vector2 about
620000... can anyone suggest faster code than what I wrote?

Cheers,

Federico Calboli


--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html



