# [R] "unsparse" a vector

Petr Savicky savicky at cs.cas.cz
Thu Feb 9 12:35:36 CET 2012

```
On Wed, Feb 08, 2012 at 05:01:01PM -0500, Sam Steingold wrote:
> loop is too slow.
> it appears that sparseMatrix does what I want:
>
> ll <- lapply(l,length)
> i <- rep(1:4, ll)
> vv <- unlist(l)
> j1 <- as.factor(substring(vv,1,1))
> t <- table(j1)
> j <- position of elements of j1 in names(t)
> sparseMatrix(i,j,x=as.numeric(substring(vv,2,2)), dimnames = names(t))
>
> so, the question is, how do I produce a vector of positions?
>
> i.e., from vectors
> [1] "A" "B" "A" "C" "A" "B"
> and
> [1] "A" "B" "C"
> I need to produce a vector
> [1] 1 2 1 3 1 2
> of positions of the elements of the first vector in the second vector.

This particular lookup can be done with match():

match(c("A", "B", "A", "C", "A", "B"), c("A", "B", "C"))
[1] 1 2 1 3 1 2
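
# A small sketch (not from the original thread) applying the same call to
# vectors shaped like j1 and names(t) in the question. The name `lev` is
# illustrative only.
j1  <- c("A", "B", "A", "C", "A", "B")
lev <- sort(unique(j1))   # same values as names(table(j1))
j   <- match(j1, lev)     # positions of the elements of j1 in lev
# If j1 is a factor with default levels, as.integer(j1) gives the same codes.
# Values absent from the second vector come back from match() as NA.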

> PS. Of course, I would much prefer a dataframe to a matrix...

As the final result or also as an intermediate result?

Changing individual elements one row at a time is much slower
in a data frame than in a matrix.

Compare:

n <- 10000
mat <- matrix(1:(2*n), nrow=n)
df <- as.data.frame(mat)

system.time( for (i in 1:n) { mat[i, 1] <- 0 } )

user  system elapsed
0.021   0.000   0.021

system.time( for (i in 1:n) { df[i, 1] <- 0 } )

user  system elapsed
4.997   0.069   5.084

This slowdown is specific to row-wise assignment into a data frame.
Extracting a whole column, modifying it as a plain vector, and
assigning it back is as fast as the matrix case.

system.time( {
col1 <- df[[1]]
for (i in 1:n) { col1[i] <- 0 }
df[[1]] <- col1
} )

user  system elapsed
0.019   0.000   0.019

Hope this helps.

Petr Savicky.

```
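
Putting the pieces of the thread together, the following is a minimal sketch of the whole construction, not code posted by either author. It assumes the input list `l` holds strings made of one letter (the column key) followed by one digit (the value), which is what the `substring()` calls in the question suggest. The sample data and the names `key`, `cols`, and `res` are hypothetical.

```r
library(Matrix)  # recommended package shipped with R; provides sparseMatrix()

# Hypothetical input: one character vector per row of the result.
l <- list(c("A1", "B2"), c("A3"), c("C4", "A5"), c("B6"))

vv   <- unlist(l)
i    <- rep(seq_along(l), lengths(l))    # row index of each entry
key  <- substring(vv, 1, 1)              # column key, e.g. "A"
cols <- sort(unique(key))                # distinct keys become the columns
j    <- match(key, cols)                 # column index, via match()
x    <- as.numeric(substring(vv, 2, 2))  # numeric value of each entry

m <- sparseMatrix(i = i, j = j, x = x,
                  dims = c(length(l), length(cols)),
                  dimnames = list(NULL, cols))

# Dense data frame as the final result; only sensible when the
# matrix is small enough to densify.
res <- as.data.frame(as.matrix(m))
```

Note that `dimnames` is passed as a list of length two (row names, column names) rather than a bare character vector. Building the sparse matrix first and converting to a data frame only at the end keeps the construction vectorized, in line with the advice above about avoiding row-by-row data frame updates.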