[R] "unsparse" a vector

Sam Steingold sds at gnu.org
Wed Feb 8 23:01:01 CET 2012


The loop is too slow.
It appears that sparseMatrix (from the Matrix package) does what I want:

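ll <- sapply(l,length)              # number of <column><value> pairs per string
i <- rep(seq_along(l), ll)          # row index of each pair
vv <- unlist(l)
j1 <- as.factor(substring(vv,1,1))  # column name of each pair
t <- table(j1)
# missing step: j <- positions of the elements of j1 in names(t)
# dimnames must be a list of length 2: (row names, column names)
sparseMatrix(i,j,x=as.numeric(substring(vv,2,2)), dimnames = list(NULL, names(t)))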

so, the question is, how do I produce a vector of positions?

i.e., from vectors
[1] "A" "B" "A" "C" "A" "B"
and
[1] "A" "B" "C"
I need to produce a vector
[1] 1 2 1 3 1 2
of positions of the elements of the first vector in the second vector.
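
For reference, this kind of lookup appears to be what base R's match() is for: it returns, for each element of its first argument, the position of its first occurrence in the second. A minimal sketch using the two vectors above (the names x, tab and pos are made up for illustration):

```r
# The two vectors from the example above
x   <- c("A", "B", "A", "C", "A", "B")
tab <- c("A", "B", "C")

# match() gives the position in `tab` of each element of `x`
# (NA for elements that are not found)
pos <- match(x, tab)

# Equivalent alternative: the integer codes of a factor with fixed levels
pos2 <- as.integer(factor(x, levels = tab))

print(pos)
```

In the sparseMatrix code above this would presumably read j <- match(j1, names(t)), or simply j <- as.integer(j1), since j1 is already a factor whose levels are names(t).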

PS. Of course, I would much prefer a dataframe to a matrix...
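
One possible way to get a data frame directly, without the Matrix package, is to fill a dense matrix through a two-column (row, column) index and convert it. This is only a sketch on the 1-character toy data from the original question, assuming the full dense result fits in memory; the names nm, cols, m and d are invented for the example:

```r
v <- c("A1B2", "A3C4", "B5", "C6A7B8")

# Split every string into its 2-character <column><value> pairs
l <- strsplit(gsub("(.{2})", "\\1,", v), ",")

i    <- rep(seq_along(l), sapply(l, length))  # row of each pair
vv   <- unlist(l)
nm   <- substring(vv, 1, 1)                   # column name of each pair
cols <- sort(unique(nm))                      # all column names
j    <- match(nm, cols)                       # column position of each pair

# Dense matrix of zeros, filled through a 2-column index matrix
m <- matrix(0, nrow = length(v), ncol = length(cols),
            dimnames = list(NULL, cols))
m[cbind(i, j)] <- as.numeric(substring(vv, 2, 2))

d <- as.data.frame(m)
print(d)
```

The assignment m[cbind(i, j)] <- ... is vectorized, so no explicit loop over the pairs is needed. For very wide data with many distinct columns, the sparse representation may still be the better choice.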

> * Sam Steingold <fqf at tah.bet> [2012-02-08 15:56:12 -0500]:
>
> To be clear, I can do that with nested for loops:
>
> v <- c("A1B2","A3C4","B5","C6A7B8")
> l <- strsplit(gsub("(.{2})","\\1,",v),",")
> d <- data.frame(A=vector(length=4,mode="integer"),
>                 B=vector(length=4,mode="integer"),
>                 C=vector(length=4,mode="integer"))
>
> for (i in 1:length(l)) {
>   l1 <- l[[i]]
>   for (j in 1:length(l1)) {
>     d[[substring(l1[j],1,1)]][i] <- as.numeric(substring(l1[j],2,2))
>   }
> }
>
>
> but I am afraid that handling 1,000,000 (=length(unlist(l))) strings in
> a loop will kill me.
>
>
>> * Sam Steingold <fqf at tah.bet> [2012-02-08 15:34:38 -0500]:
>>
>> Suppose I have a vector of strings:
>> c("A1B2","A3C4","B5","C6A7B8")
>> [1] "A1B2"   "A3C4"   "B5"     "C6A7B8"
>> where each string is a sequence of <column><value> pairs
>> (fixed width, in this example both value and name are 1 character, in
>> reality the column name is 6 chars and value is 2 digits).
>> I need to convert it to a data frame:
>> data.frame(A=c(1,3,0,7),B=c(2,0,5,8),C=c(0,4,0,6))
>>   A B C
>> 1 1 2 0
>> 2 3 0 4
>> 3 0 5 0
>> 4 7 8 6
>>
>> how do I do that?
>> thanks.

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000
http://honestreporting.com http://truepeace.org http://openvotingconsortium.org
http://iris.org.il http://jihadwatch.org http://camera.org
Failure is not an option. It comes bundled with your Microsoft product.


