[R] Taking diff of character vectors
Barry Rowlingson
b.rowlingson at lancaster.ac.uk
Fri Mar 13 11:42:13 CET 2009
2009/3/13 Sergey Goriatchev <sergeyg at gmail.com>:
> Say I have
> nm1 <- c(rep(1,10), rep(0,10))
> then I can do:
> diff(nm1)
> to see where I have a shift in value
>
> but what if I have
> nm2 <- c(rep("SPZ8", 10), rep("SPX9", 10))
>
> how can I produce the same output as diff(nm1) does, that is zeros
> everywhere except for the one place where SPZ8 changes to SPX9 (there
> should be 1 there)?
>
> What if I have a matrix of characters like that:
> nm3 <- c(rep("GLF9", 4), rep("GLF10", 16))
> matr <- cbind(nm2, nm3)
>
> How can I efficiently create two more columns that contain zeros
> everywhere except for the places where there is a shift in character values?
You could convert to "factor" and then to numeric:
> nm2 <- c(rep("SPZ8", 10), rep("SPX9", 10))
> diff(as.numeric(as.factor(nm2)))
[1] 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0
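Note that the -1 here comes from the alphabetical ordering of the factor levels ("SPX9" sorts before "SPZ8"), and with more than two distinct values the differences could be any integer. If a plain 0/1 indicator is what you are after, one possible sketch is to compare each element with its predecessor directly instead of going through factors. The helper name change_flag is made up for illustration; it is not part of base R.

```r
# Hypothetical helper (not in base R): 1 where x[i] differs from x[i-1], else 0.
# The first element has no predecessor, so it is flagged 0 by convention here.
change_flag <- function(x) {
  n <- length(x)
  if (n < 2) return(rep(0L, n))
  c(0L, as.integer(x[-1] != x[-n]))
}

nm2 <- c(rep("SPZ8", 10), rep("SPX9", 10))
flags <- change_flag(nm2)
which(flags == 1)   # position(s) where the value changes
```

Unlike diff(), this returns a vector the same length as the input, which is convenient if it is going to sit alongside the original column.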
It might be that your character values should really be factors
anyway - check out some R docs on what factors are and what they can
do for you. That also means you might want your matrix to be a data
frame, since a matrix can't hold both your character values and numeric
0/1 values. Data frames can! If you try it with a matrix, everything is
coerced to character and you end up with character zeroes and minus ones:
> matr <- cbind(nm2, nm3)
> matr = cbind(matr,c(0,diff(as.numeric(as.factor(matr[,1])))))
> matr[8:12,]
     nm2    nm3
[1,] "SPZ8" "GLF10" "0"
[2,] "SPZ8" "GLF10" "0"
[3,] "SPZ8" "GLF10" "0"
[4,] "SPX9" "GLF10" "-1"
[5,] "SPX9" "GLF10" "0"
Easier with factors in data frames:
> df=data.frame(nm2=as.factor(nm2),nm3=as.factor(nm3))
> df$dnm2 = c(0,diff(as.numeric(df$nm2)))
> df[8:12,]
    nm2   nm3 dnm2
8  SPZ8 GLF10    0
9  SPZ8 GLF10    0
10 SPZ8 GLF10    0
11 SPX9 GLF10   -1
12 SPX9 GLF10    0
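To add an indicator column for every column in one go, something along these lines might do. It is only a sketch: the "d" prefix for the new column names simply follows the dnm2 naming above, and a direct comparison of neighbouring rows is swapped in for the factor/diff approach so that the result is always 0 or 1.

```r
nm2 <- c(rep("SPZ8", 10), rep("SPX9", 10))
nm3 <- c(rep("GLF9", 4), rep("GLF10", 16))
df <- data.frame(nm2 = nm2, nm3 = nm3, stringsAsFactors = FALSE)

# For each column: 0 in the first row, then 1 wherever the value differs
# from the row above. as.character() makes this work for factor columns too.
flags <- lapply(df, function(col) {
  col <- as.character(col)
  c(0L, as.integer(col[-1] != col[-length(col)]))
})
names(flags) <- paste0("d", names(df))   # dnm2, dnm3 (naming is an assumption)

df <- cbind(df, as.data.frame(flags))
df[c(4:5, 10:11), ]   # rows around the two change points
```

With the example data, dnm2 should be 1 only in row 11 and dnm3 only in row 5.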
Barry