[R] help with colsplit (reshape)

hadley wickham h.wickham at gmail.com
Fri Jun 13 20:09:28 CEST 2008


> M.Data2 <- data.frame(M.Data, colsplit(M.Data$variable, split = "\\.", names
> = c("treatment", "time")))
>
> which gave:
>
> head(M.Data2)
>  pid variable value treatment  time
> 1   1    predA    -1     predA predA
> 2   2    predA    -2     predA predA
> 3   3    predA    -1     predA predA
> 4   4    predA    -2     predA predA
> 5   5    predA    -1     predA predA
> 6   6    predA    -2     predA predA
>
> Closer but no cigar.

Have a look at the whole thing - it's getting it right most of the
time.  Going back to the original variable names, I see that "PredA"
does not have a time associated with it.  What do you expect the time
to be?

> I would be grateful if someone will tell me (a) how to reshape the data as
> described above using the reshape package, (b) what difference between split
> = "." and split = "\\." is,

The split argument is a regular expression, and in a regular
expression "." matches any single character, so split = "." treats
every character as a separator.  "\\." escapes the full stop, so it
matches only literal full stops.
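
A small sketch of the difference, calling strsplit directly.  The
names "predB.1" and "predB.2" are made up for illustration, since the
original variable names are not shown here:

```r
x <- c("predB.1", "predB.2")  # hypothetical variable names

# Unescaped: "." is a regex wildcard, so every character is a separator
any_char <- strsplit(x, ".")[[1]]

# Escaped: only a literal full stop is a separator
escaped <- strsplit(x, "\\.")[[1]]

# fixed = TRUE treats the pattern as a literal string, no escaping needed
literal <- strsplit(x, ".", fixed = TRUE)[[1]]

print(any_char)
print(escaped)
print(literal)
```

The unescaped version should leave only empty strings, while the other
two should give "predB" and "1".  Note that fixed = TRUE is an
argument of strsplit itself; going by the colsplit.character source,
split is passed straight through as a regular expression, so the
escaped form is the one to use there.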

> and (c) if more information about the colsplit
> command is available anywhere.

Probably the best way is just to look at the code (it's pretty simple):

> colsplit.character
function (x, split = "", names)
{
    vars <- as.data.frame(do.call(rbind, strsplit(x, split)))
    names(vars) <- names
    as.data.frame(lapply(vars, function(x) type.convert(as.character(x))))
}

If strsplit doesn't do what you want, you might need to write your own
function along those lines.
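
As a rough sketch of what such a function could look like: the
repeated "predA" in the time column is probably rbind recycling the
single piece that strsplit returns for a name with no full stop.  One
option is to pad each set of pieces with NA before binding.  The
function name colsplit_padded and the example names are hypothetical,
not part of reshape:

```r
# rbind recycles the shorter vector, which would explain "predA predA"
pieces <- strsplit(c("predA", "predB.1"), "\\.")
recycled <- do.call(rbind, pieces)
print(recycled)

# Hypothetical variant of colsplit.character that pads with NA instead
colsplit_padded <- function(x, split, names) {
  pieces <- strsplit(as.character(x), split)
  n <- length(names)
  # Setting the length pads short vectors with NA (and truncates long ones)
  padded <- lapply(pieces, function(p) { length(p) <- n; p })
  vars <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
  names(vars) <- names
  # Convert each column to its natural type, keeping strings as character
  as.data.frame(
    lapply(vars, function(col) type.convert(col, as.is = TRUE)),
    stringsAsFactors = FALSE
  )
}

out <- colsplit_padded(c("predA", "predB.1", "predB.2"),
                       split = "\\.", names = c("treatment", "time"))
print(out)
```

With this, a name like "predA" should get NA for time rather than a
copy of the treatment.  Whether NA is the right answer depends on what
you expect the time to be for that variable.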

Hadley

-- 
http://had.co.nz/


