[R] levels of comma separated data
analyst41 at hotmail.com
Fri May 25 13:23:02 CEST 2012
On May 25, 4:46 am, Stefan <ste... at inizio.se> wrote:
> analyst41 <at> hotmail.com <analyst41 <at> hotmail.com> writes:
>
>
>
> > I have a data set that has some comma separated strings in each row.
> > I'd like to create a vector consisting of all distinct strings that
> > occur. The number of strings in each row may vary.
>
> > Thanks for any help.
>
> #
> #
> # Some data:
> d <- data.frame(id = 1:5,
>                 text = c('one,two',
>                          'two,three,three,four',
>                          'one,three,three,five',
>                          'five,five,five,five',
>                          'one,two,three'),
>                 stringsAsFactors = FALSE
> )
> #
> #
> # A function. I'm not a black belt at this, so there
> # is probably a more efficient way of writing this.
> fcn <- function(x){
>   a <- strsplit(x, ',') # Split the string by comma
>   unique(a[[1]]) # Uniquify the vector
> }
>
> #
> #
> # Use the function with sapply.
> sapply(d[,2], fcn)
>
Thanks - but this solves a slightly different problem: it outputs the
unique values in each row. I want the unique values across the
whole data frame.
In this case the output should be the single vector
c("one","two","three","four","five").
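
One way to get there might be to split every row, flatten the pieces into
one character vector, and only then take the unique values. This is a
minimal sketch, assuming the strings live in the `text` column of the
example data frame `d` from the quoted message; the name `all_values` is
made up for illustration.

```r
# Example data, copied from the quoted message
d <- data.frame(id = 1:5,
                text = c('one,two',
                         'two,three,three,four',
                         'one,three,three,five',
                         'five,five,five,five',
                         'one,two,three'),
                stringsAsFactors = FALSE)

# strsplit() returns a list with one character vector per row
pieces <- strsplit(d$text, ',', fixed = TRUE)

# unlist() flattens that list into a single character vector;
# unique() then keeps the first occurrence of each string
all_values <- unique(unlist(pieces))

print(all_values)
```

unique() keeps values in order of first appearance. If the real data has
spaces after the commas, wrapping unlist(pieces) in trimws() before calling
unique() would probably be needed.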