[R] Split values in vector
Johannes Radinger
JRadinger at gmx.at
Thu Jan 19 16:07:53 CET 2012
Hi,
just to explain it a little further, here is a small sample
data frame (similar to the one I am working with).
var1 <-seq(1,5)
var2 <-c("A","B","C","D","E")
var3 <-c("00","01-1;02-3;04-1","01-2;02-1","01-0;04-12",NA)
x <- data.frame(var1,var2,var3)
The final data frame should look like the table below.
When the category is "00", the column for "00" should
be 1 and all others 0. Otherwise each column takes the count given
in the input, and a category that is not stated gets the value 0.
This probably sounds a little confusing, but hopefully
the example makes it easier to understand.
var1 var2 var3_00 var3_01 var3_02 var3_04
   1    A       1       0       0       0
   2    B       0       1       3       1
   3    C       0       2       1       0
   4    D       0       0       0      12
   5    E      NA      NA      NA      NA
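One possible way to get from var3 to the table above, as a minimal base R sketch checked only against this sample. It assumes var3 is held as character (hence stringsAsFactors = FALSE), that a bare category such as "00" counts as 1, and that an NA input gives NA in every category column. The names parts, cats, row_counts, wide and result are made up for this illustration.

```r
# Sample data from above; stringsAsFactors = FALSE keeps var3 as character
x <- data.frame(var1 = 1:5,
                var2 = c("A", "B", "C", "D", "E"),
                var3 = c("00", "01-1;02-3;04-1", "01-2;02-1", "01-0;04-12", NA),
                stringsAsFactors = FALSE)

# Split each row into its "category-count" pieces (an NA row stays NA)
parts <- strsplit(x$var3, ";", fixed = TRUE)

# All category codes that occur anywhere (the text before the "-")
pieces <- unlist(parts)
cats <- sort(unique(sub("-.*$", "", pieces[!is.na(pieces)])))

# Hypothetical helper: the counts for one row, one entry per category
row_counts <- function(p) {
  out <- setNames(rep(0, length(cats)), cats)
  if (length(p) == 1 && is.na(p)) {      # NA input: NA in every category
    out[] <- NA_real_
    return(out)
  }
  kv  <- strsplit(p, "-", fixed = TRUE)
  key <- vapply(kv, `[`, character(1), 1)
  # Assumption: a piece without "-" (such as "00") counts as 1
  val <- vapply(kv, function(k) if (length(k) == 2) as.numeric(k[2]) else 1,
                numeric(1))
  out[key] <- val
  out
}

# One row per case, one column per category
wide <- t(vapply(parts, row_counts, numeric(length(cats))))
colnames(wide) <- paste0("var3_", cats)

result <- cbind(x[c("var1", "var2")], wide)
result
```

Only categories that actually occur in the data get a column here; if all of 00-06 are needed, cats could be set by hand instead of being derived from the data.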
When I try the recommended approach, I get an error
when table() is executed, and I am not sure whether it would
give exactly the result I want anyway.
X <- unlist(strsplit(as.character(x$var3), split = ";", fixed = TRUE))
X <- strsplit(X, split = "-", fixed = TRUE)
X <- sapply(X, function(x)
  if (length(x) == 2)
    rep(x[1], as.numeric(x[2])) else x[1]
)
table(X, useNA = "always")
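A possible explanation for the error, checked only against the sample values above: with counts other than 1 (for example "02-3" or "01-0") the pieces returned by the anonymous function differ in length, so sapply() cannot simplify them and returns a list, and table() does not seem to cope with a list of unequal-length elements. The sketch below shows this and also that unlisting first lets table() run, but the counts are then pooled over all rows, which is not the per-row layout wanted. The name pooled is made up for this illustration.

```r
var3 <- c("00", "01-1;02-3;04-1", "01-2;02-1", "01-0;04-12", NA)

X <- unlist(strsplit(var3, split = ";", fixed = TRUE))
X <- strsplit(X, split = "-", fixed = TRUE)
X <- sapply(X, function(x)
  if (length(x) == 2) rep(x[1], as.numeric(x[2])) else x[1])

# The pieces have different lengths, so X is a list, not a vector
is.list(X)

# Unlisting first gives table() a plain vector, but the result is
# one pooled count per category, with the row structure lost
pooled <- table(unlist(X), useNA = "always")
pooled
```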
Thank you for your help, I really don't know how this can be handled.
best regards,
johannes
-------- Original Message --------
> Date: Thu, 19 Jan 2012 13:42:24 +0100 (MET)
> From: Gerrit Eichner <Gerrit.Eichner at math.uni-giessen.de>
> To: Johannes Radinger <JRadinger at gmx.at>
> CC: R-help at r-project.org
> Subject: Re: [R] Split values in vector
> Hi, Johannes,
>
> maybe
>
> X <- unlist( strsplit( as.character( x$ART), split = ";", fixed = TRUE))
> X <- strsplit( X, split = "-", fixed = TRUE)
>
> X <- sapply( X, function( x)
> if( length(x) == 2)
> rep( x[1], as.numeric( x[2])) else x[1]
> )
>
> table(X, useNA = "always")
>
>
> comes close to what you want.
>
> Hth -- Gerrit
>
>
> On Thu, 19 Jan 2012, Johannes Radinger wrote:
>
> > Hello,
> >
> > I have a vector which looks like
> >
> > x$ART
> > ...
>
> > [35415] 00 01-1;02-1;05-1;
> > [35417] 01-1; 01-1;02-1;
> > [35419] 01-1; 00
> > [35421] 01-1;04-1; 05-1;
> > [35423] 02-1; 01-1;02-1;
> > [35425] 01-1;02-1; <NA>
> > [35427] 01-1; <NA>
> > ...
> >
> >
> > This is a vector I got in this format. To explain it:
> > there are several categories (00, 01, 02, etc.) and their counts (the
> > values after the "-"). So I have to split each value and create new
> > data frame columns/vectors, one column for each category, with the
> > count in the corresponding cell. I know that this vector has 7
> > categories (00-06) and NA values, but not every case (row) has all the
> > categories (as you can see). How can I do such a split?
> >
> > In the end I should get:
> > x$ART_00, x$ART_01, x$ART_03,... with their values. In the case of <NA>,
> > all the categories should also be <NA>.
> >
> > Maybe someone can help.
> >
> > Thank you,
> >
> > Best regards
> >
> > Johannes
> >
> >
> >
> > --
> > "Feel free" - 10 GB Mailbox, 100 FreeSMS/Monat ...
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.