[R] Split values in vector
Johannes Radinger
JRadinger at gmx.at
Fri Jan 20 13:49:29 CET 2012
Hello all,
I think I am now on the way to splitting the vector the way I want,
using for loops.
I have reached a point where I am stuck, so maybe someone can help
me out.
Remember, the result I am looking for should look like this (for
the input vector I want to split, see var3 below):
var1 var2 var3_00 var3_01 var3_02 var3_04
   1    A       1       0       0       0
   2    B       0       1       3       1
   3    C       0       2       1       0
   4    D       0       0       0      12
   5    E      NA      NA      NA      NA
It is probably not the most elegant solution, but I think it will get me
where I want. I am very open to your improvements.
The input and my approach so far:
var1 <- seq(1,5)
var2 <- c("A","B","C","D","E")
var3 <- c("00","01-1;02-3;04-1","01-2;02-1","01-0;04-2",NA)
x <- data.frame(var1,var2,var3)

#create new columns and prefill with 0
x$var3_01 <- 0
x$var3_02 <- 0
x$var3_03 <- 0
x$var3_04 <- 0

a <- strsplit(as.character(x$var3), split = ";", fixed = TRUE)
for (i in 1:length(a)) {
  A <- length(a[[i]])
  for (j in 1:A) {
    column <- unlist(strsplit(a[[i]][j], split = "-", fixed = TRUE))[1]
    if (column != "00") {
      value <- unlist(strsplit(a[[i]][j], split = "-", fixed = TRUE))[2]
      print(column)
      print(value)
      if (is.na(column)) {
        x$var3_01[i] <- NA
        x$var3_02[i] <- NA
        x$var3_03[i] <- NA
        x$var3_04[i] <- NA
      } else if (column %in% c("01","02","03","04")) {
        #print(paste("x$var3_",column,sep=""))
        (paste("x$var3_",column,sep=""))[i] <- as.numeric(value)
      } else print("Problem with category")
    }
  }
}
I think there is a problem with (paste("x$var3_",column,sep=""))[i],
which is not recognized correctly because it is interpreted as a string.
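One possible way around this, sketched under a few assumptions: paste() only returns a character string, so R does not treat it as a reference to a column, and an assignment cannot be made into it. Indexing the data frame by row number and column name, as in x[i, "var3_01"], does accept a string built with paste(). The sketch below keeps the loop structure of the code above. The names cats and cols are introduced here only for illustration, stringsAsFactors = FALSE is added so that var3 stays a character vector, and the NA row is checked before any string comparison, which is an assumption about the intended behaviour based on the desired result shown earlier.

```r
# Sample data from the post
var1 <- seq(1, 5)
var2 <- c("A", "B", "C", "D", "E")
var3 <- c("00", "01-1;02-3;04-1", "01-2;02-1", "01-0;04-2", NA)
x <- data.frame(var1, var2, var3, stringsAsFactors = FALSE)

# Illustrative helper names (not from the original post)
cats <- c("01", "02", "03", "04")
cols <- paste("var3_", cats, sep = "")
x[cols] <- 0                        # create the new columns, prefilled with 0

a <- strsplit(x$var3, split = ";", fixed = TRUE)
for (i in seq_along(a)) {
  if (is.na(x$var3[i])) {           # NA input: all category columns become NA
    x[i, cols] <- NA
    next
  }
  for (item in a[[i]]) {
    parts <- strsplit(item, split = "-", fixed = TRUE)[[1]]
    column <- parts[1]
    if (column == "00") next        # "00" carries no count, keep the zeros
    if (column %in% cats) {
      # index by row number and column NAME instead of building "x$..."
      x[i, paste("var3_", column, sep = "")] <- as.numeric(parts[2])
    } else {
      print("Problem with category")
    }
  }
}
print(x)
```

The same assignment could also be written as x[[paste("var3_", column, sep = "")]][i] <- as.numeric(parts[2]). Like the original code, this sketch does not create a var3_00 column.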
Thank you...
best regards,
/johannes
-------- Original Message --------
> Date: Thu, 19 Jan 2012 13:42:24 +0100 (MET)
> From: Gerrit Eichner <Gerrit.Eichner at math.uni-giessen.de>
> To: Johannes Radinger <JRadinger at gmx.at>
> CC: R-help at r-project.org
> Subject: Re: [R] Split values in vector
> Hi, Johannes,
>
> maybe
>
> X <- unlist( strsplit( as.character( x$ART), split = ";", fixed = TRUE))
> X <- strsplit( X, split = "-", fixed = TRUE)
>
> X <- sapply( X, function( x)
> if( length(x) == 2)
> rep( x[1], as.numeric( x[2])) else x[1]
> )
>
> table(X, useNA = "always")
>
>
> comes close to what you want.
>
> Hth -- Gerrit
>
>
> On Thu, 19 Jan 2012, Johannes Radinger wrote:
>
> > Hello,
> >
> > I have a vector which looks like
> >
> > x$ART
> > ...
>
> > [35415] 00 01-1;02-1;05-1;
> > [35417] 01-1; 01-1;02-1;
> > [35419] 01-1; 00
> > [35421] 01-1;04-1; 05-1;
> > [35423] 02-1; 01-1;02-1;
> > [35425] 01-1;02-1; <NA>
> > [35427] 01-1; <NA>
> > ...
> >
> >
> > This is a vector I got in this format. To explain it:
> > there are several categories (00, 01, 02 etc.) and their counts (the
> > values after the "-"). So I have to split each value and create new
> > data frame columns/vectors, one column per category, with the value
> > placed in the corresponding cell. I know that this vector has
> > 7 categories (00-06) and NA values, but not every case (row) has all
> > the categories (as you can see). How can I do such a split?
> >
> > In the end I should get:
> > x$ART_00, x$ART_01, x$ART_03,... with their values. In the case of <NA>,
> > all the categories should also be <NA>.
> >
> > Maybe someone can help.
> >
> > Thank you,
> >
> > Best regards
> >
> > Johannes
> >
> >
> >
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.