[R] Reformatting text inside a data frame
Jon BR
jonsleepy at gmail.com
Mon Sep 7 21:27:05 CEST 2015
Hi all,
I've read in a large data frame that has formatting similar to the one
in the small example below:
df <- data.frame(c(1,2,3), c(NA,"AD=2;BA=8","AD=9;BA=1"), c("AD=13;BA=49","AD=1;BA=2",NA))
names(df) <- c("rowNum","first","second")
> df
  rowNum     first      second
1      1      <NA> AD=13;BA=49
2      2 AD=2;BA=8   AD=1;BA=2
3      3 AD=9;BA=1        <NA>
I'd like to reformat every non-NA entry in "first", "second", and any
further columns of the same kind, so that "AD=13;BA=49" is replaced by the
string "13_13-49" (the AD value, an underscore, the AD value again, a
hyphen, then the BA value).
So applied to df, the output would be the following:
  rowNum first   second
1      1  <NA> 13_13-49
2      2 2_2-8    1_1-2
3      3 9_9-1     <NA>
I'm generally a big proponent of shell scripting with awk, but I'd prefer
an all-R solution if one exists (and also to learn how to do this more
generally).
Could someone point out an appropriate paradigm or otherwise point me in
the right direction?
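For reference, one possible base-R approach to the transformation described above is a regular-expression substitution applied to each column. This is only a minimal sketch: it assumes every non-NA entry has exactly the form "AD=<digits>;BA=<digits>", that "rowNum" is the only column to leave alone, and that the columns are read as character (hence stringsAsFactors = FALSE). The helper name reformat_entry is hypothetical.

```r
# Example data from the post; stringsAsFactors = FALSE keeps text columns as character
df <- data.frame(rowNum = c(1, 2, 3),
                 first  = c(NA, "AD=2;BA=8", "AD=9;BA=1"),
                 second = c("AD=13;BA=49", "AD=1;BA=2", NA),
                 stringsAsFactors = FALSE)

# Hypothetical helper: rewrite "AD=x;BA=y" as "x_x-y".
# sub() returns NA for NA input and leaves non-matching strings unchanged.
reformat_entry <- function(x) {
  sub("^AD=([0-9]+);BA=([0-9]+)$", "\\1_\\1-\\2", as.character(x))
}

# Apply the helper to every column except rowNum (assumed to be the only ID column)
cols <- setdiff(names(df), "rowNum")
df[cols] <- lapply(df[cols], reformat_entry)
df
```

Using lapply over the selected columns means the same call should work unchanged if there are more columns than "first" and "second", provided they follow the same format.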
Best,
Jonathan