[R] Character (1a, 1b) to numeric
Jean-Louis Abitbol
abitbol using sent.com
Fri Jul 10 22:19:05 CEST 2020
Many thanks to all. This help-list is wonderful.
I used Rich Heiberger's solution with match() and found something to learn in each answer.
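For anyone finding this thread later, the match-based lookup can be sketched roughly as below. It is only a sketch: it assumes the xc and xn vectors from my original post, and stage_to_numeric is just an illustrative name, not something from any package.

```r
# Lookup table from the original post: stage labels and their plotting values
xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)

# Hypothetical helper: match() gives the position of each label in `labels`,
# and that position indexes `values`; labels not found come back as NA
stage_to_numeric <- function(x, labels = xc, values = xn) {
  values[match(x, labels)]
}

x <- c("2b", "1", "1c", "3a")  # "3a" is deliberately absent from xc
stage_to_numeric(x)
# [1] 2.5 1.0 1.7  NA
```

A label that is not in the classification comes back as NA, so a typo in the data shows up instead of being converted silently.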
Off topic: I also very much enjoyed his 2008 paper on the graphical presentation of safety data.
Best wishes.
On Fri, Jul 10, 2020, at 10:02 PM, Fox, John wrote:
> Hi,
>
> We've had several solutions, and I was curious about their relative
> efficiency. Here's a test with a moderately large data vector:
>
> > library("microbenchmark")
> > set.seed(123) # for reproducibility
> > x <- sample(xc, 1e4, replace=TRUE) # "data"
> > microbenchmark(John = John <- xn[x],
> + Rich = Rich <- xn[match(x, xc)],
> + Jeff = Jeff <- {
> + n <- as.integer( sub( "[a-i]$", "", x ) )
> + d <- match( sub( "^\\d+", "", x ), letters[1:9] )
> + d[ is.na( d ) ] <- 0
> + n + d / 10
> + },
> + David = David <- as.numeric(gsub("a", ".3",
> + gsub("b", ".5",
> + gsub("c", ".7", x)))),
> + times=1000L
> + )
> Unit: microseconds
>   expr       min        lq       mean     median         uq       max neval cld
>   John   228.816   345.371   513.5614   503.5965   533.0635  10829.08  1000 a
>   Rich   217.395   343.035   534.2074   489.0075   518.3260  15388.96  1000 a
>   Jeff 10325.471 13070.737 15387.2545 15397.9790 17204.0115 153486.94  1000  b
>  David 14256.673 18148.492 20185.7156 20170.3635 22067.6690  34998.95  1000   c
> > all.equal(John, Rich)
> [1] TRUE
> > all.equal(John, David)
> [1] "names for target but not for current"
> > all.equal(John, Jeff)
> [1] "names for target but not for current"
> [2] "Mean relative difference: 0.1498243"
>
> Of course, efficiency isn't the only consideration, and aesthetically
> (and no doubt subjectively) I prefer Rich Heiberger's solution. OTOH,
> Jeff's solution is more general in that it generates the correspondence
> between letters and numbers. The argument for Jeff's solution would,
> however, be stronger if it gave the desired answer.
>
> Best,
> John
>
> > On Jul 10, 2020, at 3:28 PM, David Carlson <dcarlson using tamu.edu> wrote:
> >
> > Here is a different approach:
> >
> > xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> > xn <- as.numeric(gsub("a", ".3", gsub("b", ".5", gsub("c", ".7", xc))))
> > xn
> > # [1] 1.0 1.3 1.5 1.7 2.0 2.3 2.5 2.7
> >
> > David L Carlson
> > Professor Emeritus of Anthropology
> > Texas A&M University
> >
> > On Fri, Jul 10, 2020 at 1:10 PM Fox, John <jfox using mcmaster.ca> wrote:
> > Dear Jean-Louis,
> >
> > There must be many ways to do this. Here's one simple way (with no claim of optimality!):
> >
> > > xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> > > xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
> > >
> > > set.seed(123) # for reproducibility
> > > x <- sample(xc, 20, replace=TRUE) # "data"
> > >
> > > names(xn) <- xc
> > > z <- xn[x]
> > >
> > > data.frame(z, x)
> >      z  x
> > 1  2.5 2b
> > 2  2.5 2b
> > 3  1.5 1b
> > 4  2.3 2a
> > 5  1.5 1b
> > 6  1.3 1a
> > 7  1.3 1a
> > 8  2.3 2a
> > 9  1.5 1b
> > 10 2.0  2
> > 11 1.7 1c
> > 12 2.3 2a
> > 13 2.3 2a
> > 14 1.0  1
> > 15 1.3 1a
> > 16 1.5 1b
> > 17 2.7 2c
> > 18 2.0  2
> > 19 1.5 1b
> > 20 1.5 1b
> >
> > I hope this helps,
> > John
> >
> > -----------------------------
> > John Fox, Professor Emeritus
> > McMaster University
> > Hamilton, Ontario, Canada
> > Web: http://socserv.mcmaster.ca/jfox
> >
> > > On Jul 10, 2020, at 1:50 PM, Jean-Louis Abitbol <abitbol using sent.com> wrote:
> > >
> > > Dear All
> > >
> > > I have a character vector, representing histology stages, such as for example:
> > > xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> > >
> > > and this goes on to 3, 3a etc in various order for each patient. I do have of course a pre-established classification available which does change according to the histology criteria under assessment.
> > >
> > > I would want to convert xc, for plotting reasons, to a numeric vector such as
> > >
> > > xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
> > >
> > > Unfortunately I have no clue on how to do that.
> > >
> > > Thanks for any help and apologies if I am missing the obvious way to do it.
> > >
> > > JL
> > > --
> > > Verif30042020
> > >
> > > ______________________________________________
> > > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
--
Verif30042020