[R] intervals from cut() as numerics?

Gabor Grothendieck ggrothendieck at gmail.com
Sat May 20 16:00:58 CEST 2006


One can simplify this slightly using strapply from the gsubfn package.
Given groups, this will create interv.  strapply applies the indicated
function, as.numeric, to each matched pattern, i.e. to each string
that represents a number, producing a list of vectors.  Then we rbind
those vectors together:

library(gsubfn)
interv <- do.call("rbind", strapply(levels(groups), "[[:digit:].]+",
as.numeric))


On 5/20/06, Gavin Simpson <gavin.simpson at ucl.ac.uk> wrote:
> On Sat, 2006-05-20 at 17:39 +0800, Berwin A Turlach wrote:
> > G'day Gavin,
> >
> > >>>>> "GS" == Gavin Simpson <gavin.simpson at ucl.ac.uk> writes:
> >
> >     GS> The problem is getting the range/interval for each group from
> >     GS> (4,4.3], so I can automate this.
> > Most likely there is an easier way, but this seems to work:
> >
> > ## get the levels of groups:
> > > tmp <- levels(groups)
> > ## remove the opening "(" and closing "]" from the string:
> > > tmp1 <- sapply(tmp, function(x) substr(x, 2, nchar(x)-1))
> > ## split into two character strings:
> > > tmp2 <- strsplit(tmp1, ",")
> > ## turn into results into two numbers:
> > > tmp3 <- lapply(tmp2, as.numeric)
> >
> > ## Of course, we can do everything in one go:
> > > lapply(strsplit(sapply(levels(groups), function(x) substr(x, 2, nchar(x)-1)), ","), as.numeric)
>
> Many thanks Berwin. My brain wasn't in character string processing mode,
> but your solution works just fine. For the archives then, here is the
> full script:
>
> ## example data
> dat <- seq(4, 7, by = 0.05)
> x <- sample(dat, 30)
> y <- sample(dat, 30)
> ## residuals
> error <- x - y
> ## break range of x into 10 groups
> groups <- cut(x, breaks = 10)
> ##calculate bias (mean) per group
> max.bias <- aggregate(error, list(group = groups), mean)$x
> ## turn cut intervals into numeric
> interv <- lapply(strsplit(sapply(levels(groups),
>                                 function(x) substr(x, 2,
>                                                    nchar(x)-1)), ","),
>                 as.numeric)
> ## reformat cut intervals as 2 col matrix for easy plotting
> interv <- matrix(unlist(interv), ncol = 2, byrow = TRUE)
> ## plot the residuals vs observed
> plot(x, error, type = "n")
> abline(h = 0, col = "grey")
> panel.smooth(x, error)
> ## add bias indicators per group
> arrows(interv[,1], max.bias, interv[,2], max.bias,
>       length = 0.05, angle = 90, code = 3)
>
> All the best,
>
> G
>
> <snip />
> > Cheers,
> >
> >         Berwin
> >
> > ========================== Full address ============================
> > Berwin A Turlach                      Tel.: +61 (8) 6488 3338 (secr)
> > School of Mathematics and Statistics        +61 (8) 6488 3383 (self)
> > The University of Western Australia   FAX : +61 (8) 6488 1028
> > 35 Stirling Highway
> > Crawley WA 6009                e-mail: berwin at maths.uwa.edu.au
> > Australia                        http://www.maths.uwa.edu.au/~berwin
> >
> --
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>  *Note new Address and Fax and Telephone numbers from 10th April 2006*
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> Gavin Simpson                     [t] +44 (0)20 7679 0522
> ECRC                              [f] +44 (0)20 7679 0565
> UCL Department of Geography
> Pearson Building                  [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street
> London, UK                        [w] http://www.ucl.ac.uk/~ucfagls/cv/
> WC1E 6BT                          [w] http://www.ucl.ac.uk/~ucfagls/
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>



More information about the R-help mailing list