[R] names in R list's

William Dunlap wdunlap at tibco.com
Tue Sep 8 18:02:35 CEST 2015


It is not too hard to set up some tests to show time
as a function of number of named elements for lists
and environments.  Here is one such test:

test <- function (data, nToAdd, nToExtract = length(data))
{
    # Time adding nToAdd new named elements, one at a time.
    addTime <- {
        addedNames <- paste0("D", seq(length(data) + 1, length.out = nToAdd))
        system.time(for (name in addedNames) data[[name]] <- name)
    }
    # Time extracting nToExtract elements by randomly chosen name.
    extractTime <- {
        nms <- sample(names(data), size = nToExtract, replace = TRUE)
        system.time(for (name in nms) tmp <- data[[name]])
    }
    # Keep only the user, system, and elapsed columns.
    rbind(addTime, extractTime)[, 1:3]
}

The times for adding and extracting data are pretty linear for environments,
at least up to a size of 10^5:
  > test(new.env(), nToAdd=1e4, nToExtract=5e4)
              user.self sys.self elapsed
  addTime          1.44        0    1.44
  extractTime      9.30        0    9.30
  > test(new.env(), nToAdd=2e4, nToExtract=10e4)
              user.self sys.self elapsed
  addTime          2.87        0    2.88
  extractTime     18.53        0   18.55
  > test(new.env(), nToAdd=1e5, nToExtract=5e5)
              user.self sys.self elapsed
  addTime         14.31        0   14.32
  extractTime     91.95        0   91.96


but are noticeably quadratic for lists at 10^4 elements:
  > test(list(), nToAdd=1e4, nToExtract=5e4)
              user.self sys.self elapsed
  addTime          1.70        0    1.70
  extractTime      2.23        0    2.23
  > test(list(), nToAdd=2e4, nToExtract=10e4)
              user.self sys.self elapsed
  addTime          5.81     0.02    5.82
  extractTime      9.58     0.00    9.58
  > test(list(), nToAdd=1e5, nToExtract=5e5)
              user.self sys.self elapsed
  addTime        143.21     0.04  143.29
  extractTime    255.70     0.00  255.72
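
One rough way to put a number on "linear" versus "quadratic" is to
estimate the exponent k in time ~ n^k from two of the runs above, using
k = log(t2/t1) / log(n2/n1).  The sketch below only reuses the user.self
times printed above; scalingExponent is a hypothetical helper name, not
a base R function.

```r
# Hypothetical helper: estimate k in time ~ n^k from two (size, time) pairs.
scalingExponent <- function(n1, t1, n2, t2) log(t2 / t1) / log(n2 / n1)

# user.self times copied from the runs above
# (nToAdd = 1e4 vs 1e5, nToExtract = 5e4 vs 5e5).
envAdd      <- scalingExponent(1e4, 1.44, 1e5,  14.31)
envExtract  <- scalingExponent(5e4, 9.30, 5e5,  91.95)
listAdd     <- scalingExponent(1e4, 1.70, 1e5, 143.21)
listExtract <- scalingExponent(5e4, 2.23, 5e5, 255.70)

print(round(c(envAdd = envAdd, envExtract = envExtract,
              listAdd = listAdd, listExtract = listExtract), 2))
```

The exponents come out close to 1 for environments and close to 2 for
lists.  An exponent near 1 means the cost per operation stays roughly
constant, while an exponent near 2 means each operation gets slower as
the container grows.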

For your application you may be interested in timing replacement operations
as well.
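
A minimal sketch of how replacement could be timed in the same style.
The function name testReplace and its arguments are made up for
illustration, and the sizes in the usage lines are deliberately small so
that it runs quickly:

```r
# Hypothetical helper: time overwriting existing named elements.
testReplace <- function(data, nToAdd, nToReplace = nToAdd)
{
    # Fill the container first (this part is not timed).
    addedNames <- paste0("D", seq_len(nToAdd))
    for (name in addedNames) data[[name]] <- name
    # Time replacing the values of randomly chosen existing names.
    replaceNames <- sample(addedNames, size = nToReplace, replace = TRUE)
    replaceTime <- system.time(
        for (name in replaceNames) data[[name]] <- "new"
    )
    # Keep only the user, system, and elapsed times.
    replaceTime[1:3]
}

# Small runs, only to show the calling convention.
print(testReplace(new.env(), nToAdd = 1e3))
print(testReplace(list(), nToAdd = 1e3))
```

Increasing nToAdd in steps of 10, as in the runs above, should show
whether replacement scales like extraction does on your machine.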


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Sep 8, 2015 at 4:53 AM, Witold E Wolski <wewolski at gmail.com> wrote:

> Hi Jeff,
>
> Indeed there was something about plain-text in the r-help posting
> guide although I can't find it there anymore.
> https://www.r-project.org/posting-guide.html
>
> Is it still a requirement?
>
> Jeff, thanks for your constructive contribution ;) . Glad that you know
> about plain text mode in e-mails, besides doing some Perl programming.
> I forgot about both. Even the Linux admins I know use Python and
> Thunderbird or some webmail nowadays, not Pine and Perl, but I do not
> do much networking, so what do I know.
>
> I think the question I am asking is legitimate. The access complexity
> of datastructures is specified in the documentation in case of python
> datastructures,  java collections or stl containers.
> I guess this information is available for name access on R-list but I
> just can't find it.
>
>
>
>
> regards
>
> On 7 September 2015 at 16:37, Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
> wrote:
> > You puzzle me. Why does someone who cannot figure out how to post an
> > email in plain text after so many messages on this mailing list get all
> > worried about access time for string indexing?
> >
> > Environment objects have those properties. They do not solve all
> > problems though, because they are rather heavyweight... you need a lot of
> > lookups to pay for their overhead. R5 objects and the hash package both use
> > them, but I have never found the need to use them. Yes, I do program in
> > Perl so I know where you are coming from, but the vector-based name lookup
> > used in R works quite effectively for data where the number of list items
> > is small or where I plan to access every element as part of my data
> > processing anyway.
> > ---------------------------------------------------------------------------
> > Jeff Newmiller                        The     .....       .....  Go Live...
> > DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
> >                                       Live:   OO#.. Dead: OO#..  Playing
> > Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> > /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> > ---------------------------------------------------------------------------
> > Sent from my phone. Please excuse my brevity.
> >
> > On September 7, 2015 3:34:53 AM PDT, Witold E Wolski <wewolski at gmail.com>
> > wrote:
> >>What is the access time for R lists given a name of list element, is it
> >>linear, log, or constant?
> >>
> >>Then what are the rules for names in R lists?
> >>
> >>That reusing names is possible makes me wonder.
> >>
> >>tmp <- as.list(c(1,2,3,4))
> >>names(tmp) = c("a","a","b","b")
> >>tmp
> >>tmp$a
> >>
> >>
> >>What I am looking for is a standard R data structure which will allow
> >>me fast lookup by name.
> >
>
>
>
> --
> Witold Eryk Wolski
>