Another question Re: [BioC] question on hgu95 metadata and ontoTools

Vincent Carey 525-2265 stvjc at channing.harvard.edu
Tue Jan 13 18:40:50 MET 2004


>
> Hi,
> I'm writing with a question that follows up on a previous correspondence
> I had within this mailing list (copied below).
> Essentially I'm trying to use ontoTools to get a mapping between the union
> of the probe sets on the HGU95 av2, b, c, d, e and GO biological process
> terms.
> To this end, I've first created an environment, named hgu95GO, which was
> defined as described below, using parent.env, following a suggestion by R.
> Gentleman:
>
> > hgu95GO<-hgu95av2GO
> > parent.env(hgu95GO)<-hgu95bGO
> > parent.env(hgu95bGO)<-hgu95cGO
> > parent.env(hgu95cGO)<-hgu95dGO
> > parent.env(hgu95dGO)<-hgu95eGO
>
> For this we have
>
> > length(ls(env=hgu95GO))
> [1] 12625
>
> which is the same as the length for hgu95av2GO, rather than the length of
> the union of the 5 probe set collections from av2 to e (which is 62906).
> However if I look for a value of a key corresponding to a probe set from
> b, c,... indeed it gives me something by looking at the parent. I'm not
> too familiar with environments in R, but I guess this is the expected
> behavior. Now, I've built a mapping:
>
> ooMapHgu952GOBP<-otkvEnv2namedSparse(obs, tms, hgu95GO)

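the otkv function does use ls(), which lists only the keys stored in
hgu95GO itself (the 12625 from hgu95av2GO) and not those reachable
through its parents, so it will be stymied by the environment chain
created above.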
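
a minimal sketch of that behavior, using two toy environments rather than the real hgu95 metadata (the names envA, envB, probeA, probeB and the GO ids are made up for illustration).  it only relies on base R, and chains the environments with parent.env<- solely to mirror the question:

```r
# Toy stand-ins for two annotation environments (hypothetical names and keys).
envA <- new.env(parent = emptyenv())
envB <- new.env(parent = emptyenv())
assign("probeA", "GO:0000001", envir = envA)
assign("probeB", "GO:0000002", envir = envB)

# Chain them the way the question does (illustration only).
parent.env(envA) <- envB

# ls() reports only what is stored in envA's own frame.
ownKeys <- ls(envA)

# Lookups follow the parent chain by default (inherits = TRUE) ...
foundViaParent <- exists("probeB", envir = envA)
valueViaParent <- get("probeB", envir = envA)

# ... but the key is not actually stored in envA.
foundInFrame <- exists("probeB", envir = envA, inherits = FALSE)
```

so a lookup for a key from the parent succeeds, while anything that enumerates keys with ls() sees only the first environment.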

i note in the doc for parent.env that the replacement form of this
function (parent.env<-) is regarded as dangerous and may be deprecated.

i believe that this "one time" operation can be done,
albeit inelegantly, with a manual approach:

1) use the contents function in annotate to convert
environments to lists
2) use listUnion to create union of lists.  i have
written a function to do this; it basically concatenates
all uniquely named elements of two lists and forms
unions of elements that share names in the two lists

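# union of two named lists: elements whose names occur in only one list are
# kept as they are; elements that share a name are combined with union()
listUnion <- function (x, y)
{
    # without names there is nothing to match on, so just concatenate
    if (is.null(names(x)) || is.null(names(y))) {
        warning("unnamed lists imply union is concatenation")
        return(unlist(list(x, y), recursive = FALSE))
    }
    comm <- intersect(names(x), names(y))
    if (length(comm) == 0)
        return(unlist(list(x, y), recursive = FALSE))
    # elements unique to each list
    u1 <- x[!(names(x) %in% comm)]
    u2 <- y[!(names(y) %in% comm)]
    # elements present in both lists: take the union of their values
    comml <- list()
    for (nm in comm) comml[[nm]] <- union(x[[nm]], y[[nm]])
    return(unlist(list(u1, comml, u2), recursive = FALSE))
}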

3) use list2env to create the environment of the final result

using code like
l1 <- contents(hgu95av2GO)
l2 <- contents(hgu95bGO)
...
l5 <- contents(hgu95eGO)
LU <- listUnion(l1,l2)
LU <- listUnion(LU,l3)
...
hgu95GO <- list2env(LU)

leads to 62906 entries in the resulting environment.  these operations
concluded very rapidly on my laptop.  whether listUnion should go
along with list2env in Biobase, or whether other tools for merging
environments should be provided, is open to discussion.
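
the three steps can be tried end to end on toy data.  the sketch below is only an illustration under some assumptions: base R's as.list() stands in for contents(), base R's list2env() stands in for the Biobase function of that name, the environments and GO ids are invented, and listUnion is restated in a compact form that assumes both lists are named.  Reduce() replaces the repeated LU <- listUnion(LU, ...) calls:

```r
# Compact restatement of listUnion, assuming both inputs are named lists:
# names unique to one list are kept, shared names get the union of values.
listUnion <- function(x, y) {
    comm <- intersect(names(x), names(y))
    merged <- lapply(comm, function(k) union(x[[k]], y[[k]]))
    names(merged) <- comm
    c(x[!(names(x) %in% comm)], merged, y[!(names(y) %in% comm)])
}

# Toy annotation environments (hypothetical probe names and GO ids).
envA <- list2env(list(p1 = c("GO:1", "GO:2"), p2 = "GO:3"))
envB <- list2env(list(p2 = c("GO:3", "GO:4"), p3 = "GO:5"))
envC <- list2env(list(p4 = "GO:6"))

# 1) environments -> lists
lists <- lapply(list(envA, envB, envC), as.list)

# 2) fold listUnion over all the lists
LU <- Reduce(listUnion, lists)

# 3) list -> environment
mergedEnv <- list2env(LU)

nKeys <- length(ls(mergedEnv))          # number of distinct probe names
p2Terms <- get("p2", envir = mergedEnv) # values merged across envA and envB
```

here ls() on the merged environment does see every key, since all of them are stored in a single frame rather than spread over a chain of parents.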


