Another question Re: [BioC] question on hgu95 metadata and ontoTools

Elisabetta Manduchi manduchi at pcbi.upenn.edu
Tue Jan 13 19:33:34 MET 2004


Hi Vince,
thanks a lot for your suggestion, which I'll try.
I had sent another email on this thread this morning which I'm not sure 
went through (albeit I see it in the thread archive on the web). Namely I 
thought of replacing my command:

obs<-ls(env=hgu95GO)

(which limited my observations to the 12625 probe sets on av2), with

obs<-hgu95.probeset

where the latter is the union of the probesets for all 5 chips. Then 
proceeding as before:

> ooMapHgu952GOBP<-otkvEnv2namedSparse(obs, tms, hgu95GO)

Things seemed to work as desired, at least in the short run. But from what 
you say I gather it might be dangerous to use environments in this way 
and using lists in the way you indicate might in any case be safer in the 
longer run.
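For what it's worth, the ls() behavior can be reproduced with two toy
environments. This is only a minimal sketch: e1, e2, the probe names and
the term strings are made up for illustration and are not taken from the
hgu95 metadata.

```r
# Toy stand-ins for two annotation environments (hypothetical names and keys).
e1 <- new.env()
e2 <- new.env()
assign("probeA", "termA", envir = e1)
assign("probeB", "termB", envir = e2)

# Chain e2 behind e1, in the same way as parent.env(hgu95GO) <- hgu95bGO.
parent.env(e1) <- e2

# ls() lists only the names stored in e1 itself, not those of its parents.
own_names <- ls(envir = e1)

# exists() and get() fall through to the parent by default (inherits = TRUE).
found_in_parent <- exists("probeB", envir = e1)
value_from_parent <- get("probeB", envir = e1)

# With inherits = FALSE the lookup stays in e1 and misses probeB.
found_locally <- exists("probeB", envir = e1, inherits = FALSE)
```

So a key from the parent can be looked up, yet it never shows up in ls(),
which would explain the count of 12625.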
Thanks again,
Elisabetta

On Tue, 13 Jan 2004, Vincent Carey 525-2265 wrote:

> 
> >
> > Hi,
> > I'm writing with a question that follows up on a previous correspondence
> > I had within this mailing list (copied below).
> > Essentially I'm trying to use ontoTools to get a mapping between the union
> > of the probe sets on the HGU95 av2, b, c, d, e and GO biological process
> > terms.
> > To this end, I've first created an environment, named hgu95GO, which was
> > defined as described below, using parent.env, following a suggestion by R.
> > Gentleman:
> >
> > > hgu95GO<-hgu95av2GO
> > > parent.env(hgu95GO)<-hgu95bGO
> > > parent.env(hgu95bGO)<-hgu95cGO
> > > parent.env(hgu95cGO)<-hgu95dGO
> > > parent.env(hgu95dGO)<-hgu95eGO
> >
> > For this we have
> >
> > > length(ls(env=hgu95GO))
> > [1] 12625
> >
> > which is the same as the length for hgu95av2GO, rather than the length of
> > the union of the 5 probe set collections from av2 to e (which is 62906).
> > However if I look for a value of a key corresponding to a probe set from
> > b, c,... indeed it gives me something by looking at the parent. I'm not
> > too familiar with environments in R, but I guess this is the expected
> > behavior. Now, I've built a mapping:
> >
> > ooMapHgu952GOBP<-otkvEnv2namedSparse(obs, tms, hgu95GO)
> 
> the otkv function does use ls() and will be stymied by
> the strange behavior of the environment union created
> above.
> 
> i note in the doc for parent.env that this function is
> regarded as dangerous and may be deprecated.
> 
> i believe that this "one time" operation can be done,
> albeit inelegantly, with a manual approach:
> 
> 1) use the contents function in annotate to convert
> environments to lists
> 2) use listUnion to create union of lists.  i have
> written a function to do this; it basically concatenates
> all uniquely named elements of two lists and forms
> unions of elements that share names in the two lists
> 
> listUnion <- function (x, y)
> {
>     # without names on both lists, the best we can do is concatenate
>     if (is.null(names(x)) || is.null(names(y))) {
>         warning("unnamed lists imply union is concatenation")
>         return(unlist(list(x, y), recursive = FALSE))
>     }
>     comm <- intersect(names(x), names(y))
>     # no shared names: plain concatenation is already the union
>     if (length(comm) == 0)
>         return(unlist(list(x, y), recursive = FALSE))
>     # elements whose names occur in only one of the two lists
>     u1 <- x[!(names(x) %in% comm)]
>     u2 <- y[!(names(y) %in% comm)]
>     # for shared names, take the union of the two elements
>     comml <- list()
>     for (nm in comm) comml[[nm]] <- union(x[[nm]], y[[nm]])
>     return(unlist(list(u1, comml, u2), recursive = FALSE))
> }
> 
> 3) use list2env to create the environment of the final result
> 
> using code like
> l1 <- contents(hgu95av2GO)
> l2 <- contents(hgu95bGO)
> ...
> l5 <- contents(hgu95eGO)
> LU <- listUnion(l1,l2)
> LU <- listUnion(LU,l3)
> ...
> hgu95GO <- list2env(LU)
> 
> leads to 62906 entries in the resulting environment.  these operations
> concluded very rapidly on my laptop.  whether listUnion is something
> to go along with list2env in Biobase, or other tools for merging
> environments should be provided, are topics open to discussion.
> 
> 
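
As a note on the list route, here is a toy sketch of the same idea on
made-up data. It is not the listUnion function quoted above but a compact
variant: mergeNamed is a hypothetical helper, l1 to l3 stand in for the
results of contents(), and it assumes a current R in which list2env is
available in base.

```r
# Hypothetical per-chip lists, standing in for contents(hgu95av2GO) etc.
l1 <- list(p1 = c("t1", "t2"), p2 = "t3")
l2 <- list(p2 = c("t3", "t4"), p3 = "t5")
l3 <- list(p4 = "t6")

# Merge two named lists: keep every name, take the union where names collide.
mergeNamed <- function(x, y) {
    for (nm in names(y)) {
        # x[[nm]] is NULL when nm is new, so union() then just returns y[[nm]]
        x[[nm]] <- union(x[[nm]], y[[nm]])
    }
    x
}

# Fold the pairwise merge over all lists, then turn the result into an environment.
LU <- Reduce(mergeNamed, list(l1, l2, l3))
merged <- list2env(LU)

merged_keys <- sort(ls(envir = merged))
p2_terms <- get("p2", envir = merged)
```

Because all keys end up in a single environment, ls() on the result sees
every one of them, with no reliance on parent.env.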

-- 
Elisabetta Manduchi

Computational Biology and Informatics Laboratory
Center for Bioinformatics
University of Pennsylvania
1428 Blockley Hall
423 Guardian Drive
Philadelphia, PA 19104-6021

phone: 215-573-4408
fax: 215 573-3111
email: manduchi at pcbi.upenn.edu
web: http://www.cbil.upenn.edu/~manduchi
