[BioC] makecdfenv, bg.correct

Ariel Chernomoretz ariel.chernomoretz at crchul.ulaval.ca
Mon Jul 18 17:02:41 CEST 2005


Hi Justin,

Some time ago I posted some code that you may find useful.
Basically it was a function to remove a list of probes and/or 
probesets from  CDF and Probe environments. Usual calls to any affy 
processing method performed afterwards, will disregard
the 'offending' probes.

It was originally intended to handle cdf and probes environments at the
same time. I slightly modified the code to allow you working only
with the cdf environment (as I think it's your interest). I am including here 
the updated R code that should be sourced. 
Please refer also to previous posts
 http://files.protsuggest.org/biocond/html/7350.html 
 http://files.protsuggest.org/biocond/html/7366.html
 http://files.protsuggest.org/biocond/html/7367.html

Regards,

Ariel./

# removeProbes v 0.2
#   Now it is not necessary to provide a probe environment name.
#   If probepackagename=NULL only the cdf environment is (re)-loaded
#   with ResetEnvir, and no attempt to modify a probe environment is performed
#   in RemoveProbes
#   (of course, probe env SHOULD be modified if gcrma processing is wanted)


#
# ResetEnvir
# goal:Detach, if necessary, cdf and probe environments,
#      and (re)load them.
# in  :
#      cdfpackagename   (e.g. "hgu95av2cdf")
#      probepackagename (e.g. "hgu95av2probe")
# out : ---
#
ResetEnvir<-function(cdfpackagename,probepackagename){
 ll<-search()
 cdfpackagepos <- grep(cdfpackagename,ll)
 if(length(cdfpackagepos)>0) detach(pos=cdfpackagepos)
 library(cdfpackagename,character.only=T)

 if(!is.null(probepackagename)){
  ll<-search()
  probepackagepos <- grep(probepackagename,ll)
  if(length(probepackagepos)>0) detach(pos=probepackagepos)
  library(probepackagename,character.only=T)
 }
}


#
# RemoveProbes
# goal: Modification of CDF and PROBE  environments
# in  :
#       listOutProbes: list of probes to be removed from cdf and probe
#                      environments (e.g. c("1001_at1","1032_at6").
#                      If NULL no probe will be taken out.
#       listOutProbeSets: list of probesets to be removed from cdf and probe
#                         environments (e.g. c("1006_at","1032_f_at").
#                         If NULL no probeset will be removed.
#       cdfpackagename:   (e.g. "hgu95av2cdf")
#       probepackagename: (e.g. "hgu95av2probe")
#       destructive: unimplemented option, see NOTE
#
# out : ---
#
# NOTE 1: After a call to RemoveProbes the probenames reported by
#       pm and mm accessing functions implemented in the affy package, will
#       be in general differents from the original ones in probesets where
#       probes have been removed. This happens as in this functions the probe 
names
#       are always assigned sequentially.
#       RemoveProbes modifies the specified CDF and PROBE environments
#       in a consistent BUT destructive way. Take this in consideartion if
#       your code relays on absolute references to probe names.
#
#       A chunk of code to illustrate this
#
#         library(affy)
#         library(affydata)
#         source("removeProbes.R")
#
#         data(Dilution)
#         Dilution at cdfName<-"hgu95av2" # fix cdf name
#
#         cleancdf         <- cleancdfname(Dilution at cdfName,addcdf=FALSE)
#         cdfpackagename   <- paste(cleancdf,"cdf",sep="")
#         probepackagename <- paste(cleancdf,"probe",sep="")
#
#         ResetEnvir(cdfpackagename,probepackagename)
#         pm(Dilution,"1000_at")
#         as.data.frame(get(probepackagename))[1:16,1:4]
#
#         RemoveProbes(c("1000_at2"),NULL,cdfpackagename,probepackagename)
#         pm(Dilution,"1000_at")
#         as.data.frame(get(probepackagename))[1:15,1:4]
#
# NOTE2 : See my April 20 post to BioC mailing list pots (and eventually its
#         continuation) regarding differences reported
#         between GCRMA 1.1.0 and GCRMA 1.1.3
#
RemoveProbes<-function(listOutProbes=NULL,
                       listOutProbeSets=NULL,
                       cdfpackagename,probepackagename,destructive=TRUE){

 #default probe dataset values

 if(!is.null(probepackagename)){
  probe.env.orig <- get(probepackagename)
 }

 if(!is.null(listOutProbes)){
  # taking probes out from CDF env
  probes<- unlist(lapply(listOutProbes,function(x){
                           a<-strsplit(x,"at")
                           aux1<-paste(a[[1]][1],"at",sep="")
                           aux2<-as.integer(a[[1]][2])
                           c(aux1,aux2)
                          }))
  n1<-as.character(probes[seq(1,(length(probes)/2))*2-1])
  n2<-as.integer(probes[seq(1,(length(probes)/2))*2])
  probes<-data.frame(I(n1),n2)
                                                  
  probes[,1]<-as.character(probes[,1])
  probes[,2]<-as.integer(probes[,2])
  pset<-unique(probes[,1])
  for(i in seq(along=pset)){
   ii  <-grep(pset[i],probes[,1])
   iout<-probes[ii,2]
   a<-get(pset[i],env=get(cdfpackagename))
   a<-a[-iout,]
   assign(pset[i],a,env=get(cdfpackagename))
  }
 }

 # taking probesets out from CDF env
 if(!is.null(listOutProbeSets)){
  rm(list=listOutProbeSets,envir=get(cdfpackagename))
 }

 # setting the PROBE env accordingly (idea from gcrma compute.affinities.R)
 if(!is.null(probepackagename)){
  tmp <- get("xy2i",paste("package:",cdfpackagename,sep=""))
  newAB   <- new("AffyBatch",cdfName=cleancdf)
  pmIndex <-  unlist(indexProbes(newAB,"pm"))
  subIndex<- match(tmp(probe.env.orig$x,probe.env.orig$y),pmIndex)
  rm(newAB)
  iNA     <- which(is.na(subIndex))

  if(length(iNA)>0){
   ipos<-grep(probepackagename,search())
   assign(probepackagename,probe.env.orig[-iNA,],pos=ipos)
  }
 }
}









On July 15, 2005 03:45 pm, Justin Borevitz wrote:
> Hi Ben et al.
>
> We're working  with new tiling arrays for Arabidopsis and drosophila.  We
> can create a new cdfenv for the arrays and  would like  to read in new 
> .CEL files to execute bg.correct and  normalize.quantiles.  We would like
> to mask many features that are  not unique in the  genome and are
> considering using the rm.extra option in read.affybatch.  How should this
> be specified?  Can we feed a matrix of T/F of dim(array.size,array.size)
> for which features to use?  How are these setting specified externally?
>
> rm.mask = FALSE, rm.outliers = FALSE, rm.extra = FALSE
>
> Another option we considered is to use bg.correct on a matrix of cel
> intensities rather  than on an affy.batch, how could I call
> bg.correct(matrixofCelintensities)?
>
> Thanks,
> Justin
>
> -----
> Justin Borevitz
> Ecology and Evolution (CLSC 915E)
> University of Chicago
> 1101 E. 57th St.
> Chicago, IL 60637 USA
> (773) 702-5948 office
> (773) 834-4055 lab
> borevitz at uchicago.edu
> http://naturalvariation.org
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor

-- 
Ariel Chernomoretz, Ph.D.
Centre de recherche du CHUL
2705 Blv Laurier, bloc T-367
Sainte-Foy, Qc
G1V 4G2
(418)-525-4444 ext 46339



More information about the Bioconductor mailing list