[BioC] Creating a customized cut flow with the affy/limma packages

James W. MacDonald jmacdon at uw.edu
Mon Feb 13 16:04:24 CET 2012


Hi J.M.,

On 2/13/2012 9:35 AM, J.M.Jensen [guest] wrote:
> Dear all,
> I am trying to create a customized cut flow with the affy/limma packages. In order to assign different cut-off values between different arrays would be to first make one cut-off and then use the remaining expression set in further analysis. When working with the affy/limma packages, I read and normalize my data with the justRMA() function, fit a linear model with the limma package and use the decideTests() from the limma package to select probes with a specified cut-off (e.g. lfc). The problem is that I cannot assign multiple criteria for the cut flow. For example, if I want to remove all probes from my data set that differ more than a lfc of 1.2 between two controls, regardless of the expression values in the rest of the samples. After this cut, I would then be able to use another cut off value, e.g. lfc=1.8 to select probes from different contrasts of the remaining data. I tried looking into the vennSelect() (affycoretools), because I can use it to select various contrasts
>   and their intersection, however, I cannot figure how to assign different cut off criteria.
> Any help or thoughts on this would be most appreciated.

It seems like you want to do two very different things here, but correct 
me if I misunderstand. First, it appears that you want to remove all 
probes that vary between controls.  Here I am assuming that you have one 
set of controls, and you want to exclude any probeset that varies 
between any two of the controls by 2^1.2 or greater. Second, you want to 
select all probes that vary between various contrasts.

These are very different things, and I don't think there is a 
one-size-fits-all solution. The way I understand it, you won't be able 
to do anything with the fitted object to filter the first set of 
probesets. Instead, you want to use the raw data.

You could set up a function, say filterControls() to do step 1.

filterControls <- function(controlvector, eset, filt){
## control vector is a numeric vector
## eset is an ExpressionSet (e.g., from justRMA())
## filt is a numeric log fold change to filter on
require(gtools)
comb <- combinations(length(controlvector), 2, controlvector)
indlst <- lapply(1:nrow(comb), function(x) abs(eset[,comb[x,1]] - 
eset[,comb[x,2]]) > filt)
ind <- apply(do.call("cbind", indlst), 1, any)
eset <- eset[!ind,]
eset
}

Then you would just go forward with the subsetted ExpressionSet object, 
using decideTests() with whatever lfc you like, and vennSelect() to output.

Best,

Jim


> Best regards,
> J.M. Jensen
>
>   -- output of sessionInfo():
>
> R version 2.13.0 (2011-04-13)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] grid      stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
>   [1] venneuler_1.1-0      rJava_0.9-3          affycoretools_1.24.0
>   [4] KEGG.db_2.5.0        gplots_2.10.1        KernSmooth_2.23-4
>   [7] caTools_1.12         bitops_1.0-4.1       gdata_2.8.2
> [10] gtools_2.6.2         GO.db_2.5.0          hgu133plus2.db_2.5.0
> [13] org.Hs.eg.db_2.5.0   RSQLite_0.9-4        DBI_0.2-5
> [16] hgu133plus2cdf_2.8.0 annotate_1.30.1      AnnotationDbi_1.14.1
> [19] limma_3.8.3          affy_1.30.0          Biobase_2.12.2
>
> loaded via a namespace (and not attached):
>   [1] affyio_1.20.0         annaffy_1.24.0        biomaRt_2.8.1
>   [4] Biostrings_2.20.4     Category_2.18.0       gcrma_2.24.1
>   [7] genefilter_1.34.0     GOstats_2.18.0        graph_1.30.0
> [10] GSEABase_1.14.0       IRanges_1.10.6        preprocessCore_1.14.0
> [13] RBGL_1.28.0           RCurl_1.6-10.1        splines_2.13.0
> [16] survival_2.36-5       tools_2.13.0          XML_3.4-2.2
> [19] xtable_1.6-0
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099



More information about the Bioconductor mailing list