[BioC] genefilter - construct your own test

Lina Hultin-Rosenberg Lina.Hultin.Rosenberg at ebc.uu.se
Tue Sep 5 15:52:38 CEST 2006


Thanks a lot! I suspected it was something simple but couldn't really figure
it out. /Lina

-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch
[mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of James W. MacDonald
Sent: 5 September 2006 15:39
To: Lina Hultin-Rosenberg
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] genefilter - construct your own test

Hi Lina,

Lina Hultin-Rosenberg wrote:
> Dear list!
> 
> I have a question on the package genefilter. Reading the R documentation on
> genefilter I understand that the user can construct his/her own tests but I
> don't really understand how.
> 
> "This package uses a very simple but powerful protocol for filtering genes.
> The user simply constructs any number of tests that they want to apply. A
> test is simply a function (as constructed using one of the many helper
> functions in this package) that returns TRUE if the gene of interest passes
> the test (or filter) and FALSE if the gene of interest fails."
> 
> Is it possible to construct your own tests for use in genefilter, and how is
> that done? I would like to filter genes on absent/present calls from the
> mas5calls method, perhaps in combination with other filters, so it would be
> very convenient to include that test in genefilter.
> 
> Maybe someone has experience constructing their own filters and could
> point me in the right direction. I would greatly appreciate some help!

It's actually very simple. An example would be the kOverA() function 
that already exists in genefilter:

 > kOverA
function (k, A = 100, na.rm = TRUE)
{
     function(x) {
         if (na.rm)
             x <- x[!is.na(x)]
         sum(x > A) >= k
     }
}
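
To see what a constructor like this hands back, here is a minimal sketch in
base R. It assumes a local copy of the function body printed above (the name
k_over_a is made up here so that genefilter need not be loaded), and the
intensity vector is invented for illustration.

```r
# Local copy of the kOverA() body shown above; the name k_over_a is
# illustrative only, so this runs without loading genefilter.
k_over_a <- function(k, A = 100, na.rm = TRUE) {
  function(x) {
    if (na.rm)
      x <- x[!is.na(x)]
    sum(x > A) >= k
  }
}

# Calling the outer function returns the inner function, which
# remembers the values of k and A it was created with.
f <- k_over_a(k = 2, A = 100)

# Made-up intensities for one probeset across five samples.
x <- c(50, 150, 300, NA, 80)

# Two non-missing values exceed 100, so the test with k = 2 passes.
res_pass <- f(x)

# Requiring three values above 100 fails on the same data.
res_fail <- k_over_a(k = 3, A = 100)(x)
```

The point is only that the outer call fixes the thresholds once, and the
returned function is then applied to one vector of values at a time.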

So let's say you want to select probesets based on having a 'present' 
call in n or more samples. You could set up your filter function like this:

mascallsfilter <- function(cutoff = "p", number){
   function(x){
     ## use tolower() to normalize inputs
     sum(tolower(x) == tolower(cutoff)) >= number
   }
}

Note that there are two functions here, one nested in the other. The 
outer function takes the arguments to filter on, and the inner one takes 
an argument 'x', which will be one row of your matrix of calls (the calls 
for a single probeset), because genefilter() applies the filter row by 
row. You can then use this function like any other in the genefilter 
package:

 > f1 <- mascallsfilter("p", 5)
 > filt <- filterfun(f1)
 > a <- matrix(sample(c("P","M","A"), 100, TRUE), nc=5)
 > a
       [,1] [,2] [,3] [,4] [,5]
  [1,] "M"  "P"  "A"  "M"  "M"
  [2,] "A"  "P"  "P"  "A"  "A"
  [3,] "M"  "A"  "A"  "P"  "P"
  [4,] "M"  "M"  "M"  "A"  "M"
  [5,] "A"  "P"  "P"  "P"  "P"
  [6,] "M"  "P"  "M"  "M"  "M"
  [7,] "A"  "M"  "P"  "M"  "A"
  [8,] "A"  "P"  "P"  "P"  "A"
  [9,] "A"  "M"  "P"  "M"  "A"
[10,] "P"  "A"  "M"  "A"  "M"
[11,] "P"  "M"  "A"  "A"  "A"
[12,] "M"  "A"  "A"  "M"  "P"
[13,] "P"  "A"  "A"  "A"  "A"
[14,] "M"  "A"  "A"  "A"  "M"
[15,] "A"  "A"  "A"  "A"  "A"
[16,] "A"  "A"  "P"  "M"  "M"
[17,] "M"  "M"  "P"  "A"  "M"
[18,] "M"  "P"  "M"  "P"  "M"
[19,] "P"  "P"  "P"  "A"  "A"
[20,] "M"  "M"  "P"  "A"  "M"
 > genefilter(a, filt)
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[16] FALSE FALSE FALSE FALSE FALSE
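
Every row fails above because number = 5 asks for a "P" in all five columns,
and no row in that random matrix has one. As a rough sketch of a looser
cutoff, and of combining the call filter with an intensity filter, the
following uses only base R. The two small matrices are made up, and
apply(m, 1, f) stands in for genefilter(), on the assumption that
genefilter() evaluates the filter once per row. Since the calls are character
and the intensities numeric, the two filters are run on separate matrices and
the results combined with &.

```r
# Same filter constructor as above.
mascallsfilter <- function(cutoff = "p", number) {
  function(x) {
    sum(tolower(x) == tolower(cutoff)) >= number
  }
}

# Made-up calls: 4 probesets (rows) by 3 samples (columns).
calls <- matrix(c("P", "P", "A",
                  "A", "M", "A",
                  "P", "P", "P",
                  "M", "P", "A"),
                nrow = 4, byrow = TRUE)

# Made-up intensities for the same probesets and samples.
expr <- matrix(c(200, 150,  90,
                  20,  30,  10,
                  80,  60,  70,
                 500, 400, 300),
               nrow = 4, byrow = TRUE)

# Keep probesets called present in at least 2 samples.
keep_calls <- apply(calls, 1, mascallsfilter("p", 2))

# Keep probesets with intensity above 100 in at least 2 samples
# (the same test that kOverA(2, 100) performs).
keep_expr <- apply(expr, 1, function(x) sum(x > 100) >= 2)

# A probeset survives only if it passes both filters.
keep <- keep_calls & keep_expr
```

With these made-up values only the first probeset passes both tests; the
third passes on calls but not on intensity, and the fourth the other way
round.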

HTH,

Jim


> 
> Thank you!
> 
> Sincerely
> 
> Lina Hultin Rosenberg
> 
> ________________________________
> Lina Hultin Rosenberg
> Msc Molecular Biotechnology
> Evolutionary Biology Department
> Uppsala University
> Norbyvägen 18
> 752 36 Uppsala
> Phone: +46-18-4716444
> Email: lina.hultin.rosenberg at ebc.uu.se
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623


