Flagging spots (was: [BioC] Bioconductor documentation)
Gordon Smyth
smyth at wehi.edu.au
Wed Sep 1 02:33:26 CEST 2004
At 05:28 AM 1/09/2004, you wrote:
>The reason that we want to read in more columns is to create the
>flags. Some people think that spots should be flagged if (e.g.) mean(Rf)
>differs considerably from median(Rf), or the s.d. of one of these measures
>is large. Right now, they need to create the flags outside of Bioconductor.
I think you might want something like
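myfun <- function(x) {
    # TRUE where median and mean foreground agree to within 50 units
    okred <- abs(x[,"F635 Median"]-x[,"F635 Mean"]) < 50
    okgreen <- abs(x[,"F532 Median"]-x[,"F532 Mean"]) < 50
    # weight 1 for spots that pass in both channels, 0 otherwise
    as.numeric(okgreen & okred)
}
RG <- read.maimages(files, source="genepix", wt.fun=myfun)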
Then all the "bad" spots will get weight zero which, in limma, is
equivalent to flagging them out. You can proceed with
RG$printer <- getLayout(RG$genes)
RG <- backgroundCorrect(RG) # gives more correction options
MA <- normalizeWithinArrays(RG)
to do print-tip loess normalization in which the flagged spots have no
influence on the normalization.
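
A weight function of this kind can be tried out on a small hand-made data frame before any image files are read. The following is only a sketch: myfun2, its cutoffs (50 and 1000) and the toy values are made up for illustration, and it assumes the GenePix files also contain "F635 SD" and "F532 SD" columns, so that spots with a large s.d. can be flagged as well.

```r
# Hypothetical extension of myfun: also give weight zero to spots whose
# foreground s.d. is large. Both cutoffs are arbitrary illustrations.
myfun2 <- function(x, max.diff = 50, max.sd = 1000) {
  okred   <- abs(x[, "F635 Median"] - x[, "F635 Mean"]) < max.diff
  okgreen <- abs(x[, "F532 Median"] - x[, "F532 Mean"]) < max.diff
  oksd    <- x[, "F635 SD"] < max.sd & x[, "F532 SD"] < max.sd
  as.numeric(okred & okgreen & oksd)
}

# Toy stand-in for the data.frame handed to wt.fun (assumed column names);
# check.names = FALSE keeps the spaces in the column names.
spots <- data.frame(
  "F635 Median" = c(500, 800, 1200),
  "F635 Mean"   = c(510, 900, 1210),
  "F635 SD"     = c(100, 150, 2500),
  "F532 Median" = c(400, 700, 1000),
  "F532 Mean"   = c(420, 710, 1005),
  "F532 SD"     = c( 90, 120,  200),
  check.names = FALSE
)

# One weight per spot: 1 = keep, 0 = flagged out
w <- myfun2(spots)
print(w)
```

In this toy data the second spot fails the median/mean check in the red channel and the third fails the s.d. check, so only the first spot keeps weight one. If the result looks right, the function could be passed as wt.fun=myfun2 in the read.maimages() call.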
Gordon
>--Naomi
>
>At 09:28 AM 8/31/2004 +1000, you wrote:
>>At 11:33 PM 30/08/2004, Naomi Altman wrote:
>>>The vignettes are great - perhaps I should not call them
>>>"tutorials". But like other documentation of this type (the book "SAS
>>>for Mixed Models" comes to mind), it is hard to generalize from the
>>>examples. We need both the vignettes and the internal
>>>documentation. We need good but explicit defaults for the general user,
>>>and the option to change these defaults for the expert user.
>>>
>>>Here is an example where the documentation is OK, but the option to
>>>change the defaults is too limited.
>>>
>>>Both limma and marray allow the user to read only a limited set of columns
>>>from gpr and spot files. Why not have this as the default, and let the
>>>user decide if they want to read in other columns? Some of my clients
>>>like to filter spots based on quantities like the difference between the
>>>median and mean spot intensity, the sd of intensity, etc. They
>>>currently need to flag spots before importing into Bioconductor because
>>>they cannot read these other columns readily into an marrayRaw object.
>>
>>The wt.fun argument to the read.maimages() function in limma already provides
>>the capability to filter or weight spots based on any number of columns
>>in the original file. So there is no need to read in the extra columns or to
>>flag spots before importing. The computation of the flags is done at the
>>time of import.
>>
>>The help document for read.maimages() says:
>> Spot quality weights may be extracted from the image analysis
>> files using a ready-made or a user-supplied weight function
>> 'wt.fun'. 'wt.fun' may be any user-supplied function which accepts
>> a data.frame argument and returns a vector of non-negative
>> weights. The columns of the data.frame are as in the image
>> analysis output files. See 'QualityWeights' for provided weight
>> functions.
>>
>>I admit that this is brief, but it does seem explicit.
>>
>>I know that reading in extra columns can be convenient for other
>>purposes. The reason why I decided not to implement this in limma was
>>explained in a post to this list on 22 July:
>>https://www.stat.math.ethz.ch/pipermail/bioconductor/2004-July/005434.html
>>
>>Gordon
>>
>>>--Naomi
>>>
>>>Naomi S. Altman 814-865-3791 (voice)
>>>Associate Professor
>>>Bioinformatics Consulting Center
>>>Dept. of Statistics 814-863-7114 (fax)
>>>Penn State University 814-865-1348 (Statistics)
>>>University Park, PA 16802-2111