[BioC] Low-level analysis of custom microarrays

Sat Jan 5 00:25:48 CET 2008

Dear Teresa

If I understand your question correctly (please correct me if not), you
want to estimate and adjust for a spatially dependent background signal
(e.g. a "gradient"), and the image analysis software does not provide
that estimate.

Doing this well is hard. "Well" means that you remove the apparent
nuisance trends while keeping the real, biology-related changes in the
intensities.

The print-tip normalisation (e.g. in limma, also via the "strata" 
argument of vsn) is often a good proxy for adjusting spatial trends.

You can also try 2D local regression (the loess function, the locfit
package, or the OLIN package in Bioconductor).
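
A minimal sketch of the 2D local regression idea, using only the loess
function from base R on simulated data. The grid size, the shape of the
gradient, and the span/degree settings are all made-up assumptions for
illustration, not recommendations. A real slide file with decimal commas
could be read with read.delim(file, dec = ",") and its Row and Column
columns used in the same way.

```r
# Simulated single-channel slide on a 20 x 20 grid (all values invented).
set.seed(1)
grid <- expand.grid(Row = 1:20, Column = 1:20)

# log2 intensity = flat level + smooth spatial gradient + noise
grid$logI <- 8 + 0.05 * grid$Row + 0.03 * grid$Column +
  rnorm(nrow(grid), sd = 0.2)

# 2D local regression of log intensity on slide position.
# family = "symmetric" makes the fit robust, so a few very bright
# spots have less influence on the estimated surface.
fit <- loess(logI ~ Row + Column, data = grid,
             span = 0.5, degree = 1, family = "symmetric")

# Subtract the fitted surface but keep the overall intensity level.
grid$trend    <- fitted(fit)
grid$adjusted <- grid$logI - grid$trend + median(grid$trend)
```

It is worth looking at grid$trend as an image over Row and Column before
trusting the adjustment: if strongly expressed probes cluster in one
region of the slide, the fitted surface may absorb real signal.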

References:
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1523216
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=126873
http://bioinformatics.oxfordjournals.org/cgi/content/full/21/8/1724

   Best wishes
	Wolfgang

> "Teresa Colombo" <teresa.colombo at gmail.com> writes:
> 
>> Hi Dear BioC list,
>>
>> I am a newbie and apologize in advance if this is just a stupid
>> question. But I've been stuck on this problem for 2 weeks now and
>> don't know how to move on from here without a little help...
>>
>> my TASK: Perform background correction on (miRNA) microarray data from
>> a custom chip, taking into account slide spatial info (no simple
>> subtraction of background intensities).
> 
> Others on the list can speak more knowledgeably than me about these
> things, so please take my input lightly.
> 
> I guess these are two-color Agilent arrays. The data you present below
> is not from the 'gpr' files required to perform 'background
> correction', but from some later point in the analysis that you must
> determine (because the data has likely already had some kinds of data
> transformation applied).
> 
> Background correction usually involves transformations of individual
> spot foreground and background intensities, perhaps taking into
> account some properties of all spots in the array but not usually
> spatial location. Your data do not include foreground and background
> information for each channel (probably some background correction
> method has already been applied), so background correction cannot be
> performed.
> 
> Spatial effects might typically be accommodated by within-array
> normalization. These methods attempt to make the difference in
> (background-corrected) channel intensities ('M' values)
> statistically independent of the average intensities ('A' values) of
> each spot. The usual methods implicitly incorporate spatial variation
> (as a factor contributing to variation in 'A').
> 
> An important assumption is that expression of the majority of spots
> does not differ between channels. This may not be the case for your
> miRNA arrays. miRNAs also likely exhibit significant dye effects, and
> these need to be accommodated.
> 
> A starting point for two-color analyses is the limma package and its
> comprehensive user guide. limma would take you from gpr files through
> background correction, normalization, and assessment of differential
> expression, though again miRNAs require special consideration.
> 
> Hope that helps.
> 
> Martin
> 
>> my INPUT DATA FORMAT: For each slide, a tab delimited text file
>> carrying the following info:
>> "Probe_ID"      "Row"   "Column"        "Density_mean_{A}"      "Density_st.dev._ {A}"
>>
>> For example, the following are the first 5 lines for one of the slides:
>> "empty" 1       1       174,2   8,57
>> "hsa-let-7a"    1       2       49522,89        343,1
>> "hsa-miR-150"   1       3       40738,46        677,54
>> "hsa-miR-204"   1       4       209,61  15,48
>> "hsa-miR-32"    1       5       223,07  15,24
>>
>> There are 7 replicates for each experimental probe + many internal
>> control probes (row.names are not unique).
>>
>> my QUESTION:
>> Is there any R package/function available to perform background
>> correction that takes the slide design/spatial info into account and
>> can be used with this kind of raw input data (i.e., neither .CEL
>> files nor Illumina input data)?
>>
>> my R version - attached packages:
>>> sessionInfo()
>> R version 2.4.0 Patched (2006-11-25 r39997)
>> i486-pc-linux-gnu
>>
>> locale:
>> LC_CTYPE=it_IT@euro;LC_NUMERIC=C;LC_TIME=it_IT@euro;LC_COLLATE=it_IT@euro;LC_MONETARY=it_IT@euro;LC_MESSAGES=it_IT@euro;LC_PAPER=it_IT@euro;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=it_IT@euro;LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] "tools"     "stats"     "graphics"  "grDevices" "utils"     "datasets"
>> [7] "methods"   "base"
>>
>> other attached packages:
>>     affy   affyio  Biobase
>> "1.12.2"  "1.2.0" "1.12.2"
>>
>>
>> Thank you in advance for your help and time!
>>
>> Best,
>> teresa
>>