[BioC] Using Limma for proteomics (2D DIGE) datasets

Thu Oct 22 08:04:10 CEST 2009

Hi Tom,

I have checked limmaGUI and I'm not sure why it is not reading your data 
files correctly when you use the Other format option. I shall look into 
that later. However, I suggest, as Elmer and Tobias do, that you read 
the limma Users Guide and use limma at the command line. I have read 
the data files you attached using the code below. You should be 
able to proceed from there, though I will leave it to others to 
advise on the statistical approach, as I'm a programmer rather than a 
statistician.

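 ># load limma and move to the folder holding the .spot files
 >library(limma)
 >setwd("wherever you stored your data")
 >targets <- readTargets("Brain_kol56_targets2.txt")
 ># no file names are passed, so every *.spot file in the working directory is read;
 ># with R=Cy3 and G=Cy5, M = log2(R/G) will be the Cy3/Cy5 log-ratio
 >RG <- read.maimages(columns=list(R="Cy3",Rb="Cy3_b",G="Cy5",Gb="Cy5_b"),ext="spot")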

Looking at your RGList, I get:
 >RG
An object of class "RGList"
$R
     brain_56_g1 brain_56_g2
[1,]          NA          NA
[2,]          NA          NA
[3,]          NA          NA
[4,]          NA          NA
[5,]          NA          NA
3379 more rows ...

$Rb
     brain_56_g1 brain_56_g2
[1,]           0           0
[2,]           0           0
[3,]           0           0
[4,]           0           0
[5,]           0           0
3379 more rows ...

$G
     brain_56_g1 brain_56_g2
[1,]          NA          NA
[2,]          NA          NA
[3,]          NA          NA
[4,]          NA          NA
[5,]          NA          NA
3379 more rows ...

$Gb
     brain_56_g1 brain_56_g2
[1,]           0           0
[2,]           0           0
[3,]           0           0
[4,]           0           0
[5,]           0           0
3379 more rows ...

$targets
               FileName
brain_56_g1 brain_56_g1
brain_56_g2 brain_56_g2

$source
[1] "generic"
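
As a rough sketch of how to proceed from an RGList like this one, assuming the foreground columns hold numeric intensities rather than NA, the log-ratios can be computed without any normalization as below. The intensities, spot names and gel names are made up for illustration and are not taken from your files.

```r
library(limma)

# Made-up intensities standing in for the real .spot files:
# 6 spots on 2 gels (values are illustrative only).
set.seed(1)
R <- matrix(2^rnorm(12, mean = 10), nrow = 6,
            dimnames = list(paste0("spot", 1:6), c("gel1", "gel2")))
G <- matrix(2^rnorm(12, mean = 10), nrow = 6, dimnames = dimnames(R))
zero <- matrix(0, nrow = 6, ncol = 2, dimnames = dimnames(R))

# Same components as the RGList printed above: foregrounds plus zero backgrounds.
RG <- new("RGList", list(R = R, Rb = zero, G = G, Gb = zero))

# Convert to log-ratios (M) and average log-intensities (A)
# without applying any within-array normalization.
MA <- normalizeWithinArrays(RG, method = "none")

MA$M   # log2(R/G) per spot and gel
```

MA$M is an ordinary numeric matrix, so it can be passed on to lmFit.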

cheers,

Keith

========================
Keith Satterley
Bioinformatics Division
The Walter and Eliza Hall Institute of Medical Research
Parkville, Melbourne,
Victoria, Australia
=======================

Elmer Fernández wrote:
> Hi Tom
> Yes, you can use limma by hand. You should build the data matrix and the
> design matrix by hand. I have already done this and it works fine. The use of
> Cy2 is highly controversial: some people say that it adds noise, but most
> people use the Cy2 information. I recommend trying different approaches and
> using a consensus between them.
> Below you will find some good references. The first one uses limma.
>
> Best regards
> Elmer
>
>
> Kultima K, Scholz B, Alm H, Sköld K, Svensson M, Crossman AR, Bezard E,
> Andrén PE, Lönnstedt I (2006) Normalization and expression changes in
> predefined sets of proteins using 2D gel electrophoresis: A proteomic study
> of L-DOPA induced dyskinesia in an animal model of Parkinson's disease using
> DIGE. BMC Bioinformatics 7:475
>
> Fernández EA, Girotti MR, López JA, Llera AS, Podhajcer OL, Cantet RJC,
> Balzarini M (2008) Improving 2D-DIGE protein expression analysis by two-stage
> linear mixed models: Assessing experimental effects in a melanoma cell study.
> Bioinformatics, Advance Access published on September 25, 2008;
> doi:10.1093/bioinformatics/btn508
> http://www.ncbi.nlm.nih.gov/pubmed/18818217
>
> On Wed, Oct 21, 2009 at 8:33 AM, Tobias Straub
> <tstraub at med.uni-muenchen.de> wrote:
>
>   
>> hi tom,
>>
>> you can easily call limma without having to construct complex objects such
>> as MAList or the like.
>> have a look at ?lmFit
>>
>> if you are able to construct a matrix of either cy3/cy5 ratios or simply
>> the individual channels you are on the right track. if you set the rownames of
>> your matrix to your protein names you will even get back a meaningful output
>> from topTable.
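
A minimal sketch of the matrix approach described here, using simulated log-ratios. The protein names, gel names and effect sizes are invented, and the single-column design assumes one Cy3/Cy5 log-ratio per gel with no dye swaps; a real experiment will probably need a different design matrix.

```r
library(limma)

# Made-up log2(Cy3/Cy5) ratios: 100 proteins on 4 gels (illustrative only).
set.seed(42)
M <- matrix(rnorm(400, sd = 0.3), nrow = 100,
            dimnames = list(paste0("protein", 1:100), paste0("gel", 1:4)))
M[1:5, ] <- M[1:5, ] + 2      # five proteins given a true log2 ratio of 2

# One log-ratio per gel and no dye swaps: a single column of ones,
# so the coefficient is the mean log-ratio of each protein.
design <- matrix(1, nrow = ncol(M), ncol = 1,
                 dimnames = list(NULL, "Cy3vsCy5"))

fit <- lmFit(M, design)       # one linear model per protein (row)
fit <- eBayes(fit)            # moderated t-statistics

# Ten top-ranked proteins, identified by the rownames of M.
top <- topTable(fit, coef = 1, number = 10)
top
```
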
>>
>> another alternative would be to use a wrapper around lmFit such as provided
>> in the 'st' package. advantage here is that you can easily switch to other t
>> statistics such as studentt, efront, sam etc.
>>
>> why do you think that moderated t is 'better' than student t? any evidence?
>>
>> best
>> T.
>>
>>
>>
>> On Oct 18, 2009, at 2:05 PM, Tom Wenseleers wrote:
>>
>>> Dear all,
>>> I am interested in using Limma/LimmaGUI for the analysis of proteomics (2D
>>> DIGE) datasets. I have had a try with LimmaGUI, however, I seem to keep on
>>> getting the message "limmaGUI was unable to read in the gene list from the
>>> raw (image analysis) files." (although the Cy3 and Cy5 data are read in
>>> fine) - any idea what I could be doing wrong? Main thing is I don't know how
>>> I should call the gene (well protein) list column... I attach a couple of
>>> the .spot files and the targets file for your info. I chose
>>> File...New...selected my targets file, then Type of Image processing
>>> file...Other Red foreground: Cy5, Red background: Cy5_b (I just put in zeros
>>> - hope that is OK?), Green foreground: Cy3, Green background: Cy3_b.
>>> I have the latest Windows version of R and all the bioconductor packages
>>> installed. (Incidentally, during installation it complained about the sma
>>> package not being there - that appears to be no longer supported - but is
>>> still used in e.g. Limma, so this I guess needs sorting out - I installed an
>>> older archived version).
>>>
>>> In proteomics we also have the Cy2 channel which is an internal control
>>> based on a pooled sample that is identical for all gels - but I think I do
>>> not have to use that since I am interested in the Cy3/Cy5 Log ratios and
>>> (Cy3/Cy2) / (Cy5/Cy2)=Cy3/Cy5, ie I think the Cy2 channel would only
>>> introduce additional noise.
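
The cancellation can be checked numerically. This sketch uses made-up intensities for three spots on a single gel; it only shows the arithmetic for a within-gel ratio and says nothing about whether Cy2 helps when single channels are compared across gels.

```r
# Made-up raw intensities for three spots on one gel (illustrative only).
cy2 <- c(1200, 850, 4000)   # pooled internal standard
cy3 <- c(2400, 800, 1000)
cy5 <- c(1200, 1600, 4000)

# Log-ratio of the two Cy2-standardized channels ...
m_standardized <- log2(cy3 / cy2) - log2(cy5 / cy2)

# ... versus the direct Cy3/Cy5 log-ratio.
m_direct <- log2(cy3 / cy5)

# Compare the two, allowing for floating-point rounding.
all.equal(m_standardized, m_direct)
```
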
>>>
>>> Anyway if any of you would have experience in using Limma for the analysis
>>> of proteomics datasets, please let me know... Right now in the proteomics
>>> community people are mostly using simple t-tests etc, but using moderated t
>>> statistics would obviously be much better...
>>> Maybe there would be some scope for writing a dedicated Bioconductor
>>> package for the analysis of proteomics 2D DIGE data, based on Limma or
>>> LimmaGUI, I don't know... I think most of the code would hardly need any
>>> changing, only the input of the data would need changing and maybe dealing
>>> with missing values could be better too (which is more of an issue in
>>> proteomics than in microarrays) (eg using a few preprocessing and analysis
>>> options, such as to leave them out, substitute by 0 or impute missing values
>>> using k nearest neighbours). I think this could greatly benefit the
>>> proteomics community.
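
A rough sketch of the first two options, plus a filtering variant, on made-up log-ratios. The data, the protein and gel names, and the three-gel threshold are all assumptions for illustration. k-nearest-neighbour imputation is not shown; the Bioconductor impute package provides impute.knn for that.

```r
library(limma)

# Made-up log-ratios with missing spots: 5 proteins on 4 gels.
set.seed(7)
M <- matrix(rnorm(20), nrow = 5,
            dimnames = list(paste0("protein", 1:5), paste0("gel", 1:4)))
M[1, 2] <- NA            # protein1 missing on one gel
M[2, 1:3] <- NA          # protein2 observed on one gel only

# Option 1: leave the NAs in; lmFit fits each protein using only
# the gels on which it was observed.
fit_na <- lmFit(M)

# Option 2: drop proteins observed on fewer than three gels
# (the threshold is arbitrary).
keep <- rowSums(!is.na(M)) >= 3
M_filtered <- M[keep, , drop = FALSE]

# Option 3: substitute zero; for log-ratios this pulls the
# estimate toward "no change".
M_zero <- M
M_zero[is.na(M_zero)] <- 0
```
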
>>>
>>> cheers,
>>> Tom Wenseleers
>>>
>>>
>>> Dr. T. Wenseleers
>>> Dept. of Biology
>>> Zoological Institute
>>> K.U.Leuven
>>> Naamsestraat 59
>>> B-3000 Leuven
>>> Belgium
>>> tel. +32 (0)16 32 39 64
>>> mobile +32 (0)472 40 45 96
>>> e-mail tom.wenseleers at bio.kuleuven.be
>>> web http://bio.kuleuven.be/ento/wenseleers/twenseleers.htm
>>> <Brain_kol56_targets2.txt><brain_56_g1.spot><brain_56_g2.spot>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>>       
>> ----------------------------------------------------------------------
>> Tobias Straub   ++4989218075439   Adolf-Butenandt-Institute, München D