[BioC] Fwd: Help on PLGEM R Package Data Import

Thu Sep 22 04:42:45 CEST 2011

I am reposting to the mailing list as it was bounced the first time.
Sorry if you received this more than once.

Best,
Norman

---------- Forwarded message ----------
From: Norman Pavelka <normanpavelka at gmail.com>
Date: 2011/9/22
Subject: Re: Help on PLGEM R Package Data Import
To: Wu Qi <qwu at dicp.ac.cn>
Cc: bioconductor at stat.math.ethz.ch, mattia.pelizzola at gmail.com

Dear Qi,

Thank you for your interest in PLGEM!

I am CC'ing my reply to the Bioconductor mailing list, as this is the
best forum to address your question. I strongly recommend that you
subscribe and send further queries there. (You may always CC me to get
a more rapid response.)

Your question is about how to load your data into R/Bioconductor.
Since the object that PLGEM needs as input is of class
'ExpressionSet', you'll have to learn how to build such an object in
R. Doing it from scratch is a bit cumbersome, but you can use the
function 'readExpressionSet' from package Biobase to make your life
easier. Type the following at your R prompt to get the help page:

library(Biobase)
?readExpressionSet

For PLGEM, you will only need a single 'exprsFile' and a single 'phenoDataFile':

* The 'exprsFile' is going to be a tab-delimited text file in which
the first column contains your protein identifiers and the subsequent
columns contain NSAF values from the various MS runs you performed. Be
sure to put a meaningful header on top of each column (except for the
first column). Do not use any spaces or special characters in your
column headers, though, because it will cause some problems. For those
proteins that were not identified in all your runs, replace the
missing values with a zero.

* The 'phenoDataFile', in turn, is a tab-delimited text file that
describes the columns of your 'exprsFile', i.e. your experimental
design. Note that the row names of the 'phenoDataFile' need to exactly
match the column names of the 'exprsFile'.
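
As a rough illustration of the two file layouts described above, here is a
minimal sketch in base R that writes a toy 'exprsFile' and 'phenoDataFile'
and checks that their names match. The protein IDs, run names, file names,
the 'conditionName' column and all NSAF values are made up for this
example; they are assumptions, not part of PLGEM or Biobase.

```r
# Hypothetical toy data: 5 proteins x 4 MS runs of made-up NSAF values.
set.seed(1)
runs <- c("ctrl_1", "ctrl_2", "treated_1", "treated_2")  # no spaces or special characters
nsaf <- matrix(runif(20), nrow = 5,
               dimnames = list(paste0("protein", 1:5), runs))
nsaf["protein3", "treated_2"] <- 0  # protein not identified in this run: use zero

# One row per MS run; row names must equal the column names of the NSAF table.
# 'conditionName' is an assumed column name for the experimental condition.
pheno <- data.frame(conditionName = c("ctrl", "ctrl", "treated", "treated"),
                    row.names = runs)

# Write both tables as tab-delimited text. write.table() leaves the header
# of the first (identifier) column empty, as described above.
exprs_path <- file.path(tempdir(), "my-exprsFile.txt")
pheno_path <- file.path(tempdir(), "my-phenoDataFile.txt")
write.table(nsaf, exprs_path, sep = "\t", quote = FALSE)
write.table(pheno, pheno_path, sep = "\t", quote = FALSE)

# Read the files back and verify the name-matching rule before going further.
exprs_check <- read.table(exprs_path, sep = "\t", header = TRUE, row.names = 1)
pheno_check <- read.table(pheno_path, sep = "\t", header = TRUE, row.names = 1)
names_match <- identical(colnames(exprs_check), rownames(pheno_check))
stopifnot(names_match)
```

If the check passes, the two paths can be handed to 'readExpressionSet' as
the 'exprsFile' and 'phenoDataFile' arguments.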

To make it easier, I'm attaching an example with some random numbers.
Copy these two files into your working directory and run the following
code:

library(plgem)
eset <- readExpressionSet("example-exprsFile.txt", "example-phenoDataFile.txt")
plgemResult <- run.plgem(eset)

(Of course the results are going to look awful, because I just put in
some random numbers...)

Please direct further queries to the Bioconductor mailing list. Good
luck and let me know how it worked!

Cheers,
Norman

On Wed, Sep 21, 2011 at 10:37 AM, Wu Qi <qwu at dicp.ac.cn> wrote:
> Dear Norman,
>
>
>
> My name is Qi Wu, I’m a Chinese student working on quantitative proteomics,
> recently your PLGEM algorithm interested me. It seems a better choice than
> conventional t test.
>
> I’m a beginner in statistics, after installing PLGEM R package, I followed
> the instruction on “An introduction to PLGEM, Mattia Pelizzola and Norman
> Pavelka, April 13, 2011” running the wrapper mode and got the sample
> figures. But I don’t know how to import my own data. I couldn’t open the
> sample data named “LPSeset” using Excel or UltraEdit, so I had no idea how
> the data was organized. Now I could generate replicate Excel or plain text
> files containing proteins abundance values of different status, could you
> tell me how can I import such data in PLGEM R package and get a list
> containing those significantly changed proteins? I searched the internet for
> quite a long time and got nothing.
>
> Thanks very much for contributing your wonderful algorithm, your reply is
> high appreciated.
>
>
>
> Best Regards,
>
> Qi Wu
>
>