[BioC] Help on PLGEM R Package Usage

Sat Sep 24 17:39:15 CEST 2011

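You will have to set plotFile=FALSE if you want the fitting evaluation
plot to go to the device you opened with pdf() instead of the default
png file. With plotFile=TRUE, run.plgem() writes its own
fittingEval.png regardless of any device you opened beforehand.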
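
A minimal sketch of that pattern, using the LPSeset example data that
ships with plgem (the file name and page size below are just
illustrative choices, not anything PLGEM requires):

```r
library(plgem)
data(LPSeset)  # example dataset shipped with plgem

# Open the PDF device first; file name and size are illustrative.
pdf("fittingEval.pdf", width = 7, height = 7)

# plotFile=FALSE lets the fitting evaluation plot go to the open device
# instead of being written to the default png file.
plgemOutput <- run.plgem(LPSeset, fitting.eval = TRUE, plotFile = FALSE)

dev.off()  # close the PDF file
```

For your own data you would pass NSAFSet and your other arguments in
place of LPSeset, keeping plotFile=FALSE.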

Also, given the relatively small dataset you are using (~500
proteins), I recommend increasing the number of iterations of the
permutation step. The default Iterations="automatic" uses only 500
iterations in your case; I would suggest setting it to at least 1000.
I don't know if you noticed, but each time you run PLGEM you get
slightly different p-values. This is because the permutation step is
based on random resampling of your data, which differs from run to
run. A larger number of iterations stabilizes the empirical
distribution of resampled STN ratios and therefore makes the p-values
more stable.
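
To illustrate why more iterations help, here is a toy simulation in
base R. It is not PLGEM's actual resampling scheme: the pool of
values, the test statistic and the helper empirical_p() are all made
up for illustration. It only shows that an empirical p-value estimated
from more resamples varies less from run to run.

```r
# Toy illustration only; empirical_p() is a hypothetical helper,
# not a plgem function.
empirical_p <- function(observed, pool, iterations) {
  # Build a null distribution by resampling two pairs from the pool
  # and taking the difference of their means.
  null_stats <- replicate(iterations, {
    s <- sample(pool, 4, replace = TRUE)
    mean(s[1:2]) - mean(s[3:4])
  })
  # Two-sided empirical p-value
  mean(abs(null_stats) >= abs(observed))
}

set.seed(42)
pool <- rnorm(500)   # stand-in for ~500 protein-level values
observed <- 1.5      # arbitrary observed statistic

# Repeat the whole estimate 20 times with few and with many iterations
p_few  <- replicate(20, empirical_p(observed, pool, iterations = 500))
p_many <- replicate(20, empirical_p(observed, pool, iterations = 4000))

# Run-to-run spread of the p-value estimate
cat("sd with 500 iterations: ", sd(p_few), "\n")
cat("sd with 4000 iterations:", sd(p_many), "\n")
```

In run.plgem() itself this corresponds to passing a number, e.g.
Iterations=1000, instead of "automatic".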

That said, if your data do not fit the PLGEM well, there is little
chance you can improve the results by tweaking these other
parameters.
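
If you want a rough feel for the fit, one option is to regress ln(sd)
on ln(mean) across proteins. The sketch below uses simulated data with
a known power law (slope 0.8 and intercept -0.5, both chosen
arbitrarily) and a plain lm() fit. This is a simplification of, not a
substitute for, the package's own fitting evaluation, so treat it only
as a sanity check.

```r
set.seed(123)
n_proteins <- 1000
n_reps <- 4
true_slope <- 0.8        # arbitrary value for the simulation
true_intercept <- -0.5   # arbitrary value for the simulation

# Means spanning four orders of magnitude; sd is a power law of the mean
mu <- 10^runif(n_proteins, min = 2, max = 6)
sigma <- exp(true_intercept) * mu^true_slope

# One row per protein, one column per simulated replicate
sim <- matrix(rnorm(n_proteins * n_reps, mean = mu, sd = sigma),
              nrow = n_proteins)

row_mean <- rowMeans(sim)
row_sd <- apply(sim, 1, sd)
keep <- row_mean > 0 & row_sd > 0   # logs need positive values

# Straight-line fit of ln(sd) against ln(mean)
fit <- lm(log(row_sd[keep]) ~ log(row_mean[keep]))
slope <- unname(coef(fit)[2])
r2 <- summary(fit)$r.squared
cat("slope:", round(slope, 3), " r^2:", round(r2, 3), "\n")
```

A slope well below 0.5, like the 0.291 reported earlier in this
thread, would point to a poor fit.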

Hope this helps!
Norman

On Sat, Sep 24, 2011 at 4:19 PM, Wu Qi <qwu at dicp.ac.cn> wrote:
> Dear Norman,
>
> The dataset is downloaded from Tranche website
> https://proteomecommons.org/dataset.jsp?!=73694 . I haven't gone through the
> experimental details yet.
> When I try to produce high-quality figures following your instructions
> with the commands below, I get a plot whose parameters are quite
> different. I guess this plot is generated with default arguments:
>
> NSAFSet<-readExpressionSet("exprs_NSAF.txt","phenoDataFile.txt")
> pdf()
> NSAFdegList<-run.plgem(NSAFSet, signLev=0.01, rank=100, covariate=1,
> baselineCondition="E", Iterations="automatic", trimAllZeroRows=TRUE,
> zeroMeanOrSD="trim", fitting.eval=TRUE, plotFile=TRUE, writeFiles=FALSE,
> Verbose=TRUE)
> dev.off()
>
> With these commands, I still only get a fittingEval.png, which is very
> small. How can I write the fittingEval plot generated with my own
> arguments to other file formats?
>
>
> -----Original Message-----
> From: Norman Pavelka [mailto:normanpavelka at gmail.com]
> Sent: Saturday, September 24, 2011 1:23 AM
> To: Wu Qi
> Cc: bioconductor at r-project.org
> Subject: Re: Help on PLGEM R Package Usage
>
> Dear Qi,
>
> Thank you for the data and the plots. I think the problem might reside in
> your data. If you do a boxplot of your data you will notice that they do not
> span many orders of magnitude. Here's how you can see for
> yourself:
>
> test <- log10(exprs(NSAFSet))  # log-transform your data
> test[test == -Inf] <- NA     # to remove -Inf values coming from log10(0)
> boxplot(test)
>
> PLGEM fits best when data span several orders of magnitude, whereas in your
> case the NSAF values only span two orders of magnitude. May I ask you which
> proteomics technology you used to generate these data? Is this a whole-cell
> extract or a subproteome?
>
> Cheers,
> Norman
>
> On Sat, Sep 24, 2011 at 12:02 AM, Wu Qi <qwu at dicp.ac.cn> wrote:
>> Dear Norman,
>>
>> Thanks for your quick response, please find my attached files and plot.
>> I really don't understand how to optimize the arguments for every step,
>> and I have more than one dataset that needs evaluation. So could
>> you possibly give me some advice on choosing arguments?
>> The commands for generating this plot are as follows:
>>
>> library(plgem)
>>
>> NSAFSet<-readExpressionSet("exprs_NSAF.txt","phenoDataFile.txt")
>>
>> NSAFdegList<-run.plgem(NSAFSet, signLev=0.01, rank=100, covariate=1,
>> baselineCondition="E", Iterations="automatic", trimAllZeroRows=TRUE,
>> zeroMeanOrSD="trim", fitting.eval=TRUE, plotFile=TRUE, writeFiles=FALSE,
>> Verbose=TRUE)
>>
>> plgem.write.summary(NSAFdegList, prefix="NSAF", verbose=TRUE)
>>
>> Kind Regards,
>> Qi Wu
>>
>> -----Original Message-----
>> From: Norman Pavelka [mailto:normanpavelka at gmail.com]
>> Sent: Friday, September 23, 2011 11:38 PM
>> To: Wu Qi
>> Cc: bioconductor at r-project.org
>> Subject: Re: Help on PLGEM R Package Usage
>>
>> Hi Qi,
>>
>> These fitting values look far outside the optimal range. Do you
>> actually get a straight line in the ln(sd) vs. ln(mean) plot? If not,
>> something might be wrong with how the data were normalized. You may
>> e-mail me your data and/or the fitting evaluation plots offline and I
>> might be able to diagnose the problem.
>>
>> The slope is one of the most important parameters to look at, and it
>> usually should be between 0.5 and 1. The r^2 and Pearson correlation
>> coefficients should be as close to 1 as possible.
>>
>> In order to capture the plots in another file format you can call
>> pdf() prior to run.plgem() to generate a high-quality vector-graphics
>> PDF file. Example:
>>
>> library(plgem)
>> data(LPSeset)
>> pdf()      # this will open a new PDF file called 'Rplots.pdf'
>>            # in your current working directory
>> plgemOutput <- run.plgem(LPSeset)
>> dev.off()  # this will close the PDF file
>>
>> Instead of pdf() above you can try bmp(), jpeg(), tiff() or virtually
>> any other major image file format. Under Windows there is also
>> win.metafile() that generates EMF image file format.
>>
>> Hope this helps!
>> Norman
>>
>> On Fri, Sep 23, 2011 at 11:06 PM, Wu Qi <qwu at dicp.ac.cn> wrote:
>>> Dear Norman,
>>>
>>>
>>>
>>> Thanks for your further advice.
>>>
>>> After applying the arguments you recommended, the parameters for my
>>> NSAF dataset are: slope=0.291, intercept=-5.35, adj.r2=0.636,
>>> Pearson=0.464. Are they horrible?
>>>
>>> Could you tell me which is the most important parameter to assess my
>>> dataset quality?
>>>
>>> And how can I export a high-quality figure (emf format) with these
>>> parameters?
>>> I could only find it in the simplest wrapper mode. When I add
>>> "plotFile=TRUE" to the run.plgem function, I only get a png figure
>>> whose resolution is really poor.
>>>
>>>
>>>
>>> Best Regards,
>>>
>>> Qi Wu
>>
>
>