[BioC] Unable to 'standardise' log-transformed dataset of contrasts using Mfuzz package

Wed Jun 26 23:17:20 CEST 2013

Dear Maintainer,

I'm analyzing metabolomic LC-MS intensity values.
To reduce these large numbers, I log-transformed the dataset. I then replaced any Inf/-Inf/NA with zero:
> logtransfo[!is.finite(logtransfo)]<-0
and checked for NAs:
> is.na(logtransfo)
            CA.2wk11.H11 CA.4wk11.CA.2wk11 CA.8wk11.CA.4wk11 CA.12wk11.CA.8wk11
123.1166295        FALSE             FALSE             FALSE              FALSE
109.1012434        FALSE             FALSE             FALSE              FALSE

All cells printed FALSE.
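
A more compact version of the same check, sketched on a made-up toy matrix rather than my real data. The names `raw` and `toy` and the use of log2 are assumptions for illustration only; the idea is to summarise the whole matrix instead of printing every cell, and to confirm the type as well:

```r
# Toy intensities (illustrative values only); the zero becomes -Inf after log2
raw <- matrix(c(1200, 0, 530, 88), nrow = 2)
toy <- log2(raw)

# Same replacement as above: every Inf/-Inf/NA/NaN becomes 0
toy[!is.finite(toy)] <- 0

# Summaries instead of a cell-by-cell printout
n_missing  <- sum(is.na(toy))        # number of NA/NaN cells
all_finite <- all(is.finite(toy))    # TRUE if no Inf/-Inf/NA remain
is_numeric <- is.numeric(toy)        # TRUE if the matrix holds numbers
```

Note that `is.na()` alone says nothing about whether the values are stored as numbers, which is why the last line is included.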
I made contrasts using the limma package.
> head(wCA12m)
            CA.2wk11.H11   CA.4wk11.CA.2wk11 CA.8wk11.CA.4wk11 CA.12wk11.CA.8wk11
123.1166295 " 0.018961357" "-0.091637119"    " 3.268257162"    "-1.025643391"    
109.1012434 " 0.146168274" "-0.055655014"    " 3.172041095"    "-0.969301615" 
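
I notice that the values above print inside quotation marks, and I have not ruled out that the matrix is stored as character strings rather than numbers. A minimal sketch of how that could be checked and converted, using a made-up toy matrix (`m` and `m_num` are hypothetical names, not objects from my session):

```r
# Toy character matrix mimicking the printout above (two columns only)
m <- matrix(c(" 0.018961357", "-0.091637119",
              " 0.146168274", "-0.055655014"),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("123.1166295", "109.1012434"),
                            c("CA.2wk11.H11", "CA.4wk11.CA.2wk11")))

m_mode <- mode(m)                 # storage type of the values

# Convert a copy to double; dim and dimnames are preserved
m_num <- m
storage.mode(m_num) <- "double"

num_ok <- is.numeric(m_num)       # TRUE once the values are numbers
```

If the same check on the real matrix reports "character", the conversion would need to happen before the ExpressionSet is built.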

Made an expression set:
> wCA12me
ExpressionSet (storageMode: lockedEnvironment)
assayData: 124 features, 4 samples 
  element names: exprs 
protocolData: none
phenoData: none
featureData: none
experimentData: use 'experimentData(object)'
Annotation:  

> is.na(exprs(wCA12me))
            CA.2wk11.H11 CA.4wk11.CA.2wk11 CA.8wk11.CA.4wk11 CA.12wk11.CA.8wk11
123.1166295        FALSE             FALSE             FALSE              FALSE
109.1012434        FALSE             FALSE             FALSE              FALSE

I loaded library(Mfuzz), and went through the steps as indicated in the manual.

> wCA12me.r=filter.NA(wCA12me)
0 genes excluded.
> wCA12me.f=fill.NA(wCA12me.r, mode="knn") #after failing to standardise, I also tried using the other mode options.

I could get a nice plot with "knn" and "knnw", but using "mean" and "median" gave an error for fill.NA. 

> tmp=filter.std(wCA12me, min.std=0)
0 genes excluded.

I also changed the min.std value:
> tmp=filter.std(wCA12me, min.std=2)
67 genes excluded.

Whichever values I choose for mode and min.std,
I always get the same error message from the call to 'standardise':

> wCA12me.s=standardise(wCA12me.f)
Error in data[i, ] - mean(data[i, ], na.rm = TRUE) : 
  non-numeric argument to binary operator
In addition: Warning message:
In mean.default(data[i, ], na.rm = TRUE) :
  argument is not numeric or logical: returning NA
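
For reference, the subtraction named in the error is the per-row centring step. Below is a base-R sketch of row-wise standardisation (mean 0, standard deviation 1 per row) on a made-up numeric matrix that includes negative values. This reflects my understanding of what standardise aims to do; it is not the package's code, and `row_standardise` and `toy` are hypothetical names:

```r
# Toy numeric matrix with negative entries (illustrative values only)
toy <- matrix(c(0.02, -0.09, 3.27, -1.03,
                0.15, -0.06, 3.17, -0.97),
              nrow = 2, byrow = TRUE)

# Centre each row on its mean, then scale by the row standard deviation
row_standardise <- function(x) {
  centred <- x - rowMeans(x, na.rm = TRUE)   # recycles one mean per row
  centred / apply(x, 1, sd, na.rm = TRUE)    # recycles one sd per row
}

toy_s <- row_standardise(toy)
row_means <- rowMeans(toy_s)       # close to 0 for every row
row_sds   <- apply(toy_s, 1, sd)   # close to 1 for every row
```

The arithmetic itself only requires that the values are numeric; their sign is not used anywhere in this sketch.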

I checked my file several times and found no NA values. I think I understand what the error is saying, but I didn't expect negative values to affect the clustering algorithm. I was able to complete the analysis with non-transformed values; however, the transformed values give slightly different results, and I wanted to compare the non-transformed and log-transformed datasets.

Since this is LC-MS metabolomic data, could I use a different function to transform the data so that I do not get negative values?

Thanks for your attention.
Regards,
Franklin 

 -- output of sessionInfo(): 

R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] tcltk     parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
[1] limma_3.16.5         Mfuzz_2.18.0         DynDoc_1.38.0       
[4] widgetTools_1.38.0   e1071_1.6-1          class_7.3-7         
[7] Biobase_2.20.0       BiocGenerics_0.6.0   BiocInstaller_1.10.2

loaded via a namespace (and not attached):
[1] tkWidgets_1.38.0 tools_3.0.1    

--
Sent via the guest posting facility at bioconductor.org.