[BioC] Unable to 'standardise' logtransformed dataset of contrasts using Mfuzz package
FRANKLIN JOHNSON [guest]
guest at bioconductor.org
Wed Jun 26 23:17:20 CEST 2013
Dear Maintainer,
I'm analyzing metabolomic LC-MS intensity values.
To reduce these large numbers, I log-transformed the dataset. I then replaced any Inf/-Inf/NA values with zero:
> logtransfo[!is.finite(logtransfo)]<-0
and checked for NAs:
> is.na(logtransfo)
CA.2wk11.H11 CA.4wk11.CA.2wk11 CA.8wk11.CA.4wk11 CA.12wk11.CA.8wk11
123.1166295 FALSE FALSE FALSE FALSE
109.1012434 FALSE FALSE FALSE FALSE
All cells printed FALSE.
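For reference, the same check can be done without printing the whole matrix. This is a minimal sketch on a made-up toy matrix; the name `intens`, its values, and the choice of log2 are assumptions for illustration, not taken from the data in this post:

```r
# Toy intensity matrix (made-up values, not the LC-MS data from this post)
intens <- matrix(c(1200, 0, 530, NA, 88, 4100), nrow = 2,
                 dimnames = list(c("m1", "m2"), c("s1", "s2", "s3")))

# log2 is an assumption here; log2(0) gives -Inf and log2(NA) gives NA
logtransfo <- log2(intens)

# Count the Inf/-Inf/NA/NaN cells before cleaning
n_bad <- sum(!is.finite(logtransfo))
n_bad

# Same replacement as in the post
logtransfo[!is.finite(logtransfo)] <- 0

anyNA(logtransfo)            # a single TRUE/FALSE instead of a full matrix
all(is.finite(logtransfo))   # TRUE once every cell is a finite number
```

Note that these checks only look for missing or infinite values; they say nothing about whether the values are stored as numbers or as text.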
I made contrasts using the limma package.
> head(wCA12m)
CA.2wk11.H11 CA.4wk11.CA.2wk11 CA.8wk11.CA.4wk11 CA.12wk11.CA.8wk11
123.1166295 " 0.018961357" "-0.091637119" " 3.268257162" "-1.025643391"
109.1012434 " 0.146168274" "-0.055655014" " 3.172041095" "-0.969301615"
I then made an expression set:
> wCA12me
ExpressionSet (storageMode: lockedEnvironment)
assayData: 124 features, 4 samples
element names: exprs
protocolData: none
phenoData: none
featureData: none
experimentData: use 'experimentData(object)'
Annotation:
> is.na(exprs(wCA12me))
CA.2wk11.H11 CA.4wk11.CA.2wk11 CA.8wk11.CA.4wk11 CA.12wk11.CA.8wk11
123.1166295 FALSE FALSE FALSE FALSE
109.1012434 FALSE FALSE FALSE FALSE
I loaded Mfuzz with library(Mfuzz) and followed the steps in the manual.
> wCA12me.r=filter.NA(wCA12me)
0 genes excluded.
> wCA12me.f=fill.NA(wCA12me.r, mode="knn") #after failing to standardise, I also tried using the other mode options.
I could get a nice plot with "knn" and "knnw", but using "mean" and "median" gave an error for fill.NA.
> tmp=filter.std(wCA12me, min.std=0)
0 genes excluded.
I also changed the min.std value:
> tmp=filter.std(wCA12me, min.std=2)
67 genes excluded.
Whatever values I use for mode and min.std, I always get the same error message when calling 'standardise':
> wCA12me.s=standardise(wCA12me.f)
Error in data[i, ] - mean(data[i, ], na.rm = TRUE) :
non-numeric argument to binary operator
In addition: Warning message:
In mean.default(data[i, ], na.rm = TRUE) :
argument is not numeric or logical: returning NA
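The error text refers to a non-numeric argument rather than to NA, and the head(wCA12m) printout above shows every value in quotes, which is how R displays a character matrix. Below is a minimal base-R sketch of that situation, assuming the matrix really is stored as character. The toy values are copied from the printout; `row_standardise` is a hypothetical stand-in that mimics the row-wise centring shown in the error message and is not Mfuzz's own code:

```r
# Two rows copied from the head(wCA12m) printout, stored as character strings
m <- matrix(c(" 0.018961357", " 0.146168274",
              "-0.091637119", "-0.055655014",
              " 3.268257162", " 3.172041095",
              "-1.025643391", "-0.969301615"),
            nrow = 2,
            dimnames = list(c("123.1166295", "109.1012434"),
                            c("CA.2wk11.H11", "CA.4wk11.CA.2wk11",
                              "CA.8wk11.CA.4wk11", "CA.12wk11.CA.8wk11")))

is.numeric(m)   # FALSE: the values are text
anyNA(m)        # FALSE: text is not NA, so an is.na() check passes anyway

# Hypothetical stand-in for row-wise standardisation (mean 0, sd 1 per row)
row_standardise <- function(data) {
  for (i in seq_len(nrow(data))) {
    data[i, ] <- (data[i, ] - mean(data[i, ], na.rm = TRUE)) /
      sd(data[i, ], na.rm = TRUE)
  }
  data
}

# On the character matrix the subtraction fails, as in the post
res <- suppressWarnings(try(row_standardise(m), silent = TRUE))
inherits(res, "try-error")

# Changing the storage mode keeps the dimensions and dimnames
m_num <- m
storage.mode(m_num) <- "double"
m_std <- row_standardise(m_num)
round(rowMeans(m_std), 10)   # each row mean is 0 up to rounding
```

If this is what is happening, the equivalent check on the ExpressionSet would be is.numeric(exprs(wCA12me)), and the conversion would need to happen on the matrix before the ExpressionSet is built.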
I have checked my file several times and no data points are NA. I think I understand what the error is saying, but I didn't expect negative values to affect the clustering algorithm. I was able to run the complete workflow with non-transformed values; however, the transformed values give slightly different results, and I wanted to compare the non-transformed and log-transformed datasets.
This being LC-MS metabolomic data, could I use a different transformation that does not produce negative values?
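As a point of reference for this question, one common option is a shifted log such as log2(x + 1), which is non-negative whenever the raw intensities are non-negative. This is only a sketch on made-up values (the vector `intens` is hypothetical), and contrasts, being differences between groups, can still be negative whatever transformation is used:

```r
# Made-up raw intensities, not the data from this post
intens <- c(0, 1, 15, 1023, 250000)

# Shifted log: zero intensity maps to 0 instead of -Inf
shifted <- log2(intens + 1)
shifted

all(shifted >= 0)   # TRUE for non-negative input

# A difference of two transformed values can still be negative
shifted[2] - shifted[3]
```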
Thanks for your attention.
Regards,
Franklin
-- output of sessionInfo():
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] tcltk parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] limma_3.16.5 Mfuzz_2.18.0 DynDoc_1.38.0
[4] widgetTools_1.38.0 e1071_1.6-1 class_7.3-7
[7] Biobase_2.20.0 BiocGenerics_0.6.0 BiocInstaller_1.10.2
loaded via a namespace (and not attached):
[1] tkWidgets_1.38.0 tools_3.0.1
--
Sent via the guest posting facility at bioconductor.org.