[BioC] Communicating gene-probe relationships to sam

Charles Crane ccrane at purdue.edu
Fri May 5 19:15:40 CEST 2006


Dear newsgroup:
    I am trying to use sam to identify single-feature polymorphisms between
two barley cultivars.  I am running Bioconductor 1.7, Biobase 1.8.0, affy
1.8.1, and siggenes 1.4.0 under Mac OS X 10.3.9.  I can successfully extract
probe-level data from my CEL files, background correct the pm values with
rma, quantile normalize, and isolate the pm intensities in a data frame, but
both sam and sam.snp consistently and instantly fail with "Error in
FUN(data, cl, ...) : There should be at least 25 samples."  This happens
even though there are 22840 genes and 11 probes per gene in the data.  What
am I overlooking?  Does sam need some sort of a mapping from the 11 probes
to their parent gene?  A session history follows.
    Thank your very much for your time.  Also, do you plan to include
directions for using sam for SNP analysis, going all the way from ReadAffy
through sam for a case history, in a future vignette?  It would save the
uninitiated a lot of time and frustration.

Charles Crane
USDA-ARS, MWA, Crop Production and Pest Control Research Unit
And Department of Botany and Plant Pathology, Purdue University

> library(affy)
> library(siggenes)
> sidata <- ReadAffy(filenames = c("000113_5hr_kindered_H20_inoc_2x2.CEL",
"000114_12hr_kindered_H20_inoc_3bx1.CEL",
+  "000122_5hr_peruvian_H20_inoc_14bx1.CEL",
"000123_12hr_peruvian_H20_inoc_15bx1.CEL",
"000141_Non-host-resistant-barley_16bx2.CEL",
+  "000157_24hr_Kindered_H2O_Inoc_1.CEL",
"000158_0hr_Kindered_H2O_Inoc_4.CEL",
"000159_0hr_Kindered_S_passerini_inoc_5.CEL",
+  "000161_0hr_Peruvian_H2O_inoc_13.CEL",
"000162_0hr_Peruvian_S_passerini_17.CEL",
"000189_0hr_403-rep2_H2O_inoc_21b.CEL",
+  "000190_5hr_403-rep2_H2O_inoc_22b.CEL",
"000191_12hr_403-rep2_H2O_inoc_23b.CEL",
"000192_24hr_403-rep2_H2O_inoc_24b.CEL",
+  "000193_0hr_403-rep2_inoc-s_pass_25b.CEL",
"000201_0hr_405-rep2_H2O_inoc_33b.CEL",
"000202_5hr_405-rep2_H2O_inoc_34b.CEL",
+  "000203_12hr_405-rep2_H2O_inoc_35b.CEL",
"000204_24hr_405-rep2_H2O_inoc_36b.CEL",
"000205_0hr_405-rep2_inoc-s_pass_37b.CEL"),
+  sampleNames = c("000113_5hr_kindered_H20_inoc_2x2",
"000114_12hr_kindered_H20_inoc_3bx1", "000122_5hr_peruvian_H20_inoc_14bx1",
+  "000123_12hr_peruvian_H20_inoc_15bx1",
"000141_Non-host-resistant-barley_16bx2", "000157_24hr_Kindered_H2O_Inoc_1",
+  "000158_0hr_Kindered_H2O_Inoc_4",
"000159_0hr_Kindered_S_passerini_inoc_5", "000161_0hr_Peruvian_H2O_inoc_13",
+  "000162_0hr_Peruvian_S_passerini_17", "000189_0hr_403-rep2_H2O_inoc_21b",
"000190_5hr_403-rep2_H2O_inoc_22b",
+  "000191_12hr_403-rep2_H2O_inoc_23b", "000192_24hr_403-rep2_H2O_inoc_24b",
"000193_0hr_403-rep2_inoc-s_pass_25b",
+  "000201_0hr_405-rep2_H2O_inoc_33b", "000202_5hr_405-rep2_H2O_inoc_34b",
"000203_12hr_405-rep2_H2O_inoc_35b",
+  "000204_24hr_405-rep2_H2O_inoc_36b",
"000205_0hr_405-rep2_inoc-s_pass_37b"),
+  phenoData = "Hordeumphenodata.txt")
> signames <- geneNames(sidata)
> sirma <- bg.correct(sidata, method = "rma")
> sinorm <- normalize(sirma)
> siprobeintensities <- as.data.frame(probes(sinorm, "pm"))
> write.table(siprobeintensities, file = "siprobeintensities.txt")
> sicl <- c(1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2)
> samout <- sam(siprobeintensities, sicl, method = "cat.stat", gene.names =
dimnames(siprobeintensities)[[1]], B = 1000, ran = 11279)
Error in FUN(data, cl, ...) : There should be at least 25 samples.
> sigenenames <- scan(file = "sigenenamestrimmed.txt", what = 'character')
Read 251437 items
> sigenenames[1:22]
 [1] "1200459_Reg_88-1740_at"  "1200459_Reg_88-1740_at"
 [3] "1200459_Reg_88-1740_at"  "1200459_Reg_88-1740_at"
 [5] "1200459_Reg_88-1740_at"  "1200459_Reg_88-1740_at"
 [7] "1200459_Reg_88-1740_at"  "1200459_Reg_88-1740_at"
 [9] "1200459_Reg_88-1740_at"  "1200459_Reg_88-1740_at"
[11] "1200459_Reg_88-1740_at"  "1289374_Reg_826-1545_at"
[13] "1289374_Reg_826-1545_at" "1289374_Reg_826-1545_at"
[15] "1289374_Reg_826-1545_at" "1289374_Reg_826-1545_at"
[17] "1289374_Reg_826-1545_at" "1289374_Reg_826-1545_at"
[19] "1289374_Reg_826-1545_at" "1289374_Reg_826-1545_at"
[21] "1289374_Reg_826-1545_at" "1289374_Reg_826-1545_at"
> samout <- sam(siprobeintensities, sicl, method = "cat.stat", gene.names =
sigenenames, B = 1000, ran = 11279)
Error in FUN(data, cl, ...) : There should be at least 25 samples.



More information about the Bioconductor mailing list