[Bioc-devel] problem with matchprobes:getProbeDataAffy() -

James W. MacDonald jmacdon at med.umich.edu
Mon Aug 25 17:31:11 CEST 2008


Hi Francesco,

Thanks for the bug report. This was fixed a while ago in the devel 
version. Admittedly this should have been fixed in the release version 
as well, which I will do ASAP.

Best,

Jim



Francesco Ferrari wrote:
> A few days ago I received a couple of bug reports concerning an error
> message that occurs when using the gcrma preprocessing procedure with
> custom probeset definitions (gahgu133acdf and gahgu133aprobe packages).
> 
> The source of the problem is a single probe sequence missing from
> the environment "gahgu133aprobe".
> I also verified that the same problem occurs in the other "probe"
> packages with custom probeset definitions that I am currently
> maintaining, i.e. gahgu133bprobe, gahgu133plus2probe, etc.
> 
> 
> After carefully debugging the package generation procedure, I found
> the likely source of this problem in the function "getProbeDataAffy"
> from the matchprobes package. This function reads the probe table from
> a TXT file to generate the "probetable" object, which is
> subsequently used to create the "probe" package.
> 
> 
> Within the function code, the following lines change the "datafile"
> argument from a character string, i.e. the path to the file,
> to a "connection" to the file itself.
> 
>     if (missing(datafile)) {
>         datafile <- paste(arraytype, "_probe_tab", sep = "")
>     } else {
>         if (is(datafile, "character")) {
>             datafile <- file(datafile, "r")
>             on.exit(close(datafile))
>         }
>     }
> 
> Then, a few lines below, the connection to the file is used first
> to read the header line and then to read the remaining part of the data:
> 
>     head <- scan(datafile, sep = "\t", quiet = TRUE, multi.line = FALSE,
>         nlines = 1, what = "character")
>     dat <- scan(datafile, sep = "\t", quiet = TRUE, multi.line = FALSE,
>         what = what, skip = 1)
> 
> 
> The second call to the "scan()" function misses one of the lines of
> data, so the resulting object is one line short.
> The problem can be solved simply by using the file name instead of a
> connection to access the file. I temporarily solved the problem by
> commenting out the initial part of the function code as follows:
> 
>     if (missing(datafile)) {
>         datafile <- paste(arraytype, "_probe_tab", sep = "")
> #    } else {
> #        if (is(datafile, "character")) {
> #            datafile <- file(datafile, "r")
> #            on.exit(close(datafile))
> #        }
>     }
> 
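
Besides passing the file name, another way to avoid the lost row is to keep
the open connection and drop skip = 1 from the second scan() call. The sketch
below illustrates that idea only; it is not necessarily how the devel version
was fixed. The helper read_probe_tab(), the toy file, and its column names
are hypothetical and not part of matchprobes.

```r
# Hypothetical helper (not part of matchprobes): read the header line and
# the remaining rows from one open connection.
read_probe_tab <- function(path, what) {
  con <- file(path, "r")
  on.exit(close(con))
  header <- scan(con, sep = "\t", quiet = TRUE, multi.line = FALSE,
                 nlines = 1, what = "character")
  # No skip here: the connection is already positioned after the header.
  dat <- scan(con, sep = "\t", quiet = TRUE, multi.line = FALSE,
              what = what)
  names(dat) <- header
  dat
}

# Toy tab-delimited file with made-up values: a header plus three rows.
tab <- tempfile(fileext = ".txt")
writeLines(c("name\tsequence",
             "set1\tACGT",
             "set2\tCGTA",
             "set3\tGTAC"), tab)

res <- read_probe_tab(tab, what = list("", ""))
str(res)
```

Reading both parts from a single connection also avoids opening the file
twice, at the cost of relying on the connection's current position.
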
> 
> I think the problem arises because a connection, unlike a file path,
> "remembers" the last line that was read. The header has already been
> consumed by the first call, so the skip in the second call to scan()
> drops an additional line containing meaningful data.
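
The diagnosis above can be checked on a toy file. This is a minimal sketch
that assumes a made-up tab-delimited table with one header line and three
data rows; the file contents and the variable names are illustrative, and
only the scan() arguments are taken from the quoted code.

```r
# Toy probe table (made-up values): one header line plus three data rows.
tab <- tempfile(fileext = ".txt")
writeLines(c("name\tsequence",
             "set1\tACGT",
             "set2\tCGTA",
             "set3\tGTAC"), tab)
what <- list("", "")

# (a) Open connection: the first scan() consumes the header, so skip = 1
#     in the second scan() discards a data row on top of it.
con <- file(tab, "r")
head_con <- scan(con, sep = "\t", quiet = TRUE, multi.line = FALSE,
                 nlines = 1, what = "character")
dat_con <- scan(con, sep = "\t", quiet = TRUE, multi.line = FALSE,
                what = what, skip = 1)
close(con)

# (b) File path: each scan() reopens the file and starts from the top,
#     so skip = 1 removes only the header.
dat_path <- scan(tab, sep = "\t", quiet = TRUE, multi.line = FALSE,
                 what = what, skip = 1)

length(dat_con[[1]])   # number of rows read through the connection
length(dat_path[[1]])  # number of rows read through the file path
```

Comparing the two lengths shows whether the connection-based read comes up
one row short of the path-based read.
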
> 
> What do you think about this problem and the proposed "diagnosis" and solution?
> 
> All the best,
> Francesco Ferrari
> 
> 
> 
> 
>> sessionInfo()
> R version 2.7.1 (2008-06-23)
> i686-pc-linux-gnu
> 
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=it_IT.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=it_IT.UTF-8;LC_IDENTIFICATION=C
> 
> attached base packages:
> [1] tools     stats     graphics  grDevices utils     datasets  methods
> [8] base
> 
> other attached packages:
> [1] matchprobes_1.12.0   affy_1.18.2          preprocessCore_1.2.0
> [4] affyio_1.8.0         Biobase_2.0.1
> 
> _______________________________________________
> Bioc-devel at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel

-- 
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662
