[Bioc-devel] problem with matchprobes:getProbeDataAffy() -
James W. MacDonald
jmacdon at med.umich.edu
Mon Aug 25 17:31:11 CEST 2008
Hi Francesco,
Thanks for the bug report. This was fixed a while ago in the devel
version. Admittedly it should have been fixed in the release version
as well; I will do that ASAP.
Best,
Jim
Francesco Ferrari wrote:
> A few days ago I received a couple of bug reports concerning an error
> message that occurs when using the gcrma preprocessing procedure with
> custom probeset definitions (gahgu133acdf and gahgu133aprobe packages).
>
> The source of the problem is a single probe sequence missing from
> the environment "gahgu133aprobe".
> I also verified that the same problem occurs in the other "probe"
> packages with custom probeset definitions that I currently
> maintain, e.g. gahgu133bprobe, gahgu133plus2probe, etc.
>
>
> After carefully debugging the package generation procedure, I found
> the likely source of this problem in the function "getProbeDataAffy"
> from the matchprobes package, which is used to read the probe table
> from a TXT file in order to generate the "probetable" object that is
> subsequently used to create the "probe" package.
>
>
> # Within the function code, the following lines change the "datafile"
> # argument of the function from a character string, i.e. the path to
> # the file, to a "connection" to the file itself.
>
> if (missing(datafile)) {
>     datafile <- paste(arraytype, "_probe_tab", sep = "")
> } else {
>     if (is(datafile, "character")) {
>         datafile <- file(datafile, "r")
>         on.exit(close(datafile))
>     }
> }
>
> # Then, a few lines below, the connection to the file is used first
> # to read the header line and then to read the remaining data:
> head <- scan(datafile, sep = "\t", quiet = TRUE, multi.line = FALSE,
>              nlines = 1, what = "character")
> dat <- scan(datafile, sep = "\t", quiet = TRUE, multi.line = FALSE,
>             what = what, skip = 1)
>
>
> The second call to the "scan()" function misses one of the lines of
> data, so the resulting object is one line short.
> The problem can be solved simply by using the file name instead of a
> connection to access the file. I temporarily solved the problem by
> commenting out the initial part of the function code as follows:
>
> if (missing(datafile)) {
>     datafile <- paste(arraytype, "_probe_tab", sep = "")
> # } else {
> #     if (is(datafile, "character")) {
> #         datafile <- file(datafile, "r")
> #         on.exit(close(datafile))
> #     }
> }
>
>
> I think the problem arises because, when a connection is used instead
> of the file path, the connection "remembers" its position after the
> header line has been read, so the "skip = 1" in the second call to
> scan() skips an additional line containing meaningful data.
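
The behaviour described above can be reproduced outside matchprobes. The following is a minimal sketch, assuming a small made-up tab-delimited file; the column names and probe rows are hypothetical and not taken from any real probe_tab file.

```r
# Hypothetical three-row probe table with a header line.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tseq", "p1\tACGT", "p2\tTTGA", "p3\tGGCC"), tmp)

# Case 1: an open connection keeps its read position between calls.
con <- file(tmp, "r")
header <- scan(con, sep = "\t", quiet = TRUE, nlines = 1, what = "character")
# The connection now points at the first data line, so skip = 1 drops "p1".
dat_con <- scan(con, sep = "\t", quiet = TRUE, what = list("", ""), skip = 1)
close(con)

# Case 2: a file name is reopened from the start, so skip = 1 drops only
# the header line.
dat_path <- scan(tmp, sep = "\t", quiet = TRUE, what = list("", ""), skip = 1)

length(dat_con[[1]])   # 2 rows: p2, p3
length(dat_path[[1]])  # 3 rows: p1, p2, p3
```

The one-row difference between the two results matches the single missing probe sequence in the report.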
>
> What do you think about this problem and the proposed "diagnosis" and solution?
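
As an alternative to passing the file name, the connection can be kept and the "skip" argument dropped from the second call, since the header has already been consumed. This is only a sketch of that idea and not necessarily the change made in the devel version; "read_probe_tab" is a hypothetical helper, not a matchprobes function, and it reads every column as character for simplicity.

```r
# Hypothetical helper: read header and data from the same open connection.
read_probe_tab <- function(datafile) {
  con <- file(datafile, "r")
  on.exit(close(con))
  # First call consumes exactly the header line.
  header <- scan(con, sep = "\t", quiet = TRUE, nlines = 1,
                 what = "character")
  # One character column per header field (an assumption for this sketch).
  what <- rep(list(""), length(header))
  names(what) <- header
  # No skip here: the connection is already positioned after the header.
  dat <- scan(con, sep = "\t", quiet = TRUE, multi.line = FALSE,
              what = what)
  as.data.frame(dat, stringsAsFactors = FALSE)
}

# Usage on a made-up file with three data rows.
tmp2 <- tempfile(fileext = ".txt")
writeLines(c("id\tseq", "p1\tACGT", "p2\tTTGA", "p3\tGGCC"), tmp2)
probetable <- read_probe_tab(tmp2)
nrow(probetable)
```

Either approach avoids skipping twice; the point is that the header should be skipped by the connection position or by "skip", not by both.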
>
> All the best,
> Francesco Ferrari
>
>
>
>
>> sessionInfo()
> R version 2.7.1 (2008-06-23)
> i686-pc-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=it_IT.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=it_IT.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] tools stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] matchprobes_1.12.0 affy_1.18.2 preprocessCore_1.2.0
> [4] affyio_1.8.0 Biobase_2.0.1
>
> _______________________________________________
> Bioc-devel at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662