[BioC] makePdInfoPackage in preparation for RMA with oligo on Nimblegen Expression Arrays
Jack Schonbrun
schonbrun at amyris.com
Wed Jul 15 00:05:02 CEST 2009
> xys <- read.delim(xysFile, comment='#', nrow=3)
> str(xys)
'data.frame': 3 obs. of 4 variables:
$ X : int 209 228 43
$ Y : int 203 52 257
$ SIGNAL: num 203 146 159
$ COUNT : int 1 1 1
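
With summaries of both files in hand, one further check is whether the NDF and XYS tables share any (X, Y) positions, since an empty intersection would leave nothing to insert. The sketch below uses small made-up data frames (`ndf_toy`, `xys_toy` and their values are illustrative, not taken from the real files) and assumes, without confirmation from the pdInfoBuilder source, that the two tables are matched on X and Y.

```r
# Toy stand-ins for the NDF and XYS tables; all values are invented for illustration.
ndf_toy <- data.frame(
  X = c(301L, 311L, 331L, 1L),
  Y = c(5L, 5L, 5L, 6L),
  SEQ_ID = c("SEQ_A", "SEQ_B", "CTRL_1", "CTRL_2"),
  stringsAsFactors = FALSE
)
xys_toy <- data.frame(
  X = c(311L, 1L, 209L),
  Y = c(5L, 6L, 203L),
  SIGNAL = c(203, 146, 159),
  COUNT = c(1L, 1L, 1L)
)

# Inner join on the coordinate columns (assumed join keys).
merged <- merge(ndf_toy, xys_toy, by = c("X", "Y"))

# Coordinates present in the XYS table but missing from the NDF table.
key_ndf <- paste(ndf_toy$X, ndf_toy$Y, sep = ":")
key_xys <- paste(xys_toy$X, xys_toy$Y, sep = ":")
unmatched <- xys_toy[!(key_xys %in% key_ndf), c("X", "Y")]

cat("rows after merge:", nrow(merged), "\n")
cat("XYS rows without an NDF match:", nrow(unmatched), "\n")
```

On real data the comparison is only meaningful on the full files, because the first few rows of each file shown in this thread refer to different positions.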
-----Original Message-----
From: Benilton Carvalho [mailto:bcarvalh at jhsph.edu]
Sent: Tuesday, July 14, 2009 3:03 PM
To: Jack Schonbrun
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] makePdInfoPackage in preparation for RMA with oligo on Nimblegen Expression Arrays
how about?
xys <- read.delim(xysFile, comment="#", nrow=100)
str(xys)
b
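
For reference, the call above relies on R's partial argument matching: `comment` and `nrow` resolve to the `comment.char` and `nrows` arguments of `read.delim`/`read.table`. A minimal self-contained sketch with the full names follows. The temporary file and its leading `#` line are made up for illustration; only the column names and the first three data rows mirror the output shown in this thread.

```r
# Write a small XYS-like file to a temporary path. The "#" line is a made-up
# placeholder for a comment header; the fourth data row is invented.
xys_path <- tempfile(fileext = ".xys")
writeLines(c(
  "# illustrative comment line",
  "X\tY\tSIGNAL\tCOUNT",
  "209\t203\t203\t1",
  "228\t52\t146\t1",
  "43\t257\t159\t1",
  "10\t11\t99\t1"
), xys_path)

# Full argument names: comment.char skips the "#" line, nrows caps the data rows read.
xys <- read.delim(xys_path, comment.char = "#", nrows = 3)
str(xys)

# Clean up the temporary file.
unlink(xys_path)
```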
On Jul 14, 2009, at 6:58 PM, Jack Schonbrun wrote:
> Here's what I get:
>
>> ndf <- read.delim(ndfFile, stringsAsFactors=FALSE, nrow=100)
>> str(ndf)
> 'data.frame': 100 obs. of 17 variables:
> $ PROBE_DESIGN_ID : chr "6531_0301_0005" "6531_0311_0005"
> "6531_0331_0005" "6531_0333_0005" ...
> $ CONTAINER : chr "SACCHAROMYCES1" "SACCHAROMYCES1"
> "NGS_CONTROLS" "NGS_CONTROLS" ...
> $ DESIGN_NOTE : chr "rank_selected" "rank_selected"
> "upper right fiducial" "" ...
> $ SELECTION_CRITERIA: chr
> "rank:03;score:379;uniq:14;count:37;freq:01;rules:1;tm:82.4"
> "rank:05;score:046;uniq:14;count:1110;freq:30;rules:1;tm:78.3"
> "bright" "" ...
> $ SEQ_ID : chr "SCER070900001885" "SCER070900001596"
> "FIDUCIAL_UPPER_RIGHT" "CROSSHYBE" ...
> $ PROBE_SEQUENCE : chr
> "GTCAACCCTGCAAGATCTCTGGGTGCCGCCGTTGCTGCCAGATATTTCCCTCATTACCAC"
> "TCAGTTGGAACGCCTCTGAGCACTCCATCACCTGAGTCAGGTAATACATTTACTGATTCA"
> "TGAGTTGTTTGATAGGATTATTCATAGAGGTCATTACAGCGAGAGGAANNNNNNNNN"
> "CGATGCGACGCGAACTAAGCAGTTCGGCGCAGTCGACTAGTATAACAGNNNNNNNN" ...
> $ MISMATCH : int 0 0 0 0 0 0 0 0 0 0 ...
> $ MATCH_INDEX : int 72062965 72061238 2000207 70654015
> 70652179 65069272 65069273 65069274 65069275 65069276 ...
> $ FEATURE_ID : int 72062965 72061238 71722817 71722819
> 71722820 71722824 71722825 71722826 71722827 71722828 ...
> $ ROW_NUM : int 5 5 5 5 6 6 6 6 6 6 ...
> $ COL_NUM : int 301 311 331 333 1 5 6 7 8 9 ...
> $ PROBE_CLASS : chr "experimental" "experimental" "fiducial"
> "control:crosshybe" ...
> $ PROBE_ID : chr "SCER070900001885P00271"
> "SCER070900001596P00406" "CPK6" "XENOTRACK48P02" ...
> $ POSITION : int 271 406 0 2 0 0 5 0 6 0 ...
> $ DESIGN_ID : int 6531 6531 6531 6531 6531 6531 6531 6531
> 6531 6531 ...
> $ X : int 301 311 331 333 1 5 6 7 8 9 ...
> $ Y : int 5 5 5 5 6 6 6 6 6 6 ...
>>
>
> -----Original Message-----
> From: Benilton Carvalho [mailto:bcarvalh at jhsph.edu]
> Sent: Tuesday, July 14, 2009 2:56 PM
> To: Jack Schonbrun
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] makePdInfoPackage in preparation for RMA with
> oligo on Nimblegen Expression Arrays
>
> what do you get if you run the following (assuming ndfFile is a
> variable that holds the file name)?
>
> ndf <- read.delim(ndfFile, stringsAsFactors=FALSE, nrows=100)
> str(ndf)
>
> thanks,
>
> b
>
> On Jul 14, 2009, at 6:49 PM, Jack Schonbrun wrote:
>
>> Benilton,
>>
>> Thanks for your suggestions.
>>
>> By every means I have tested, the file is tab delimited. And the
>> first row is headers; all other rows are data.
>>
>> Here is how the first (header) row looks:
>> PROBE_DESIGN_ID CONTAINER DESIGN_NOTE
>> SELECTION_CRITERIA SEQ_ID PROBE_SEQUENCE MISMATCH
>> MATCH_INDEX FEATURE_ID ROW_NUM COL_NUM PROBE_CLASS
>> PROBE_ID POSITION DESIGN_ID X Y
>>
>> Any other details on how the ndf is expected to look?
>>
>> Thanks again,
>> Jack
>>
>> -----Original Message-----
>> From: Benilton Carvalho [mailto:bcarvalh at jhsph.edu]
>> Sent: Tuesday, July 14, 2009 1:34 PM
>> To: Jack Schonbrun
>> Cc: bioconductor at stat.math.ethz.ch
>> Subject: Re: [BioC] makePdInfoPackage in preparation for RMA with
>> oligo on Nimblegen Expression Arrays
>>
>> Jack,
>>
>> it looks like your NDF isn't as expected.
>>
>> When it shows: "inserting 0 rows into table 'featureSet'", it makes
>> me wonder what the SEQ_ID column in the NDF looks like.
>>
>> But, instead of looking at the columns' contents right now, please
>> make sure the delimiters of the NDF are tabs. It doesn't appear
>> that's the case. Note the warning "In max(ndfdata[["X"]]): no
>> non-missing arguments to max; returning -Inf"... It suggests that
>> ndfdata[["X"]] is NULL.
>>
>> Another thing: ensure the first line of the NDF is the header (column
>> names) and the data start on the 2nd line.
>>
>> Please let me know how it goes.
>>
>> b
>>
>> On Jul 14, 2009, at 3:57 PM, Jack Schonbrun wrote:
>>
>>> Hello,
>>>
>>> I would like to use the oligo package to run the RMA algorithm on
>>> Nimblegen expression arrays. To that end, I am attempting to
>>> construct an annotation package using makePdInfoPackage().
>>>
>>> I have followed the pattern in the "Building Annotation Packages
>>> with pdInfoBuilder for Use with the oligo Package" vignette:
>>>
>>> ----------------
>>>
>>>> ndfFile.test <- "test.ndf"
>>>> xysFile.test <- "test.xys"
>>>> seed.test <- new("NgsExpressionPDInfoPkgSeed", ndfFile =
>>>> ndfFile.test, xysFile = xysFile.test)
>>>> makePdInfoPackage(seed.test, destDir = "./Annotation")
>>> ====================================================================
>>> Building annotation package for Nimblegen Expression Array
>>> NDF: test.ndf
>>> XYS: test.xys
>>> ====================================================================
>>> Parsing file: test.ndf ... OK
>>> Parsing file: test.xys ... OK
>>> Merging NDF and XYS files ...OK
>>> Preparing contents for featureSet table ...OK
>>> Preparing contents for bgfeature table ...OK
>>> Preparing contents for pmfeature table ...OK
>>> Creating package in ./Annotation/pd.test
>>> Inserting 0 rows into table "featureSet"... Error in sqliteExecStatement(con, statement, bind.data) :
>>> RS-DBI driver: (incomplete data binding: expected 2 parameters, got 0)
>>> In addition: Warning messages:
>>> 1: In max(ndfdata[["Y"]]) :
>>> no non-missing arguments to max; returning -Inf
>>> 2: In max(ndfdata[["X"]]) :
>>> no non-missing arguments to max; returning -Inf
>>> 3: In sqliteExecStatement(con, statement, bind.data) :
>>> ignoring zero-row bind.data
>>>
>>> ------------------
>>>
>>> Any help on why it would only be inserting 0 rows, or any of the
>>> other messages would be greatly appreciated. It does make some
>>> files in the destDir, but does not run to completion. Listing of
>>> this directory available if it would help.
>>>
>>> I am running on Windows XP SP 2. sessionInfo follows.
>>>
>>>> sessionInfo()
>>> R version 2.9.1 (2009-06-26)
>>> i386-pc-mingw32
>>>
>>> locale:
>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] pdInfoBuilder_1.8.1 affxparser_1.16.0 RSQLite_0.7-1 DBI_0.2-4 makePlatformDesign_1.8.0 oligo_1.8.1
>>> [7] preprocessCore_1.6.0 oligoClasses_1.6.0 Biobase_2.4.1 affyio_1.12.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] Biostrings_2.12.7 IRanges_1.2.3 splines_2.9.1 tools_2.9.1
>>>
>>>
>>> ===========================
>>> Jack Schonbrun Ph.D.
>>> Software Developer
>>> Amyris Biotech
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
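
The two checks suggested earlier in the thread (tab delimiters, header on the first line) and the link between a missing column and the `-Inf` warning can be sketched with base R alone. The file contents below are made up for illustration, and nothing here calls pdInfoBuilder, so it does not reproduce the package's internal logic.

```r
# Two tiny made-up files: one tab-delimited, one space-delimited by mistake.
tab_path <- tempfile(fileext = ".ndf")
spc_path <- tempfile(fileext = ".ndf")
writeLines(c("SEQ_ID\tX\tY", "S1\t301\t5", "S2\t311\t5"), tab_path)
writeLines(c("SEQ_ID X Y", "S1 301 5", "S2 311 5"), spc_path)

# count.fields() reports the number of tab-separated fields on each line.
fields_tab <- count.fields(tab_path, sep = "\t")
fields_spc <- count.fields(spc_path, sep = "\t")

good <- read.delim(tab_path, stringsAsFactors = FALSE)
bad  <- read.delim(spc_path, stringsAsFactors = FALSE)

# With the wrong delimiter there is a single column, so bad[["X"]] is NULL.
x_missing <- is.null(bad[["X"]])

# max() of NULL, or of a zero-length vector, warns and returns -Inf.
max_of_null  <- suppressWarnings(max(bad[["X"]]))
max_of_empty <- suppressWarnings(max(good$X[good$X > 1000]))

print(fields_tab)
print(fields_spc)
print(c(x_missing = x_missing))
print(c(max_of_null, max_of_empty))

# Clean up the temporary files.
unlink(c(tab_path, spc_path))
```

In this thread the `str()` output shows X and Y present as integer columns, so the zero-length case (no rows surviving an earlier step) looks like the likelier of the two, though the messages here do not establish the cause.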