[BioC] pd.mapping250k.sty package: featureSet:fragment_length
Zhu, Julie
Julie.Zhu at umassmed.edu
Fri Sep 17 19:25:28 CEST 2010
Jim,
Thank you very much for the detailed information! It all makes sense.
Best regards,
Julie
On 9/17/10 12:58 PM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:
> Hi Julie,
>
> On 9/17/2010 9:53 AM, Zhu, Julie wrote:
>> Hi,
>>
>> Could someone please tell me whether the fragment_length in the featureSet
>> of pd.mapping250k.sty is the fragment_length of the sample? Is there
>> documentation available for looking up the meaning of each field?
>
> The fragment_length is the length of the restriction fragment. You could
> hypothetically have figured this out yourself by comparing the fragment
> length to the data on the netaffx site. Unfortunately, it looks like the
> current version of the pd.mapping250k.sty package is out of date when
> compared to what netaffx has, as the fragment length data for these two
> probesets don't agree.
>
> This is not true of the pd.genomewidesnp.6 package, which is what I have
> installed. So for instance,
>
> > dbGetQuery(con, "select fragment_length, fragment_length2, man_fsetid from featureSet limit 10;")
>    fragment_length fragment_length2    man_fsetid
> 1              395              217 SNP_A-2131660
> 2               NA              702 SNP_A-1967418
> 3              633              883 SNP_A-1969580
> 4              831              399 SNP_A-4263484
> 5              970              611 SNP_A-1978185
> 6             1508              711 SNP_A-4264431
> 7               NA              921 SNP_A-1980898
> 8               NA              243 SNP_A-1983139
> 9               NA              194 SNP_A-4265735
> 10             420              858 SNP_A-1995832
>
> the fragment_length and fragment_length2 data here do agree (well, at
> least the two I checked agree ;-P) with netaffx.
>
> As for the other field names, most seem clear to me. Is there one in
> particular that is not clear?
>
>>
>> Some rows have NAs for most of the fields even though the allele information
>> is known. Is this expected?
>
> It is expected, depending on when the package was built. We are simply
> taking data from Affymetrix and re-packaging into an object that is
> easier to use, so we are dependent on the data we get from Affy. Since
> annotation of genetic data is a moving target, things are always changing.
>
> We only build these packages on a semi-annual basis, so we end up out of
> date quite quickly. This is a tradeoff between having the most
> up-to-date data, and having stable data packages that people can rely on.
>
> We do provide the functionality to build your own, so if you desire the
> most up-to-date package, you can build a personal package using the
> pdInfoBuilder package.
>
> Best,
>
> Jim
>
>
>>
>> Thanks so much for your help!
>>
>> library("pd.mapping250k.sty")
>> con = db(pd.mapping250k.sty)
>> dbListFields(con, "featureSet")
>> [1] "fsetid" "man_fsetid" "dbsnp_rs_id" "chrom"
>> [5] "physical_pos" "strand" "cytoband" "allele_a"
>> [9] "allele_b" "gene_assoc" "fragment_length" "dbsnp"
>> [13] "cnv"
>>
>> dbGetQuery(con, "select * from featureSet order by fsetid desc limit 2")
>>   fsetid    man_fsetid dbsnp_rs_id chrom physical_pos strand cytoband allele_a allele_b
>> 1 238378 SNP_A-4301986   rs6989223     8      5214036      -    p23.2        A        G
>> 2 238377 SNP_A-2291495  rs11644392  <NA>           NA   <NA>     <NA>        A        G
>>   fragment_length dbsnp
>> 1            1667     0
>> 2              NA    NA
>>
>>
>> Best regards,
>>
>> Julie
>>
>> sessionInfo()
>> R version 2.11.1 (2010-05-31)
>> x86_64-apple-darwin9.8.0
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] pd.mapping250k.sty_1.0.0 RSQLite_0.9-2 DBI_0.2-5
>> [4] oligo_1.12.2 oligoClasses_1.10.0 Biobase_2.8.0
>> [7] affxparser_1.20.0
>>
>> loaded via a namespace (and not attached):
>> [1] affyio_1.16.0         Biostrings_2.16.9     IRanges_1.6.11        preprocessCore_1.10.0
>> [5] splines_2.11.1        tools_2.11.1
>>
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
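
To see how many featureSet rows lack annotation, as discussed in the thread, one could run a query along the following lines. This is only a minimal sketch: it builds a toy in-memory SQLite table from the two rows shown in the original question (an assumption made so the example is self-contained), rather than connecting to the real package database. With the package installed, the connection would presumably come from db(pd.mapping250k.sty) as in the thread, and the counts would of course differ.

```r
library(DBI)
library(RSQLite)

# Toy stand-in for the annotation database (assumption: illustration only).
# With the real package, use: con <- db(pd.mapping250k.sty)
con <- dbConnect(SQLite(), ":memory:")

# The two rows quoted in the thread, restricted to a few columns.
feature_set <- data.frame(
  fsetid          = c(238378L, 238377L),
  man_fsetid      = c("SNP_A-4301986", "SNP_A-2291495"),
  dbsnp_rs_id     = c("rs6989223", "rs11644392"),
  chrom           = c("8", NA),
  physical_pos    = c(5214036L, NA),
  fragment_length = c(1667L, NA),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "featureSet", feature_set)  # NA values are stored as NULL

# Count all rows and the rows without a fragment length.
counts <- dbGetQuery(con, "
  SELECT COUNT(*) AS n_total,
         SUM(fragment_length IS NULL) AS n_missing
  FROM featureSet")

# List the probesets whose fragment length is missing.
missing <- dbGetQuery(con, "
  SELECT man_fsetid, dbsnp_rs_id
  FROM featureSet
  WHERE fragment_length IS NULL")

print(counts)
print(missing)

dbDisconnect(con)
```

The probesets listed in `missing` are the ones worth checking against the current netaffx annotation before deciding whether to rebuild the package with pdInfoBuilder.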