[BioC] Arabidopsis chromosome location mappings

Herve Pages hpages at fhcrc.org
Fri Oct 10 22:33:59 CEST 2008


Hi Sam, Marc, Cara,

Another way to get the chromosome lengths is to load the Arabidopsis genome
and to use the seqlengths() function on it:

 > library(BSgenome.Athaliana.TAIR.04232008)
 > Athaliana
 > seqlengths(Athaliana)
     chr1     chr2     chr3     chr4     chr5     chrC     chrM
30432563 19705359 23470805 18585042 26992728   154478   366924

seqlengths() is new in BioC 2.3 (our next release, scheduled in less than
2 weeks) so make sure you use the current devel version of Bioconductor for now.
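
If you cannot move to the devel version yet, the same lengths can be kept in a plain named vector, which is essentially what a CHRLENGTHS mapping would be. Below is a minimal sketch in base R, with the values copied from the seqlengths() output above. The `offsets` helper is only an illustrative assumption for laying the chromosomes end to end on a whole-genome axis; it is not something provided by geneplotter or BSgenome.

```r
# Chromosome lengths copied from the seqlengths(Athaliana) output above
chrlengths <- c(chr1 = 30432563, chr2 = 19705359, chr3 = 23470805,
                chr4 = 18585042, chr5 = 26992728, chrC = 154478,
                chrM = 366924)

# Start offset of each chromosome on a concatenated whole-genome axis
# (hypothetical helper for plotting, not part of any package)
offsets <- c(0, cumsum(chrlengths)[-length(chrlengths)])
names(offsets) <- names(chrlengths)

# Example: base 1000 on chr2 expressed in whole-genome coordinates
chr2_pos <- offsets[["chr2"]] + 1000
print(chr2_pos)
```

A gene start taken from a mapping such as ath1121501CHRLOC could be shifted the same way before plotting, provided the chromosome names are first matched to the names used in this vector.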

Also, BSgenome.Athaliana.TAIR.04232008 is new in BioC 2.3 so now 2 versions
of this genome are available: the snapshot from January 22, 2004 and the snapshot
from April 23, 2008. Note that the names of the chromosomes have changed between
the 2 versions but their lengths remain the same.

See ?Athaliana for the details on which files were used to make this BSgenome
data package.

Use available.genomes() from the BSgenome software package to get the list of
all BSgenome data packages that are currently available on the Bioconductor
repositories for your version of R/Bioconductor.
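
For example, to check whether an Arabidopsis genome is offered for your installation, something along these lines should work. This is only a sketch: it assumes the BSgenome package is installed and that the repositories are reachable, and it falls back to an empty result otherwise. The variable names are mine.

```r
# Start with an empty result so the code also runs without BSgenome
athal <- character(0)

if (requireNamespace("BSgenome", quietly = TRUE)) {
  # available.genomes() queries the repositories, so guard against
  # a missing network connection
  genomes <- tryCatch(BSgenome::available.genomes(),
                      error = function(e) character(0))
  # Keep only the Arabidopsis thaliana data packages
  athal <- grep("Athaliana", genomes, value = TRUE)
}

print(athal)
```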

Cheers,
H.


Cara Winter wrote:
> Marc,
> 
> TAIR (www.arabidopsis.org) is the official source for all Arabidopsis sequence and annotation information.  Here is a link that contains the chromosome lengths and other genome assembly information:
> 
> http://www.arabidopsis.org/portals/genAnnotation/gene_structural_annotation/agicomplete.jsp
> 
> Any questions regarding Arabidopsis sequence data can be sent to curator at arabidopsis.org.  Thank you very much for your interest in including Arabidopsis data in the Bioconductor packages.
> 
> Best, 
> Cara
> 
> --
> Cara Winter
> Cell and Molecular Biology Graduate Group
> University of Pennsylvania School of Medicine
> Philadelphia, PA  19104
> Phone: 215-266-1703
> email: caramw at mail.med.upenn.edu
> 
> ----- Original Message -----
> From: "Marc Carlson" <mcarlson at fhcrc.org>
> To: "Samuel Wuest" <wuests at tcd.ie>
> Cc: bioconductor at stat.math.ethz.ch
> Sent: Monday, October 6, 2008 12:11:52 PM GMT -05:00 US/Canada Eastern
> Subject: Re: [BioC] Arabidopsis chromosome location mappings
> 
> Hi Samuel,
> 
> The CHRLENGTHS mapping would just be a named vector of all chromosome
> lengths for Arabidopsis.  If we had one for Arabidopsis, it would not
> contain the chromosome location mappings for much of anything.  We
> normally get CHRLENGTHS mapping information from UCSC, but unfortunately
> they don't cover Arabidopsis there, so we don't have a source for this
> information.  But since this is just a named vector of the chromosome
> lengths, if you know this information, you could probably fill it
> in pretty easily by creating a named vector yourself.  Also, if you have a
> recommendation for a reliable public source of this information that is
> considered trustworthy by the Arabidopsis community, please
> tell me about it so that we can know about it too.
> 
> If you really want the location of the start of these genes along the
> chromosomes, that information (from TAIR) is present in the
> ath1121501CHRLOC mapping.  And if you want the ends, then you can find
> those in the ath1121501CHRLOCEND mapping (but this last mapping is only
> found in the most recent devel packages). 
> 
> Please let me know if I answered your questions,
> 
> 
>   Marc
> 
> 
> 
> 
> Samuel Wuest wrote:
>> Hi,
>>
>> Hope you're fine…
>> I am trying to make whole genome plots using the geneplotter
>> package/annotate package. The organism I am studying is Arabidopsis
>> thaliana, and the annotations are not so extensive there: when
>> trying to build a chromLocation object, I get an error (see
>> below).
>> It seems that the chromosome location mappings are not provided
>> in the Arabidopsis annotation package (see below).
>>
>> My question: is there any way of plotting Arabidopsis gene expression data
>> along a chromosome? Should I just order the gene IDs (luckily, for the TAIR
>> IDs one can infer the gene order along a chromosome)? Has anyone made a
>> script for this?
>>
>> Thanks for any help, best wishes,
>>
>> Sam
>>
>>
>>   
>>> library(ath1121501.db)
>>> newChrClass <- buildChromLocation("ath1121501")
>>>     
>> Error in get(mapName, envir = pkgEnv, inherits = FALSE) :
>>   variable "ath1121501CHRLENGTHS" was not found
>>
>>   
>>> objects("package:ath1121501.db")
>>>     
>>  [1] "ath1121501"             "ath1121501ACCNUM"       "ath1121501ARACYC"       "ath1121501ARACYCENZYME" "ath1121501CHR"
>>  [6] "ath1121501CHRLOC"       "ath1121501ENZYME"       "ath1121501ENZYME2PROBE" "ath1121501GENENAME"     "ath1121501GO"
>> [11] "ath1121501GO2ALLPROBES" "ath1121501GO2PROBE"     "ath1121501MAPCOUNTS"    "ath1121501MULTIHIT"     "ath1121501ORGANISM"
>> [16] "ath1121501PATH"         "ath1121501PATH2PROBE"   "ath1121501PMID"         "ath1121501PMID2PROBE"   "ath1121501SYMBOL"
>> [21] "ath1121501_dbInfo"      "ath1121501_dbconn"      "ath1121501_dbfile"      "ath1121501_dbschema"
>>
>>   
>>> sessionInfo()
>>>     
>> R version 2.7.0 (2008-04-22)
>> i386-apple-darwin8.10.1
>>
>> locale:
>> en_IE.UTF-8/en_IE.UTF-8/C/C/en_IE.UTF-8/en_IE.UTF-8
>>
>> attached base packages:
>> [1] tools     stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>>  [1] ath1121501.db_2.2.0  TinesATH1.db_1.0     geneplotter_1.18.0   annotate_1.18.0      xtable_1.5-2         AnnotationDbi_1.2.0
>>
>>   
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 


