[Bioc-devel] request: high-level seqlevel utilities

Julian Gehring julian.gehring at embl.de
Mon Dec 30 14:18:18 CET 2013


Hi,

With the convenience that seqnamesStyles offers now, having to specify 
the chromosome name notation manually would feel like a step back.  In 
terms of subsetting genomic ranges, I normally think of four major 
groups of interest:

- Toplevel/standard: 1,..,22,X,Y,MT
- Autosomes: 1,..,22
- Allosomes: X,Y
- "Linear": 1,..,22,X,Y

If you are concerned about confusing the user with many specialized 
functions, how about extending 'keepSeqlevels' with, e.g., a 'group' 
argument that selects one of the chromosome groups above?  As an 
example, think of:

## subset as before by seqname
keepSeqlevels(gr, "1")
keepSeqlevels(gr, value = "1")

## the new feature
keepSeqlevels(gr, group = "autosomes")

which would dispatch to specialized methods like '.keepAutosomes' in 
combination with 'seqnames.db'.  This way, one could also create a 
setting in which the groups can be easily extended by the user or other 
packages, by simply defining more of the specialized functions.
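
To make the dispatch idea concrete, here is a rough sketch in plain R.  
It works on a character vector of seqlevels rather than on a GRanges 
object, and it hard-codes NCBI-style human chromosome names where a real 
version would look them up via 'seqnames.db'.  Apart from 
'.keepAutosomes', every name in it ('seqlevelGroups', 
'registerSeqlevelGroup', 'selectSeqlevels', the other '.keep*' helpers) 
is made up for illustration and is not an existing API.

```r
## Sketch only: hypothetical names, NCBI-style seqlevels ("1", "X", "MT")
## assumed; a real version would resolve the naming style via seqnames.db.

## Specialized selectors: take seqlevels, return those in the group.
.keepAutosomes <- function(levels) levels[levels %in% as.character(1:22)]
.keepAllosomes <- function(levels) levels[levels %in% c("X", "Y")]
.keepLinear    <- function(levels) levels[levels %in% c(1:22, "X", "Y")]
.keepStandard  <- function(levels) levels[levels %in% c(1:22, "X", "Y", "MT")]

## Registry mapping group names to selectors.  Users or other packages
## extend the set of groups by registering another function.
seqlevelGroups <- new.env()
registerSeqlevelGroup <- function(name, fun) {
  stopifnot(is.character(name), length(name) == 1L, is.function(fun))
  assign(name, fun, envir = seqlevelGroups)
  invisible(name)
}
registerSeqlevelGroup("standard",  .keepStandard)
registerSeqlevelGroup("autosomes", .keepAutosomes)
registerSeqlevelGroup("allosomes", .keepAllosomes)
registerSeqlevelGroup("linear",    .keepLinear)

## Stand-in for the seqlevel selection step inside 'keepSeqlevels':
## 'group' dispatches to a registered selector, otherwise 'value' is used.
selectSeqlevels <- function(levels, value = NULL, group = NULL) {
  if (!is.null(group)) {
    if (!exists(group, envir = seqlevelGroups, inherits = FALSE))
      stop("unknown seqlevel group: ", group)
    return(get(group, envir = seqlevelGroups)(levels))
  }
  levels[levels %in% value]
}

lv <- c("1", "2", "22", "X", "Y", "MT", "GL000191.1")
selectSeqlevels(lv, value = "1")          # "1"
selectSeqlevels(lv, group = "autosomes")  # "1" "2" "22"

## Adding a group later needs only one more registered function.
registerSeqlevelGroup("mito", function(levels) levels[levels == "MT"])
selectSeqlevels(lv, group = "mito")       # "MT"
```

Whether the lookup is a registry like this or S4 dispatch is a design 
choice; the point is only that a new group does not require touching 
'keepSeqlevels' itself.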

Best wishes
Julian


