[Bioc-devel] read_bed()
    Shepherd, Lori 
    Lor|@Shepherd @end|ng |rom Ro@we||P@rk@org
       
    Tue Sep 17 13:02:18 CEST 2019
    
    
  
Please look at rtracklayer::import()  function that we recommend for reading of BAM files along with other common formats.
Cheers,
Lori Shepherd
Bioconductor Core Team
Roswell Park Cancer Institute
Department of Biostatistics & Bioinformatics
Elm & Carlton Streets
Buffalo, New York 14263
________________________________
From: Bioc-devel <bioc-devel-bounces using r-project.org> on behalf of Bhagwat, Aditya <Aditya.Bhagwat using mpi-bn.mpg.de>
Sent: Tuesday, September 17, 2019 6:58 AM
To: bioc-devel using r-project.org <bioc-devel using r-project.org>
Subject: [Bioc-devel] read_bed()
Dear bioc-devel,
I had two feedback requests regarding the function read_bed().
1) Did I overlook, and therefore, re-invent existing functionality?
2) If not, would `read_bed` be suited for existence in a more foundational package, e.g. `GenomicRanges`, given the rather basal nature of this functionality?
It reads a bedfile into a GRanges, converts the coordinates from 0-based (bedfile) to 1-based (GRanges)<https://www.biostars.org/p/84686>, adds BSgenome info (to allow for implicit range validity checking<https://support.bioconductor.org/p/124250>) and plots the karyogram<https://support.bioconductor.org/p/124328>.
Thank you for your feedback.
Cheers,
Aditya
#' Read bedfile into GRanges
#'
#' @param bedfile        file path
#' @param bsgenome       BSgenome, e.g. BSgenome.Mmusculus.UCSC.mm10::Mmusculus
#' @param zero_based     logical(1): whether bedfile GRanges are 0-based
#' @param rm_duplicates  logical(1)
#' @param plot           logical(1)
#' @param verbose        logical(1)
#' @return \code{\link[GenomicRanges]{GRanges-class}}
#' @note By convention BED files are 0-based. GRanges are always 1-based.
#'       A good discussion on these two alternative codings is given
#'       by Obi Griffith on Biostars: https://www.biostars.org/p/84686/
#' @examples
#' bedfile  <- system.file('extdata/SRF.bed', package = 'multicrispr')
#' bsgenome <- BSgenome.Mmusculus.UCSC.mm10::Mmusculus
#' (gr <- read_bed(bedfile, bsgenome))
#' @importFrom  data.table  :=
#' @export
read_bed <- function(
    bedfile,
    bsgenome,
    zero_based    = TRUE,
    rm_duplicates = TRUE,
    plot          = TRUE,
    verbose       = TRUE
){
    # Assert
    assert_all_are_existing_files(bedfile)
    assert_is_a_bool(verbose)
    assert_is_a_bool(rm_duplicates)
    assert_is_a_bool(zero_based)
    # Comply
    seqnames <- start <- end <- strand <- .N <- gap <- width <- NULL
    # Read
    if (verbose) cmessage('\tRead %s', bedfile)
    dt <- data.table::fread(bedfile, select = c(seq_len(3), 6),
            col.names = c('seqnames', 'start', 'end', 'strand'))
    data.table::setorderv(dt, c('seqnames', 'start', 'end', 'strand'))
    # Transform coordinates: 0-based -> 1-based
    if (zero_based){
        if (verbose)    cmessage('\t\tConvert 0 -> 1-based')
        dt[, start := start + 1]
    }
    if (verbose) cmessage('\t\tRanges: %d ranges on %d chromosomes',
                            nrow(dt), length(unique(dt$seqnames)))
    # Drop duplicates
    if (rm_duplicates){
        is_duplicated <- cduplicated(dt)
        if (any(is_duplicated)){
            if (verbose) cmessage('\t\t        %d after removing duplicates')
            dt %<>% extract(!duplicated)
        }
    }
    # Turn into GRanges
    gr <-  add_seqinfo(as(dt, 'GRanges'), bsgenome)
    # Plot and return
    title <- paste0(providerVersion(bsgenome), ': ', basename(bedfile))
    if (plot) plot_karyogram(gr, title)
    gr
}
        [[alternative HTML version deleted]]
_______________________________________________
Bioc-devel using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
This email message may contain legally privileged and/or confidential information.  If you are not the intended recipient(s), or the employee or agent responsible for the delivery of this message to the intended recipient(s), you are hereby notified that any disclosure, copying, distribution, or use of this email message is prohibited.  If you have received this message in error, please notify the sender immediately by e-mail and delete this email message from your computer. Thank you.
	[[alternative HTML version deleted]]
    
    
More information about the Bioc-devel
mailing list