[Bioc-sig-seq] BED file parser

Vincent Carey stvjc at channing.harvard.edu
Wed Mar 9 03:26:34 CET 2011


2011/3/8 Thiago Yukio Kikuchi Oliveira <stratust at gmail.com>:
> Hi,
>
> Is there a BED file parser for R?

I suppose it depends on what you mean by "parser".  The import() function
from the rtracklayer package reads a BED file and constructs and populates
a RangedData object with its contents.  In the transcript below we look at
a small BED file as text, start R, load rtracklayer, import the data, show
the result, and show the resources used.

bash-3.2$ head ~/junc716_20.bed
chr20   55658   64827   JUNC00000001    14      +       55658   64827   255,0,0 2       27,25   0,9144
chr20   55662   64821   JUNC00000002    2       -       55662   64821   255,0,0 2       34,8    0,9151
chr20   135774  147029  JUNC00000003    1       -       135774  147029  255,0,0 2       8,29    0,11226
chr20   167951  172361  JUNC00000004    1       +       167951  172361  255,0,0 2       29,8    0,4402
chr20   189824  192113  JUNC00000005    3       +       189824  192113  255,0,0 2       33,9    0,2280
chr20   189829  192113  JUNC00000006    3       +       189829  192113  255,0,0 2       32,9    0,2275
chr20   193930  199576  JUNC00000007    4       -       193930  199576  255,0,0 2       28,11   0,5635
chr20   207050  207846  JUNC00000008    2       -       207050  207846  255,0,0 2       20,34   0,762
chr20   218306  218925  JUNC00000009    1       -       218306  218925  255,0,0 2       11,26   0,593
chr20   221160  225070  JUNC00000010    25      -       221160  225070  255,0,0 2       29,9    0,3901
bash-3.2$ head ~/junc716_20.bed > ~/lit.bed
bash-3.2$ R213 --vanilla --quiet
> library(rtracklayer)
Loading required package: RCurl
Loading required package: bitops
> lit = import("~/lit.bed")
> lit
RangedData with 10 rows and 9 value columns across 1 space
         space           ranges |         name     score      strand thickStart
   <character>        <IRanges> |  <character> <numeric> <character>  <integer>
1        chr20 [ 55659,  64827] | JUNC00000001        14           +      55658
2        chr20 [ 55663,  64821] | JUNC00000002         2           -      55662
3        chr20 [135775, 147029] | JUNC00000003         1           -     135774
4        chr20 [167952, 172361] | JUNC00000004         1           +     167951
5        chr20 [189825, 192113] | JUNC00000005         3           +     189824
6        chr20 [189830, 192113] | JUNC00000006         3           +     189829
7        chr20 [193931, 199576] | JUNC00000007         4           -     193930
8        chr20 [207051, 207846] | JUNC00000008         2           -     207050
9        chr20 [218307, 218925] | JUNC00000009         1           -     218306
10       chr20 [221161, 225070] | JUNC00000010        25           -     221160
    thickEnd     itemRgb blockCount  blockSizes blockStarts
   <integer> <character>  <integer> <character> <character>
1      64827     #FF0000          2       27,25      0,9144
2      64821     #FF0000          2        34,8      0,9151
3     147029     #FF0000          2        8,29     0,11226
4     172361     #FF0000          2        29,8      0,4402
5     192113     #FF0000          2        33,9      0,2280
6     192113     #FF0000          2        32,9      0,2275
7     199576     #FF0000          2       28,11      0,5635
8     207846     #FF0000          2       20,34       0,762
9     218925     #FF0000          2       11,26       0,593
10    225070     #FF0000          2        29,9      0,3901

> sessionInfo()
R version 2.13.0 Under development (unstable) (2011-03-01 r54628)
Platform: x86_64-apple-darwin10.4.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] rtracklayer_1.11.11 RCurl_1.5-0         bitops_1.0-4.1

loaded via a namespace (and not attached):
[1] BSgenome_1.19.4      Biobase_2.11.9       Biostrings_2.19.15
[4] GenomicRanges_1.3.23 IRanges_1.9.25       Matrix_0.999375-47
[7] XML_3.2-0            grid_2.13.0          lattice_0.19-17
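
A note on the coordinates in the output above: the imported ranges start
one position later than the starts in the BED file (55659 versus 55658),
while the ends are unchanged.  BED starts are 0-based and the end is
exclusive, whereas the ranges shown by R are 1-based and closed.  If you
only need a data frame and do not want rtracklayer, a minimal base-R sketch
of the same conversion might look like the following.  The helper name
read_bed6 and the six-column subset are assumptions made for illustration;
this is not how import() is implemented.

```r
# Base-R sketch (no rtracklayer needed): read the first six BED columns
# and convert 0-based, half-open BED starts to 1-based, closed starts.
bed_text <- c(
  "chr20\t55658\t64827\tJUNC00000001\t14\t+",
  "chr20\t55662\t64821\tJUNC00000002\t2\t-",
  "chr20\t135774\t147029\tJUNC00000003\t1\t-"
)

# Hypothetical helper for illustration; not part of any package.
read_bed6 <- function(lines) {
  bed <- read.table(
    text = lines, sep = "\t", header = FALSE,
    col.names = c("chrom", "chromStart", "chromEnd",
                  "name", "score", "strand"),
    colClasses = c("character", "integer", "integer",
                   "character", "numeric", "character")
  )
  # BED starts are 0-based; add one to get 1-based closed intervals.
  bed$start <- bed$chromStart + 1L
  # BED ends are exclusive, so they already equal the 1-based closed end.
  bed$end <- bed$chromEnd
  bed
}

bed <- read_bed6(bed_text)
print(bed[, c("chrom", "start", "end", "name", "score", "strand")])
```

To read a file instead of the inline text, the same read.table() call can
take a file path in place of the text argument; for a 12-column file such
as the one above, col.names and colClasses would need to be extended to
cover every column.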


>
>
> Thanks


