[BioC] Error in solveUserSEW0 when using get.targets in TEQC

Hervé Pagès hpages at fhcrc.org
Wed Mar 9 03:24:11 CET 2011


Hi Johanna,

As you can see, line 175279 in the 2.1M_Human_Exome.bed file is 
different from the other lines:

   > read.delim("2.1M_Human_Exome.bed", as.is=TRUE, skip=175277, nrows=5, header=FALSE)
                                                              V1       V2       V3
   1                                                        chrY 26179378 26179600
   2                                                        chrY 26179986 26180066
   3 track name=tiled_region description=NimbleGen Tiled Regions       NA       NA
   4                                                        chr1    58924    59145
   5                                                        chr1    59087    59167

I think that this is how tracks are separated in a BED file.
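
If you want to see where the track lines sit in a file, something along these lines should work. It is only a sketch on a small made-up file (the coordinates and the temporary file are invented for illustration), and it assumes that every track definition line starts with the word "track":

```r
# Build a tiny two-track BED file in a temporary location.
# The coordinates are made up; only the layout mimics the real file.
bed_file <- tempfile(fileext = ".bed")
writeLines(c(
  "track name=target_region description=Target Regions",
  "chr1\t100\t200",
  "chr1\t300\t400",
  "track name=tiled_region description=Tiled Regions",
  "chr1\t90\t210"
), bed_file)

# Line numbers of the track definition lines.
all_lines <- readLines(bed_file)
track_idx <- grep("^track", all_lines)
track_idx
```

Any line reported here has no tab-separated start and end fields, so it ends up as a row of NAs when the file is read as a plain 3-column table.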

I don't know the details of how the TEQC::get.targets() function should
be used but it seems that it is currently unable to load a BED file
that contains more than one track. You could still use the 'skip' and
'nrows' arguments to load only the track you want though. For example,
to load the first track:

   > aa <- get.targets("2.1M_Human_Exome.bed", chrcol = 1, startcol = 2, endcol = 3, nrows=175278)
   [1] "read 175278 (non-overlapping) target regions"

But that requires that you have a way to find out the right values to
use for 'skip' and 'nrows'.
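
One way to work out those values is to derive them from the positions of the track lines. The sketch below does this on a small made-up file and then reads the first track with read.delim() directly. It assumes the file has no blank or comment lines between the tracks, and the column names "chr", "start" and "end" are my own choice:

```r
# Small made-up two-track BED file (invented coordinates).
bed_file <- tempfile(fileext = ".bed")
writeLines(c(
  "track name=target_region description=Target Regions",
  "chr1\t100\t200",
  "chr1\t300\t400",
  "chr2\t50\t80",
  "track name=tiled_region description=Tiled Regions",
  "chr1\t90\t210",
  "chr2\t40\t85"
), bed_file)

all_lines <- readLines(bed_file)
track_idx <- grep("^track", all_lines)

# For each track: skip everything up to and including its track line,
# then read the data lines up to the next track line (or the end of file).
skip  <- track_idx
nrows <- diff(c(track_idx, length(all_lines) + 1L)) - 1L

# Read only the first track.
first <- read.delim(bed_file, header = FALSE, as.is = TRUE,
                    skip = skip[1], nrows = nrows[1],
                    col.names = c("chr", "start", "end"))
first
```

Using skip[2] and nrows[2] instead would read the second track.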

A more convenient way to achieve the same result is to load the file
with the import() function defined in the rtracklayer package:

   > library(rtracklayer)
   > bb <- import("2.1M_Human_Exome.bed", format="bed")
   > bb
   RangedDataList of length 2
   names(2): target_region tiled_region

Then if you want the first track:

   > as(bb[[1L]], "RangedData")

This will give you the same thing as 'aa'.
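
If rtracklayer is not an option, the same kind of split can be approximated in base R. To be clear, this is not how import() works internally. It is only a rough sketch: split_bed_tracks() is a hypothetical helper of my own, it keeps just the first three columns, and it assumes unquoted track names without spaces and at least one data line per track:

```r
# Hypothetical helper (not part of rtracklayer or TEQC): split a multi-track
# BED file into a named list of data frames, one per track.
split_bed_tracks <- function(path) {
  lines <- readLines(path)
  is_track <- grepl("^track", lines)
  if (!any(is_track)) stop("no track lines found")
  # cumsum() labels each line with the index of the most recent track line.
  group <- cumsum(is_track)
  # Track name taken from the name=... field of each track line.
  track_names <- sub(".*name=([^ ]+).*", "\\1", lines[is_track])
  # Keep data lines only: not a track line, not empty, and after a track line.
  keep <- !is_track & group > 0 & nzchar(lines)
  pieces <- split(lines[keep],
                  factor(group[keep], levels = seq_along(track_names)))
  out <- lapply(pieces, function(x) {
    df <- read.delim(text = x, header = FALSE, as.is = TRUE)[, 1:3]
    names(df) <- c("chr", "start", "end")
    df
  })
  names(out) <- track_names
  out
}

# Try it on a small made-up file (invented coordinates).
bed_file <- tempfile(fileext = ".bed")
writeLines(c(
  "track name=target_region description=Target Regions",
  "chr1\t100\t200",
  "chr1\t300\t400",
  "chr2\t50\t80",
  "track name=tiled_region description=Tiled Regions",
  "chr1\t90\t210",
  "chr2\t40\t85"
), bed_file)

tracks <- split_bed_tracks(bed_file)
names(tracks)
tracks[["target_region"]]
```

The result is a plain list of data frames rather than range objects, so the coordinates would still need to be converted before being used with TEQC.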

Finally, note that TEQC has not been released yet. It's in BioC devel
(BioC 2.8) and you are using BioC release (BioC 2.7). This mismatch is
very likely to cause you trouble.

Cheers,
H.


On 03/03/2011 08:08 AM, Johanna Hasmats wrote:
> Dear all,
>
> I am trying to load the exome bed file "2.1M_Human_Exome.bed" provided by
> NimbleGen
> (http://www.nimblegen.com/downloads/annotation/seqcap_exome/2.1M_Human_Exome_Annotation.zip)
> as a target by using get.targets in the TEQC package.
>
> However, I get this error message:
>
> Error in solveUserSEW0(start = start, end = end, width = width) :
>    solving row 175279: range cannot be determined from the supplied arguments (too many NAs)
>
> and I cannot seem to get around it. I tried if conditions as well as
> na.omit/na.exclude, but without success.
>
>> traceback()
> 4: .Call("solve_user_SEW0", start, end, width, PACKAGE = "IRanges")
> 3: solveUserSEW0(start = start, end = end, width = width)
> 2: IRanges(start = dat[, startcol], end = dat[, endcol])
> 1: get.targets("2.1M_Human_Exome.bed", chrcol = 1, startcol = 2,
>         endcol = 3)
>
>> sessionInfo()
> R version 2.12.2 (2011-02-25)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] TEQC_0.99.2   IRanges_1.8.9
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.10.0 tools_2.12.2
>
> Could somebody help me out with this?
>
> Thanks in advance,
>
> Johanna
>
>
> ############################################################
> Kindly
>
> Johanna Hasmats, M.Sc, PhD Student
> Division of Gene Technology, School of Biotechnology, KTH
> SciLifeLab
> Phone: +46(0)73 625 14 60
>
> Postal address:
> Box 1031
> 171 21 Solna
>
> Delivery address:
> KISP (Karolinska Institutet Science Park)
> Tomtebodavägen 23B
> 171 65 Solna
> Sweden


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


