From delhomme at embl.de Tue May 1 14:28:29 2012
From: delhomme at embl.de (Nicolas Delhomme)
Date: Tue, 1 May 2012 14:28:29 +0200
Subject: [Bioc-devel] Changes in the %in% function for DNAStringSet?
Message-ID: <06B18C11-D3F8-41E5-B526-A14C30AD7A78@embl.de>
Hi all,
In R 2.15.0, Bioc 2.10, the following works:
library(Biostrings)
c("TTGCGA","ATGGCT","ACACTG") %in% DNAStringSet(c("TTGCGA","ATGRCT","ACASTG"))
[1] TRUE FALSE FALSE
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Biostrings_2.24.1 IRanges_1.14.2 BiocGenerics_0.2.0
loaded via a namespace (and not attached):
[1] stats4_2.15.0
While in Bioc 2.11 it fails:
Error in match(x, table, nomatch = 0L) :
'match' requires vector arguments
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Biostrings_2.25.3 IRanges_1.15.7 BiocGenerics_0.3.0
loaded via a namespace (and not attached):
[1] stats4_2.15.0
I'd just like to know whether that is a change of API or not. If yes, I'd need to adapt my code, which currently fails to build.
Cheers,
Nico
---------------------------------------------------------------
Nicolas Delhomme
Genome Biology Computational Support
European Molecular Biology Laboratory
Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
From delhomme at embl.de Wed May 2 10:52:19 2012
From: delhomme at embl.de (Nicolas Delhomme)
Date: Wed, 2 May 2012 10:52:19 +0200
Subject: [Bioc-devel] biomaRt cannot list marts when going through a mirror
web site
Message-ID: <295F35D4-084E-4735-B405-9F1F8135182E@embl.de>
Hi Steffen, hi Wolfgang,
When trying to list the marts available from an ensembl mirror, I get the following:
listMarts(host="uswest.ensembl.org")
Space required after the Public Identifier
SystemLiteral " or ' expected
SYSTEM or PUBLIC, the URI is missing
Error: 1: Space required after the Public Identifier
2: SystemLiteral " or ' expected
3: SYSTEM or PUBLIC, the URI is missing
This is triggered by this line:
registry = bmRequest(request = request, ssl.verifypeer = ssl.verifypeer, verbose = verbose)
in the listMarts function.
Looking at the bmRequest function, it uses the getURL function of the RCurl package. This function is the culprit:
## the request as computed by listMarts
request = "http://uswest.ensembl.org:80/biomart/martservice?type=registry&requestid=biomaRt"
getURL(request, ssl.verifypeer = TRUE)
[1] "\n
\n302 Found\n\nFound
\nThe document has moved here.
\n\n"
As you can see, it returns a 302 redirect page, i.e. the request is redirected to "www.ensembl.org" in my case.
Adding a followlocation=TRUE argument to that command solves the problem:
getURL(request, ssl.verifypeer = TRUE, followlocation=TRUE)
[1] "\n\n \n \n
References: <295F35D4-084E-4735-B405-9F1F8135182E@embl.de>
Message-ID:
From hpages at fhcrc.org Fri May 4 03:56:13 2012
From: hpages at fhcrc.org (Hervé Pagès)
Date: Thu, 03 May 2012 18:56:13 -0700
Subject: [Bioc-devel] Changes in the %in% function for DNAStringSet?
In-Reply-To: <06B18C11-D3F8-41E5-B526-A14C30AD7A78@embl.de>
References: <06B18C11-D3F8-41E5-B526-A14C30AD7A78@embl.de>
Message-ID: <4FA3373D.4070304@fhcrc.org>
Hi Nico,
Last week I made some improvements to, and reorganized, the match(),
%in%, duplicated(), and unique() stuff in
IRanges/GenomicRanges/Biostrings, and apparently forgot to define the
"%in%" method for ANY,Vector. Thanks for the catch!
This is fixed in IRanges 1.15.8. FWIW I also added an "%in%" method
for Vector,ANY so now this works too:
> DNAStringSet(c("TTGCGA","ATGRCT","ACASTG")) %in%
c("TTGCGA","ATGGCT","ACACTG")
[1] TRUE FALSE FALSE
It is so sad that we have to redefine "%in%" methods that do exactly
the same thing as base::`%in%`:
> base::`%in%`
function (x, table)
match(x, table, nomatch = 0L) > 0L
just because base::`%in%` cannot dispatch on the appropriate
"match" method. A well-known issue of the way generics, methods
and NAMESPACE interact with each other... but still an unfortunate
one.
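The dispatch problem described above can be reproduced without any Bioconductor package. What follows is only a minimal sketch, not the IRanges or Biostrings code: the class "Seqs" and its slot "values" are made up for illustration. It promotes match() to an S4 generic in the global environment, shows that base::`%in%` still ends up in the internal base match(), and then works around that with an explicit "%in%" method.

```r
library(methods)

# Hypothetical stand-in for a DNAStringSet-like container (illustration only).
setClass("Seqs", representation(values = "character"))

# Promote match() to an S4 generic; base::match becomes its default method.
setGeneric("match")
setMethod("match", c("ANY", "Seqs"),
    function(x, table, nomatch = NA_integer_, incomparables = NULL)
        base::match(x, table@values, nomatch = nomatch,
                    incomparables = incomparables))

x   <- c("TTGCGA", "ATGGCT", "ACACTG")
tbl <- new("Seqs", values = c("TTGCGA", "ATGRCT", "ACASTG"))

# The S4 generic is found here, so the "Seqs" method is used.
direct <- match(x, tbl)

# base::`%in%` looks up match() from inside the base namespace, where the
# S4 generic defined above is not visible, so the call fails.
res <- tryCatch(base::`%in%`(x, tbl), error = function(e) e)

# Workaround: a "%in%" method with the same body as base::`%in%`, but whose
# match() call is resolved where the S4 generic lives.
setGeneric("%in%")
setMethod("%in%", c("ANY", "Seqs"),
    function(x, table) match(x, table, nomatch = 0L) > 0L)

hits <- x %in% tbl
```

The body of the new method is identical to that of base::`%in%`; the only difference is the environment in which match() is looked up.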
The good news is that we have on our TODO list to explicitly define
the match() and %in% generics in BiocGenerics so there will be an
opportunity to overwrite the "%in%" default method:
setMethod("%in%", c("ANY", "ANY"), function (x, table) match(x,
table, nomatch = 0L) > 0L)
(I'm still hesitant about this though. What could be the drawbacks
of overwriting the default method?)
Also last week at the same time I did the changes on match() and
family, I also reimplemented the "match" method for DNAStringSet
objects (which is called when either 'x' or 'table' or both are
DNAStringSet). The new implementation is in Biostrings 2.25.3.
It uses a hash-based algorithm instead of the quicksort-based algo
that was used so far. The resulting speedup varies a lot depending
on the sizes of 'x' and 'table', and will typically be substantial
(10x or more) for big (i.e. > 1M elements) DNAStringSet objects.
This benefits directly %in%, duplicated() and unique() on
DNAStringSet objects.
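To make the difference between the two strategies concrete, here is a rough base R sketch of duplicate detection done both ways. It is not the Biostrings implementation (which is compiled code); the function names dup_sort and dup_hash are invented for this illustration, and an R environment stands in for the hash table. The sketch assumes non-empty strings, since an empty string cannot be used as an environment key.

```r
# Sort-based: order the strings, then compare each one with its neighbour.
dup_sort <- function(x) {
    o   <- order(x, method = "radix")   # stable, so ties keep input order
    xs  <- x[o]
    d   <- c(FALSE, xs[-1L] == xs[-length(xs)])
    out <- logical(length(x))
    out[o] <- d
    out
}

# Hash-based: one pass, remembering every string already seen.
dup_hash <- function(x) {
    seen <- new.env(hash = TRUE)
    out  <- logical(length(x))
    for (i in seq_along(x)) {
        key <- x[[i]]
        if (exists(key, envir = seen, inherits = FALSE)) {
            out[i] <- TRUE
        } else {
            assign(key, TRUE, envir = seen)
        }
    }
    out
}

reads <- c("TTGCGA", "ATGGCT", "TTGCGA", "ACACTG", "ATGGCT", "TTGCGA")
by_sort <- dup_sort(reads)
by_hash <- dup_hash(reads)
```

Both functions flag the same elements as duplicated(reads). The sort-based version needs O(n log n) comparisons, while the hash-based one does a single pass with an expected constant-time lookup per element.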
With Biostrings 2.25.3 (Bioc 2.11):
> library(Biostrings)
> probes <- DNAStringSet(hgu133aprobe)
> system.time(isdup <- duplicated(probes))
user system elapsed
0.048 0.000 0.050
With Bioc <= 2.10:
> system.time(isdup <- duplicated(probes))
user system elapsed
0.232 0.000 0.233
Finally I should mention that, even though the hash function I use
for DNAStringSet (and RNAStringSet, AAStringSet, BStringSet) is the
same as the function used in base R for hashing the strings of a
standard character vector, calling match(), %in%, duplicated() or
unique() on a standard character vector is still slightly faster
(2x) than on a DNAStringSet. This can probably be explained by the
fact that all the strings in all the character vectors defined in
a session are pre-hashed i.e. hashed the 1st time the string is
created and the result of the hash stored in the "global CHARSXP
hash table".
Cheers,
H.
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
From delhomme at embl.de Fri May 4 09:31:40 2012
From: delhomme at embl.de (Nicolas Delhomme)
Date: Fri, 4 May 2012 09:31:40 +0200
Subject: [Bioc-devel] Changes in the %in% function for DNAStringSet?
In-Reply-To: <4FA3373D.4070304@fhcrc.org>
References: <06B18C11-D3F8-41E5-B526-A14C30AD7A78@embl.de>
<4FA3373D.4070304@fhcrc.org>
Message-ID: <237F59AA-B33D-492C-AE9A-A5C007EF79FA@embl.de>
Hi Hervé,
Thanks a lot for fixing it and for the super detailed description! Learned a lot :-) And thanks for the benchmarking, that's really useful as well!
I can't really think of any drawbacks there, but my %in% usage is certainly limited. What do the R developer guys say about it? Wouldn't it make sense to have it that way in base R?
Cheers,
Nico
---------------------------------------------------------------
Nicolas Delhomme
Genome Biology Computational Support
European Molecular Biology Laboratory
Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
From rflight79 at gmail.com Fri May 4 18:09:02 2012
From: rflight79 at gmail.com (Robert M. Flight)
Date: Fri, 4 May 2012 12:09:02 -0400
Subject: [Bioc-devel] add sessionInfo() option to "save"
Message-ID:
From D.Strbenac at garvan.org.au Mon May 7 07:00:05 2012
From: D.Strbenac at garvan.org.au (Dario Strbenac)
Date: Mon, 7 May 2012 15:00:05 +1000 (EST)
Subject: [Bioc-devel] GenomicFeatures FeatureDB EST Table Error
Message-ID: <20120507150005.BWX83078@gimr.garvan.unsw.edu.au>
Hi,
It seems some data has been added to the EST data table in UCSC that GenomicFeatures cannot parse.
> ESTs <- makeFeatureDbFromUCSC(genome = "hg19", track = "est", tablename = "all_est")
Download the all_est table ... Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 4587019 did not have 22 elements
The relevant lines of the session information are :
R version 2.15.0 (2012-03-30)
Platform: x86_64-unknown-linux-gnu (64-bit)
other attached packages:
[1] GenomicFeatures_1.8.1
- Dario.
From tim.triche at gmail.com Mon May 7 07:36:53 2012
From: tim.triche at gmail.com (Tim Triche, Jr.)
Date: Sun, 6 May 2012 22:36:53 -0700
Subject: [Bioc-devel] GenomicFeatures FeatureDB EST Table Error
In-Reply-To: <20120507150005.BWX83078@gimr.garvan.unsw.edu.au>
References: <20120507150005.BWX83078@gimr.garvan.unsw.edu.au>
Message-ID:
From setia.pramana at ki.se Mon May 7 09:30:43 2012
From: setia.pramana at ki.se (Setia Pramana)
Date: Mon, 7 May 2012 07:30:43 +0000
Subject: [Bioc-devel] KEGG.db
Message-ID: <517A5DC08F772349AB76C48A25FC0B61C8F140@KIMSX01.user.ki.se>
Hi All,
I am developing a new package using the info from KEGG.db. I used the following command to map KEGG pathway identifiers to Entrez Gene:
mapped.genes <-as.list(KEGGPATHID2EXTID)
When I run the function as part of an R package, I get the following error message:
Error in as.list.default(KEGGPATHID2EXTID) :
no method for coercing this S4 class to a vector
However, when I run it outside a package (as a normal R function), the function works well.
Please help me to find out what may be the problem.
Thank you in advance for your help.
Best,
Setia
MEB KI Stockholm
From willem.ligtenberg at openanalytics.eu Mon May 7 09:38:19 2012
From: willem.ligtenberg at openanalytics.eu (Willem Ligtenberg)
Date: Mon, 7 May 2012 09:38:19 +0200
Subject: [Bioc-devel] KEGG.db
In-Reply-To: <517A5DC08F772349AB76C48A25FC0B61C8F140@KIMSX01.user.ki.se>
References: <517A5DC08F772349AB76C48A25FC0B61C8F140@KIMSX01.user.ki.se>
Message-ID:
Hi,
I am not sure you should be using the KEGG.db package any
more, since it is deprecated.
See the following message when you load the KEGG.db package:
KEGG.db contains mappings based on older data because the original
resource was removed from the the public domain before the most
recent update was produced. This package should now be considered
deprecated and future versions of Bioconductor may not have it
available. One possible alternative to consider is to look at the
reactome.db package
You should make sure your package uses the right as.list method. You
can do this by using:
AnnotationDbi::as.list instead of just as.list. (This specifies the
package from which it should load the function.)
Kind regards,
Willem
From D.Strbenac at garvan.org.au Tue May 8 02:00:09 2012
From: D.Strbenac at garvan.org.au (Dario Strbenac)
Date: Tue, 8 May 2012 10:00:09 +1000 (EST)
Subject: [Bioc-devel] GenomicFeatures FeatureDB EST Table Error
In-Reply-To:
References: <20120507150005.BWX83078@gimr.garvan.unsw.edu.au>
Message-ID: <20120508100009.BWX96593@gimr.garvan.unsw.edu.au>
I e-mailed UCSC and they said the preferred way is to download by FTP. That means more lines of code to parse the text file into columns and then split the exons and widths columns up in order to make a GRanges object.
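The extra parsing could look roughly like the base R sketch below. The two input rows are made up, and the column names (tName, strand, blockSizes, tStarts) are an assumption about the layout of the downloaded dump; a real file has more columns. Block starts are treated as 0-based, so 1 is added to get 1-based inclusive coordinates.

```r
# Two made-up, tab-separated rows standing in for lines of the FTP dump.
txt <- c("chr1\t+\t10,20,\t100,200,",
         "chr2\t-\t5,\t50,")

tab <- read.table(text = txt, sep = "\t", quote = "", comment.char = "",
                  colClasses = "character",
                  col.names = c("tName", "strand", "blockSizes", "tStarts"))

# Split a comma-separated (and comma-terminated) column into integer vectors.
split_ints <- function(x) lapply(strsplit(x, ",", fixed = TRUE), as.integer)

sizes  <- split_ints(tab$blockSizes)
starts <- split_ints(tab$tStarts)
n      <- lengths(sizes)            # number of blocks per row

# One row per block, with 1-based inclusive coordinates.
blocks <- data.frame(chrom  = rep(tab$tName, n),
                     strand = rep(tab$strand, n),
                     start  = unlist(starts) + 1L,
                     end    = unlist(starts) + unlist(sizes),
                     stringsAsFactors = FALSE)
```

A data frame of this shape could then be handed to GRanges(), with chrom as the sequence names and start/end wrapped in an IRanges.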
---- Original message ----
>Date: Sun, 6 May 2012 22:36:53 -0700
>From: "Tim Triche, Jr."
>Subject: Re: [Bioc-devel] GenomicFeatures FeatureDB EST Table Error
>To: D.Strbenac at garvan.org.au
>Cc: bioc-devel at r-project.org
>
> Actually, try downloading the same thing from the
> Table Browser and see if there isn't something like
> the following at the tail of the file:
> 843 chr11 33910774 33910775 rs4756078 0 + CC
> C/G/T genomic single
> by-cluster,by-frequency,by-2hit-2allele,by-hapmap,by-1000genomes
> 0.361204 0.223906 intron exact 1
> SingleClassTriA---------------------------------------------------------------------------
> procedures have exceeded timeout: 1200 seconds,
> function has ended.
> ---------------------------------------------------------------------------
> (this is from my attempted download of the
> snp135common track, but it appears to be happening
> to you as well)
> It would appear that we're being throttled.
> On Sun, May 6, 2012 at 10:00 PM, Dario Strbenac
> wrote:
>
> Hi,
>
> It seems some data has been added to the EST data
> table in UCSC that GenomicFeatures cannot parse.
>
> > ESTs <- makeFeatureDbFromUCSC(genome = "hg19",
> track = "est", tablename = "all_est")
> Download the all_est table ... Error in scan(file,
> what, nmax, sep, dec, quote, skip, nlines,
> na.strings, :
> line 4587019 did not have 22 elements
>
> The relevant lines of the session information are
> :
>
> R version 2.15.0 (2012-03-30)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> other attached packages:
> [1] GenomicFeatures_1.8.1
>
> - Dario.
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
> --
> A model is a lie that helps you see the truth.
> Howard Skipper
--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia
From tim.triche at gmail.com Tue May 8 03:28:21 2012
From: tim.triche at gmail.com (Tim Triche, Jr.)
Date: Mon, 7 May 2012 18:28:21 -0700
Subject: [Bioc-devel] GenomicFeatures FeatureDB EST Table Error
In-Reply-To: <20120508100009.BWX96593@gimr.garvan.unsw.edu.au>
References: <20120507150005.BWX83078@gimr.garvan.unsw.edu.au>
<20120508100009.BWX96593@gimr.garvan.unsw.edu.au>
Message-ID:
From sgado at science.unitn.it Tue May 8 16:02:12 2012
From: sgado at science.unitn.it (Paola Sgadò)
Date: Tue, 8 May 2012 16:02:12 +0200
Subject: [Bioc-devel] technical and biological replicates in the same
Exprset - Agi4x44
Message-ID: <21AABEF5-AB40-4296-9EB7-C3FB573D3EAF@science.unitn.it>
From mcarlson at fhcrc.org Tue May 8 20:24:56 2012
From: mcarlson at fhcrc.org (Marc Carlson)
Date: Tue, 08 May 2012 11:24:56 -0700
Subject: [Bioc-devel] KEGG.db
In-Reply-To:
References: <517A5DC08F772349AB76C48A25FC0B61C8F140@KIMSX01.user.ki.se>
Message-ID: <4FA964F8.8030305@fhcrc.org>
Willem is right. The as.list() method you want is the one from
AnnotationDbi. Other as.list methods will not know what to do with the
bimap object in question. So in the usage context you are describing,
you may need to import that method in your NAMESPACE file so that your
package knows about it.
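Such an import is usually a one-line directive. A minimal sketch of the relevant NAMESPACE entry, assuming that AnnotationDbi exports its as.list methods and that the package lists AnnotationDbi under Imports (or Depends) in its DESCRIPTION file:

```r
# NAMESPACE of the package under development (sketch; other directives omitted)
importMethodsFrom(AnnotationDbi, as.list)
```

How the KEGGPATHID2EXTID object itself is made visible to the package is a separate question and is not shown here.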
Also, it has not been possible to update the KEGG.db package for over a
year now. The reactome.db package is probably a good alternative.
Marc
From thomas.girke at ucr.edu Thu May 10 06:53:22 2012
From: thomas.girke at ucr.edu (Thomas Girke)
Date: Wed, 9 May 2012 21:53:22 -0700
Subject: [Bioc-devel] FastqStreamer error in function context
Message-ID: <20120510045322.GA4102@Thomas-Girkes-MacBook-Pro.local>
When FastqStreamer or FastqSampler are called within another function in
combination with a writeFastq step then this usually returns an error.
However, the same code runs just fine outside of a function. Below is
an example to reproduce this error.
A small feature request for FastqStreamer would be an option to return
the total number of reads stored in a fastq file as well as an option
for accessing specific records by passing on an index vector.
Best,
Thomas
Here is an example:
library(ShortRead)
sp <- SolexaPath(system.file('extdata', package='ShortRead'))
fl <- file.path(analysisPath(sp), "s_1_sequence.txt")
## Some function using FastqStreamer
test <- function(x=fl) {
    f <- FastqStreamer(x, 5)
    while (length(fq <- yield(f))) {
        fqsub <- fq[1:2]
        writeFastq(fqsub, "test.fastq", mode="a")
    }
    close(f)
}
test(x=fl)
Error in .IRanges.checkAndTranslateSingleBracketSubscript(x, i) :
subscript contains NAs or out of bounds indices
sessionInfo()
R version 2.15.0 alpha (2012-03-05 r58604)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ShortRead_1.14.3 latticeExtra_0.6-19 RColorBrewer_1.0-5
[4] Rsamtools_1.8.4 lattice_0.20-6 Biostrings_2.24.1
[7] GenomicRanges_1.8.4 IRanges_1.14.2 BiocGenerics_0.2.0
loaded via a namespace (and not attached):
[1] Biobase_2.16.0 bitops_1.0-4.1 grid_2.15.0 hwriter_1.3 stats4_2.15.0
[6] tools_2.15.0 zlibbioc_1.2.0
From mtmorgan at fhcrc.org Thu May 10 07:32:58 2012
From: mtmorgan at fhcrc.org (Martin Morgan)
Date: Wed, 09 May 2012 22:32:58 -0700
Subject: [Bioc-devel] FastqStreamer error in function context
In-Reply-To: <20120510045322.GA4102@Thomas-Girkes-MacBook-Pro.local>
References: <20120510045322.GA4102@Thomas-Girkes-MacBook-Pro.local>
Message-ID: <4FAB530A.5040407@fhcrc.org>
On 05/09/2012 09:53 PM, Thomas Girke wrote:
> When FastqStreamer or FastqSampler are called within another function in
> combination with a writeFastq step then this usually returns an error.
> However, the same code runs just fine outside of a function. Below is
> an example to reproduce this error.
Hi Thomas --
The example below fails because there are 256 records in the file, so
for me the 52nd yield() returns length(fq) == 1 and the subset '2' is
out of bounds. But maybe there is another example?
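The arithmetic behind that can be checked in base R. This is only an illustration with a plain character vector standing in for the last chunk, not ShortRead code; the variable names are made up.

```r
n_records <- 256L   # records in the example file
chunk     <- 5L     # records requested per yield()

# Sizes of the successive non-empty chunks.
sizes <- rep(chunk, n_records %/% chunk)
if (n_records %% chunk > 0L) sizes <- c(sizes, n_records %% chunk)

n_yields  <- length(sizes)     # 52 non-empty chunks
last_size <- tail(sizes, 1L)   # the last one holds a single record

# Stand-in for the final chunk. Subsetting with 1:2 would ask for an element
# that does not exist, so cap the index at the chunk length instead.
fq    <- "read256"
fqsub <- fq[seq_len(min(2L, length(fq)))]
```

Capping the subscript with seq_len(min(2L, length(fq))) keeps the subset in bounds for the short final chunk.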
> A small feature request for FastqStreamer would be an option to return
> the total number of reads stored in a fastq file as well as an option
> for accessing specific records by passing on an index vector.
For the first part, after the fact we have
> f
class: FastqStreamer
file: s_1_sequence.txt
status: n=5 current=1 added=256 total=256
with 'total=256' indicating that the streamer iterated over (i.e., the
file had) 256 records. This is actually accessible in the reference
class using the not-really-public (see the last lines of
example(FastqStreamer)) accessor
> f$status()
n current added total
5 1 256 256
which is a named integer vector. Is this what you were looking for?
I'll give the idea about selecting specific records some thought; I see
how it could be useful.
Martin
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
From liqigang at gmail.com Thu May 10 08:08:21 2012
From: liqigang at gmail.com (li)
Date: Thu, 10 May 2012 14:08:21 +0800
Subject: [Bioc-devel] GenomicFeatures FeatureDB EST Table Error
Message-ID:
Dario Strbenac ???
>I e-mailed UCSC and they said the preferred way is to download by FTP. Which means more lines of code to parse the text file into columns, then split the exons and widths columns up to be able to make GRanges.
>
>---- Original message ----
>>Date: Sun, 6 May 2012 22:36:53 -0700
>>From: "Tim Triche, Jr."
>>Subject: Re: [Bioc-devel] GenomicFeatures FeatureDB EST Table Error
>>To: D.Strbenac at garvan.org.au
>>Cc: bioc-devel at r-project.org
>>
>> Actually, try downloading the same thing from the
>> Table Browser and see if there isn't something like
>> the following at the tail of the file:
>> 843 chr11 33910774 33910775 rs4756078 0 + CC
>> C/G/T genomic single
>> by-cluster,by-frequency,by-2hit-2allele,by-hapmap,by-1000genomes
>> 0.361204 0.223906 intron exact 1
>> SingleClassTriA---------------------------------------------------------------------------
>> procedures have exceeded timeout: 1200 seconds,
>> function has ended.
>> ---------------------------------------------------------------------------
>> (this is from my attempted download of the
>> snp135common track, but it appears to be happening
>> to you as well)
>> It would appear that we're being throttled.
>> On Sun, May 6, 2012 at 10:00 PM, Dario Strbenac
>> wrote:
>>
>> Hi,
>>
>> It seems some data has been added to the EST data
>> table in UCSC that GenomicFeatures cannot parse.
>>
>> > ESTs <- makeFeatureDbFromUCSC(genome = "hg19",
>> track = "est", tablename = "all_est")
>> Download the all_est table ... Error in scan(file,
>> what, nmax, sep, dec, quote, skip, nlines,
>> na.strings, ?:
>> ?line 4587019 did not have 22 elements
>>
>> The relevant lines of the session information are
>> :
>>
>> R version 2.15.0 (2012-03-30)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> other attached packages:
>> [1] GenomicFeatures_1.8.1
>>
>> - Dario.
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>> --
>> A model is a lie that helps you see the truth.
>> Howard Skipper
>
>
>--------------------------------------
>Dario Strbenac
>Research Assistant
>Cancer Epigenetics
>Garvan Institute of Medical Research
>Darlinghurst NSW 2010
>Australia
>
>_______________________________________________
>Bioc-devel at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/bioc-devel
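The truncation Tim describes (a download cut off mid-row, followed by a timeout message) can be confirmed locally before handing the file to a parser. The following is a minimal sketch in base R, assuming the table has been saved as a tab-separated text file; the sample rows and the five-column layout are made up for illustration (the scan() error above implies 22 columns for all_est).

```r
# Hypothetical stand-in for a table saved from the UCSC Table Browser:
# two complete rows, one row cut off mid-record, and a timeout message.
sample_lines <- c(
  "585\tchr1\t100\t200\tEST1",
  "585\tchr1\t300\t400\tEST2",
  "586\tchr1\t500",
  "procedures have exceeded timeout: 1200 seconds, function has ended."
)
path <- tempfile(fileext = ".txt")
writeLines(sample_lines, path)

# Number of columns a complete row should have (illustrative value).
expected_fields <- 5L

# count.fields() reports the number of fields on each line; quoting and
# comment handling are switched off so every character is taken literally.
n_fields <- count.fields(path, sep = "\t", quote = "", comment.char = "",
                         blank.lines.skip = FALSE)

# Lines whose field count differs from the expected one are suspect.
bad_lines <- which(n_fields != expected_fields)
print(bad_lines)
print(readLines(path)[bad_lines])
```

If the only malformed lines sit at the very end of the file, an interrupted download is a more likely cause than new data in the table.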
From w.vanwieringen at vumc.nl Thu May 10 11:15:20 2012
From: w.vanwieringen at vumc.nl (Wieringen, W.N. van)
Date: Thu, 10 May 2012 09:15:20 +0000
Subject: [Bioc-devel] build
Message-ID:
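Dear all,

A week ago I extended the functionality of my package (sigaR). This was done by addition of some new files in the R directory of the package and modifications in related files. The new files did indeed arrive at the Bioconductor. Also the new version of the package builds without error. However, the new build on Bioconductor does not include the new functionality, whereas on my computer the new version builds (and checks) without error and yields the new functionality. Does anyone have a clue what could be the problem? Thanks in advance for any help.

Best wishes,

Wessel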
From beniltoncarvalho at gmail.com Thu May 10 12:38:19 2012
From: beniltoncarvalho at gmail.com (Benilton Carvalho)
Date: Thu, 10 May 2012 11:38:19 +0100
Subject: [Bioc-devel] build
In-Reply-To:
References:
Message-ID:
Your release version is in sync with the svn copy (release branch)...
so everything is fine there:
http://bioconductor.org/packages/2.10/bioc/html/sigaR.html
Similarly, your devel version is in sync with the svn (devel
branch)... everything is fine there as well:
http://bioconductor.org/packages/2.11/bioc/html/sigaR.html
The changes you've made to your package that resulted in version 1.1.0
will appear in the release branch on the next BioC release.
b
On 10 May 2012 10:15, Wieringen, W.N. van wrote:
> Dear all,
>
>
> A week ago I extended the functionality of my package (sigaR). This was done by addition of some new files in the R directory of the package and modifications in related files. The new files did indeed arrive at the Bioconductor. Also the new version of the package builds without error. However, the new build on Bioconductor does not include the new functionality, whereas on my computer the new version builds (and checks) without error and yields the new functionality. Does anyone have a clue what could be the problem? Thanks in advance for any help.
>
> Best wishes,
>
>
> Wessel
>
>
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
From thomas.girke at ucr.edu Fri May 11 05:05:03 2012
From: thomas.girke at ucr.edu (Thomas Girke)
Date: Thu, 10 May 2012 20:05:03 -0700
Subject: [Bioc-devel] FastqStreamer error in function context
In-Reply-To:
References: <20120510045322.GA4102@Thomas-Girkes-MacBook-Pro.local>
Message-ID: <20120511030503.GA18342@biocluster.ucr.edu>
Martin,
There is indeed no problem with those functions, I just had a typo in
my code. I guess I shouldn't send out bug reports when it is well past
my bed time. Sorry for the false alarm.
I love the streaming functionality. It really brings NGS analysis back
to low memory systems, such as laptops or outdated cluster nodes, without
the inconveniences of constantly splitting large files.
Best,
Thomas
On Thu, May 10, 2012 at 05:32:58AM +0000, Martin Morgan wrote:
> On 05/09/2012 09:53 PM, Thomas Girke wrote:
> > When FastqStreamer or FastqSampler are called within another function in
> > combination with a writeFastq step then this usually returns an error.
> > However, the same code runs just fine outside of a function. Below is
> > an example to reproduce this error.
>
> Hi Thomas --
>
> The example below fails because there are 256 records in the file, so
> for me the 52nd yield() returns length(fq) == 1 and the subset '2' is
> out of bounds. But maybe there is another example?
>
> > A small feature request for FastqStreamer would be an option to return
> > the total number of reads stored in a fastq file as well as an option
> > for accessing specific records by passing on an index vector.
>
> For the first part, after the fact we have
>
> > f
> class: FastqStreamer
> file: s_1_sequence.txt
> status: n=5 current=1 added=256 total=256
>
> with 'total=256' indicating that the streamer iterated over (i.e., the
> file had) 256 records. This is actually accessible in the reference
> class using the not-really-public (see the last lines of
> example(FastqStreamer)) accessor
>
> > f$status()
> n current added total
> 5 1 256 256
>
> which is a named integer vector. Is this what you were looking for?
>
> I'll give the idea about selecting specific records some thought; I see
> how it could be useful.
>
> Martin
>
> >
> > Best,
> >
> > Thomas
> >
> >
> > Here is an example:
> >
> > library(ShortRead)
> > sp<- SolexaPath(system.file('extdata', package='ShortRead'))
> > fl<- file.path(analysisPath(sp), "s_1_sequence.txt")
> >
> > ## Some function using FastqStreamer
> > test<- function(x=fl) {
> > f<- FastqStreamer(x, 5)
> > while (length(fq<- yield(f))) {
> > fqsub<- fq[1:2]
> > writeFastq(fqsub, "test.fastq", mode="a")
> > }
> > close(f)
> > }
> > test(x=fl)
> >
> > Error in .IRanges.checkAndTranslateSingleBracketSubscript(x, i) :
> > subscript contains NAs or out of bounds indices
> >
> >
> > sessionInfo()
> > R version 2.15.0 alpha (2012-03-05 r58604)
> > Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> >
> > locale:
> > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> >
> > attached base packages:
> > [1] stats graphics grDevices utils datasets methods base
> >
> > other attached packages:
> > [1] ShortRead_1.14.3 latticeExtra_0.6-19 RColorBrewer_1.0-5
> > [4] Rsamtools_1.8.4 lattice_0.20-6 Biostrings_2.24.1
> > [7] GenomicRanges_1.8.4 IRanges_1.14.2 BiocGenerics_0.2.0
> >
> > loaded via a namespace (and not attached):
> > [1] Biobase_2.16.0 bitops_1.0-4.1 grid_2.15.0 hwriter_1.3 stats4_2.15.0
> > [6] tools_2.15.0 zlibbioc_1.2.0
> >
> > _______________________________________________
> > Bioc-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>
> --
> Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>
> Location: M1-B861
> Telephone: 206 667-2793
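The out-of-bounds subscript Martin points to comes from the final chunk: 256 records read 5 at a time leave a single record for the 52nd yield, so a fixed subset of 1:2 asks for a record that is not there. Below is a minimal sketch of the arithmetic and of a bounded subset, using plain character vectors as a stand-in for the FASTQ chunks; the helper name first_n is hypothetical. With ShortRead the same guard would presumably read fq[seq_len(min(2L, length(fq)))].

```r
# Stand-in for the 256 records in s_1_sequence.txt.
records <- sprintf("read_%03d", 1:256)
chunk_size <- 5L

# Emulate repeated yield() calls: consecutive chunks of at most 5 records.
chunks <- split(records, ceiling(seq_along(records) / chunk_size))

# Hypothetical helper: take the first n elements, or fewer if the chunk is short.
# Note that base R would silently pad x[1:2] with NA on a length-1 vector,
# whereas the ShortRead object raises the error shown in the thread.
first_n <- function(x, n) x[seq_len(min(n, length(x)))]

kept <- unlist(lapply(chunks, first_n, n = 2L), use.names = FALSE)

print(length(chunks))        # number of non-empty chunks
print(length(chunks[[52]]))  # size of the final chunk
print(length(kept))          # records kept across all chunks
```

Bounding the subset by the chunk length keeps the loop valid whatever the file size and chunk size happen to be.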
From tag at granular.com Tue May 15 03:44:26 2012
From: tag at granular.com (Y-h. Taguchi)
Date: Tue, 15 May 2012 10:44:26 +0900
Subject: [Bioc-devel] New Package: MiRaGE + miRNATarget,
Message-ID:
Dear Sirs,
Here, I would like to announce that my package, MiRaGE, was added to
development slot.
http://www.bioconductor.org/packages/devel/bioc/html/MiRaGE.html
This is a localized version of Web server, MiRaGE Server,
http://www.granular.com/MiRaGE/, which intends to
infer target gene regulation via miRNA using only target gene
expression profile.
I would be glad if someone could comment on it so that I can improve it
as much as possible.
I do not think that there are so many Japanese here :-)
I would be glad of your help, since we Japanese do not have many
colleagues with whom to discuss Bioconductor development face to face.
yours, tag.
PS MiRaGE requires the miRNATarget experiment data package
http://www.bioconductor.org/packages/release/data/experiment/html/miRNATarget.html
to be installed in order to run, although I have also provided an option to
download the data set from the MiRaGE server directly.
--
Y-h. Taguchi, Dept. Phys., Chuo Univ., Kasuga, Bunkyo-ku, Tokyo 112-8551,Japan
Tel./Fax. +81-3-3817-1791/1792 http://www.granular.com/tag/index-j.html
From Inostroza at mpimp-golm.mpg.de Tue May 15 09:34:03 2012
From: Inostroza at mpimp-golm.mpg.de (Alvaro Cuadros Inostroza)
Date: Tue, 15 May 2012 07:34:03 +0000
Subject: [Bioc-devel] mzR compilation error gcc 4.70 arch linux (and a patch)
Message-ID: <4C0888DEB044FB4DA79C41688FA8C7390416D6@MPPMAIL01.mpimp-golm.mpg.de>
Hello,
I got the following compilation error while installing the package 'mzR' (devel version 1.3.6) in arch linux (fully updated) (my package, TargetSearch, depends on mzR). Here is the relevant part.
> biocLite("mzR")
BioC_mirror: http://bioconductor.org
Using R version 2.15, BiocInstaller version 1.5.7.
Installing package(s) 'mzR'
[...]
g++ -I/opt/R/R-2.15.0/include -DNDEBUG -D_LARGEFILE_SOURCE -I./boost_aux/ -I. -DHAVE_PWIZ_MZML_LIB -I/usr/local/include -I"/opt/R/R-2.15.0/library/Rcpp/include" -fpic -g -O2 -c boost/thread/src/pthread/once.cpp -o boost/thread/src/pthread/once.o
In file included from ./boost/thread/detail/platform.hpp:17:0,
from ./boost/thread/once.hpp:12,
from boost/thread/src/pthread/once.cpp:7:
./boost/config/requires_threads.hpp:29:4: error: #error "Threading support unavaliable: it has been explicitly disabled with BOOST_DISABLE_THREADS"
In file included from ./boost/thread/once.hpp:12:0,
from boost/thread/src/pthread/once.cpp:7:
./boost/thread/detail/platform.hpp:67:9: error: #error "Sorry, no boost threads are available for this platform."
[...]
The full error log is here [1].
I also got the same error with the release version of mzR (1.2.1). With an older gcc I do *not* get this error.
Since it seemed a problem with the boost libraries, I searched the web and found a bug report [2] in which they explain it's a configuration error due to a change in gcc 4.70 (or something like that). Also, in that page a fix and patch is provided (see link at the bottom) which I adapted and pasted here [3] for mzR. It works for both release and devel versions. At least it fixed the compilation error for me. Maybe it needs more testing...
[1] http://pastebin.com/T2tSEWPM
[2] https://svn.boost.org/trac/boost/ticket/6165
[3] http://pastebin.com/gYBAr2Td
[alvaro at home ~]$ gcc --version
gcc (GCC) 4.7.0 20120505 (prerelease)
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] BiocInstaller_1.5.7
loaded via a namespace (and not attached):
[1] tools_2.15.0
Best regards.
[ CC: to the maintainers ]
--
Alvaro
From sneumann at ipb-halle.de Tue May 15 10:09:23 2012
From: sneumann at ipb-halle.de (Steffen Neumann)
Date: Tue, 15 May 2012 10:09:23 +0200
Subject: [Bioc-devel] mzR compilation error gcc 4.70 arch linux (and a
patch)
In-Reply-To: <4C0888DEB044FB4DA79C41688FA8C7390416D6@MPPMAIL01.mpimp-golm.mpg.de>
References: <4C0888DEB044FB4DA79C41688FA8C7390416D6@MPPMAIL01.mpimp-golm.mpg.de>
Message-ID:
Hi Alvaro,
On Tue, 2012-05-15 at 07:34 +0000, Alvaro Cuadros Inostroza wrote:
> I got the following compilation error while installing the package
> 'mzR' (devel version 1.3.6) in arch linux (fully updated) (my package,
> TargetSearch, depends on mzR). Here is the relevant part.
Thanks for the notice and the patch. I applied it to the devel version;
it compiles and checks fine with my gcc-4.6, and mzR-1.3.7
should be out soon. Please report if there's anything missing.
Yours,
Steffen
--
IPB Halle AG Massenspektrometrie & Bioinformatik
Dr. Steffen Neumann http://www.IPB-Halle.DE
Weinberg 3 http://msbi.bic-gh.de
06120 Halle Tel. +49 (0) 345 5582 - 1470
+49 (0) 345 5582 - 0
sneumann(at)IPB-Halle.DE Fax. +49 (0) 345 5582 - 1409
From D.Strbenac at garvan.org.au Wed May 16 07:00:21 2012
From: D.Strbenac at garvan.org.au (Dario Strbenac)
Date: Wed, 16 May 2012 15:00:21 +1000 (EST)
Subject: [Bioc-devel] makeFeatureDbFromUCSC Column Checking
Message-ID: <20120516150021.BXA20991@gimr.garvan.unsw.edu.au>
Hello,
I thought I'd suggest reordering the steps that are taken when makeFeatureDbFromUCSC is called. It would be better if the column-name check were done before the entire table is downloaded, rather than downloading everything and only then throwing an error.
--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia
From tim.triche at gmail.com Wed May 16 07:12:07 2012
From: tim.triche at gmail.com (Tim Triche, Jr.)
Date: Tue, 15 May 2012 22:12:07 -0700
Subject: [Bioc-devel] makeFeatureDbFromUCSC Column Checking
In-Reply-To: <20120516150021.BXA20991@gimr.garvan.unsw.edu.au>
References: <20120516150021.BXA20991@gimr.garvan.unsw.edu.au>
Message-ID:
seconding this!
From hpages at fhcrc.org Wed May 16 07:21:56 2012
From: hpages at fhcrc.org (Hervé Pagès)
Date: Tue, 15 May 2012 22:21:56 -0700
Subject: [Bioc-devel] makeFeatureDbFromUCSC Column Checking
In-Reply-To:
References: <20120516150021.BXA20991@gimr.garvan.unsw.edu.au>
Message-ID: <4FB33974.1080000@fhcrc.org>
Hi Dario, Tim,
Can you guys show an example so we know exactly what you mean. Sorry if
it's obvious. Thanks!
H.
On 05/15/2012 10:12 PM, Tim Triche, Jr. wrote:
> seconding this!
>
>
> On Tue, May 15, 2012 at 10:00 PM, Dario Strbenac
> wrote:
>
>> Hello,
>>
>> I thought I'd suggest reordering the steps that are taken when
>> makeFeatureDbFromUCSC is called. It would be better if the column name
>> checking step was done before an entire table of data was downloaded and
>> then an error was thrown.
>>
>> --------------------------------------
>> Dario Strbenac
>> Research Assistant
>> Cancer Epigenetics
>> Garvan Institute of Medical Research
>> Darlinghurst NSW 2010
>> Australia
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>
>
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
From D.Strbenac at garvan.org.au Wed May 16 08:00:12 2012
From: D.Strbenac at garvan.org.au (Dario Strbenac)
Date: Wed, 16 May 2012 16:00:12 +1000 (EST)
Subject: [Bioc-devel] makeFeatureDbFromUCSC Column Checking
Message-ID: <20120516160012.BXA22455@gimr.garvan.unsw.edu.au>
> repeatDB <- makeFeatureDbFromUCSC("hg18", "RepeatMasker", "rmsk")
Download the rmsk table ... OK # Takes a few minutes
Checking that required Columns are present ...
Error in makeFeatureDbFromUCSC("hg18", "RepeatMasker", "rmsk") :
GenomicFeatures internal error: rmsk table doesn't contain a 'chrom', 'chromStart', or 'chromEnd' column and no reasonable substitute has been designated via the 'chromCol''chromStartCol' or 'chromEndCol' arguments.
From hpages at fhcrc.org Wed May 16 18:24:21 2012
From: hpages at fhcrc.org (Hervé Pagès)
Date: Wed, 16 May 2012 09:24:21 -0700
Subject: [Bioc-devel] makeFeatureDbFromUCSC Column Checking
In-Reply-To: <20120516160012.BXA22455@gimr.garvan.unsw.edu.au>
References: <20120516160012.BXA22455@gimr.garvan.unsw.edu.au>
Message-ID: <4FB3D4B5.1070000@fhcrc.org>
On 05/15/2012 11:00 PM, Dario Strbenac wrote:
>> repeatDB<- makeFeatureDbFromUCSC("hg18", "RepeatMasker", "rmsk")
> Download the rmsk table ... OK # Takes a few minutes
> Checking that required Columns are present ...
> Error in makeFeatureDbFromUCSC("hg18", "RepeatMasker", "rmsk") :
> GenomicFeatures internal error: rmsk table doesn't contain a 'chrom', 'chromStart', or 'chromEnd' column and no reasonable substitute has been designated via the 'chromCol''chromStartCol' or 'chromEndCol' arguments.
Yes it was obvious (if I had read "makeFeatureDbFromUCSC" instead of
"makeTranscriptDbFromUCSC"). Makes a lot of sense and should be an easy
change. Thanks!
H.
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
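The reordering requested in this thread amounts to "validate cheaply, then download". Here is a minimal sketch of that control flow in base R. It is not the GenomicFeatures implementation: check_columns_first, rmsk_columns and slow_download are hypothetical names, and the rmsk column names (genoName, genoStart, genoEnd) are an assumption about the UCSC schema. Only the chromCol, chromStartCol and chromEndCol argument names come from the error message quoted above.

```r
# Hypothetical wrapper: look at the column names first, download only if
# every required column is present.
check_columns_first <- function(get_column_names, download_table,
                                chromCol = "chrom",
                                chromStartCol = "chromStart",
                                chromEndCol = "chromEnd") {
  required <- c(chromCol, chromStartCol, chromEndCol)
  missing_cols <- setdiff(required, get_column_names())
  if (length(missing_cols) > 0L) {
    stop("table lacks required column(s): ",
         paste(missing_cols, collapse = ", "), call. = FALSE)
  }
  download_table()
}

# Stand-ins for the cheap schema lookup and the slow table download.
rmsk_columns <- function() {
  c("bin", "genoName", "genoStart", "genoEnd", "strand", "repName")
}
state <- new.env()
state$downloaded <- FALSE
slow_download <- function() {
  state$downloaded <- TRUE
  data.frame()
}

# With the default column names the call fails before any download starts.
first_try <- tryCatch(check_columns_first(rmsk_columns, slow_download),
                      error = function(e) conditionMessage(e))
downloaded_after_first_try <- state$downloaded

# Naming the substitute columns lets the check pass and the download proceed.
second_try <- check_columns_first(rmsk_columns, slow_download,
                                  chromCol = "genoName",
                                  chromStartCol = "genoStart",
                                  chromEndCol = "genoEnd")
print(first_try)
print(state$downloaded)
```

Passing the lookup and the download as functions keeps the check independent of how the schema is actually obtained.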
From sdmorris at u.washington.edu Thu May 17 00:39:32 2012
From: sdmorris at u.washington.edu (Stephanie M. Gogarten)
Date: Wed, 16 May 2012 15:39:32 -0700
Subject: [Bioc-devel] linking to suggested package in documentation
Message-ID: <4FB42CA4.4000907@u.washington.edu>
Is it possible to include a link to a package/function/class in the
documentation if that package is only listed in "Suggests" rather than
"Depends" or "Imports"? I tried to do this, but I got a warning for a
missing link during R CMD check.
Stephanie
From tag at granular.com Thu May 17 01:30:45 2012
From: tag at granular.com (Y-h. Taguchi)
Date: Thu, 17 May 2012 08:30:45 +0900
Subject: [Bioc-devel] linking to suggested package in documentation
In-Reply-To: <4FB42CA4.4000907@u.washington.edu>
References: <4FB42CA4.4000907@u.washington.edu>
Message-ID:
Dear Stephanie,
2012/5/17 Stephanie M. Gogarten :
> Is is possible to include a link to a package/function/class in the
> documentation if that package is only listed in "Suggests" rather than
> "Depends" or "Imports"?
Yes, you can, but....
>I tried to do this, but I got a warning for a
> missing link during R CMD check.
In order to run "R CMD check" properly, you need to install everything
in "Suggests", "Depends" or "Imports" on your system, since it tries to
execute every example in the vignette.
Have you tried it?
yours, tag.
>
> Stephanie
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
Y-h. Taguchi, Dept. Phys., Chuo Univ., Kasuga, Bunkyo-ku, Tokyo 112-8551,Japan
Tel./Fax. +81-3-3817-1791/1792 http://www.granular.com/tag/index-j.html
From D.Strbenac at garvan.org.au Thu May 17 08:00:15 2012
From: D.Strbenac at garvan.org.au (Dario Strbenac)
Date: Thu, 17 May 2012 16:00:15 +1000 (EST)
Subject: [Bioc-devel] Rsamtools filterBam Functionality
Message-ID: <20120517160015.BXA45216@gimr.garvan.unsw.edu.au>
Hello,
I'm interested in filtering a BAM file by read ID. I've read in two BAM files of two different mappings of the same FASTQ file, found which IDs are unique to one of them, and want to create a BAM file of these. This doesn't look possible from the options available to filterBam. Could that be extended in a future release ?
--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia
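The ID comparison described in this request can be sketched in base R. The read names below are made up; in practice they would presumably come from the qname field returned by scanBam(). Writing the selected records back out as a BAM file is the part the message says filterBam did not offer, so it is not shown.

```r
# Hypothetical read IDs from two mappings of the same FASTQ file.
ids_a <- c("read1", "read2", "read3", "read4")
ids_b <- c("read2", "read4", "read5")

# IDs present in the first mapping but absent from the second.
only_in_a <- setdiff(ids_a, ids_b)

# Logical mask over the first mapping's records; using %in% rather than
# positions also copes with IDs that occur more than once (e.g. read pairs).
keep <- ids_a %in% only_in_a

print(only_in_a)
print(keep)
```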
From sneumann at ipb-halle.de Thu May 17 13:57:00 2012
From: sneumann at ipb-halle.de (Steffen Neumann)
Date: Thu, 17 May 2012 13:57:00 +0200
Subject: [Bioc-devel] Happy Birthday Bioconductor !
Message-ID:
Hi,
did I miss it, or has nobody celebrated that
the Bioconductor project had released BioC 1.0
back on May 1st, 2002 TEN YEARS AGO ?
In any case: happy Birthday to Bioconductor,
congratulations to ten years of healthy growth,
and may the next ten years bring new and more
awesomeness to the project !
Thanks to the whole core team and all contributors
for making this happen! Let's open a virtual (or real)
bottle of champagne !
Yours,
Steffen
(who just created a slide on what BioC is,
and looked on Wikipedia for the project's history ;-)
--
IPB Halle AG Massenspektrometrie & Bioinformatik
Dr. Steffen Neumann http://www.IPB-Halle.DE
Weinberg 3 http://msbi.bic-gh.de
06120 Halle Tel. +49 (0) 345 5582 - 1470
+49 (0) 345 5582 - 0
sneumann(at)IPB-Halle.DE Fax. +49 (0) 345 5582 - 1409
From sdmorris at u.washington.edu Thu May 17 17:16:16 2012
From: sdmorris at u.washington.edu (Stephanie M. Gogarten)
Date: Thu, 17 May 2012 08:16:16 -0700
Subject: [Bioc-devel] linking to suggested package in documentation
In-Reply-To:
References:
Message-ID: <4FB51640.5010800@u.washington.edu>
The package is installed on my system. If it is listed in the
"Suggests" field, I get the warning
* checking Rd cross-references ... WARNING
Missing link(s) in documentation object
'/Volumes/geneva_sata/stephanie/Bioconductor/GWASTools/man/snpStats.Rd':
'SnpMatrix-class'
If I move the "snpStats" package from "Suggests" to "Imports," that
warning goes away.
I can see why R would warn about documentation links to packages in
"Suggests", because if the package is not installed the link would be
broken. But I was wondering if there was a clever way to convince R CMD
check that packages in "Suggests" should be considered valid for
documentation links.
thanks,
Stephanie
On 5/17/12 3:00 AM, bioc-devel-request at r-project.org wrote:
> Date: Thu, 17 May 2012 08:30:45 +0900 From: "Y-h. Taguchi"
> To: bioc-devel at r-project.org Subject: Re:
> [Bioc-devel] linking to suggested package in documentation Message-ID:
>
> Content-Type: text/plain; charset=ISO-2022-JP Dear Steohanie, 2012/5/17
> Stephanie M. Gogarten :
>> > Is is possible to include a link to a package/function/class in the
>> > documentation if that package is only listed in "Suggests" rather than
>> > "Depends" or "Imports"?
> Yes, you can, but....
>
>> >I tried to do this, but I got a warning for a
>> > missing link during R CMD check.
> In order to run "R CMD check" properly, you need to install everything
> in Suggests","Depends" or "Imports" in your ssytem, since it tries to
> execute every example in vignette.
>
> Have you tried it?
>
> yours, tag.
>
>> >
>> > Stephanie
>> >
>> > _______________________________________________
>> > Bioc-devel at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>
> -- Y-h. Taguchi, Dept. Phys., Chuo Univ., Kasuga, Bunkyo-ku, Tokyo
> 112-8551,Japan Tel./Fax. +81-3-3817-1791/1792
> http://www.granular.com/tag/index-j.html ?112-8551 ???????? ???? ????
> ??/FAX 03-3817-1791/1792
From mtmorgan at fhcrc.org Thu May 17 22:03:02 2012
From: mtmorgan at fhcrc.org (Martin Morgan)
Date: Thu, 17 May 2012 13:03:02 -0700
Subject: [Bioc-devel] linking to suggested package in documentation
In-Reply-To: <4FB51640.5010800@u.washington.edu>
References:
<4FB51640.5010800@u.washington.edu>
Message-ID: <4FB55976.1080302@fhcrc.org>
On 05/17/2012 08:16 AM, Stephanie M. Gogarten wrote:
> The package is installed on my system. If it is listed in the "Suggests"
> field, I get the warning
>
> * checking Rd cross-references ... WARNING
> Missing link(s) in documentation object
> ?/Volumes/geneva_sata/stephanie/Bioconductor/GWASTools/man/snpStats.Rd?:
> ?SnpMatrix-class?
>
> If I move the "snpStats" package from "Suggests" to "Imports," that
> warning goes away.
>
> I can see why R would warn about documentation links to packages in
> "Suggests", because if the package is not installed the link would be
> broken. But I was wondering if there was a clever way to convince R CMD
> check that packages in "Suggests" should be considered valid for
> documentation links.
Hi Stephanie --
I think you're looking for section 2.5 of Writing R Extensions -- Cross
references, along the lines of
\link[pkg]{foo}
where 'foo' is the name of the _html_ file foo is documented in, or
\link[pkg:bar]{foo}
to find documentation on foo in html page bar.html.
Martin
>
> thanks,
> Stephanie
>
> On 5/17/12 3:00 AM, bioc-devel-request at r-project.org wrote:
>> Date: Thu, 17 May 2012 08:30:45 +0900 From: "Y-h. Taguchi"
>> To: bioc-devel at r-project.org Subject: Re:
>> [Bioc-devel] linking to suggested package in documentation Message-ID:
>>
>> Content-Type: text/plain; charset=ISO-2022-JP Dear Steohanie, 2012/5/17
>> Stephanie M. Gogarten :
>>> > Is is possible to include a link to a package/function/class in the
>>> > documentation if that package is only listed in "Suggests" rather than
>>> > "Depends" or "Imports"?
>> Yes, you can, but....
>>
>>> >I tried to do this, but I got a warning for a
>>> > missing link during R CMD check.
>> In order to run "R CMD check" properly, you need to install everything
>> in Suggests","Depends" or "Imports" in your ssytem, since it tries to
>> execute every example in vignette.
>>
>> Have you tried it?
>>
>> yours, tag.
>>
>>> >
>>> > Stephanie
>>> >
>>> > _______________________________________________
>>> > Bioc-devel at r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>> -- Y-h. Taguchi, Dept. Phys., Chuo Univ., Kasuga, Bunkyo-ku, Tokyo
>> 112-8551,Japan Tel./Fax. +81-3-3817-1791/1792
>> http://www.granular.com/tag/index-j.html ?112-8551 ???????? ???? ????
>> ??/FAX 03-3817-1791/1792
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
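Applied to the case in this thread, the second form Martin describes might look like the link in the sketch below. It assumes the class is documented in a file named SnpMatrix-class in snpStats, as the check warning suggests; the page name snpStatsDemo and its text are made up. The R wrapper only writes the fragment to a temporary file and confirms that it parses as Rd; in a real .Rd file the backslashes are single.

```r
# A hypothetical help page whose description links to a class in another package.
rd_lines <- c(
  "\\name{snpStatsDemo}",
  "\\alias{snpStatsDemo}",
  "\\title{Hypothetical help page}",
  "\\description{",
  "  Returns a \\code{\\link[snpStats:SnpMatrix-class]{SnpMatrix}} object.",
  "}"
)

# Show the fragment as it would appear in the .Rd file.
cat(rd_lines, sep = "\n")

# Write it out and parse it to confirm the markup is well formed.
rd_file <- tempfile(fileext = ".Rd")
writeLines(rd_lines, rd_file)
parsed <- tools::parse_Rd(rd_file)
print(class(parsed))
```

Parsing only checks the syntax; whether R CMD check accepts the link for a package listed in "Suggests" still has to be confirmed by running the check itself.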
From mtmorgan at fhcrc.org Fri May 18 18:29:30 2012
From: mtmorgan at fhcrc.org (Martin Morgan)
Date: Fri, 18 May 2012 09:29:30 -0700
Subject: [Bioc-devel] Happy Birthday Bioconductor !
In-Reply-To:
References:
Message-ID: <4FB678EA.6000303@fhcrc.org>
On 05/17/2012 04:57 AM, Steffen Neumann wrote:
> Hi,
>
> did I miss it, or has nobody celebrated that
> the Bioconductor project had released BioC 1.0
> back on May 1st, 2002 TEN YEARS AGO ?
>
> In any case: happy Birthday to Bioconductor,
> congratulations to ten years of healthy growth,
> and may the next ten years bring new and more
> awesomeness to the project !
>
> Thanks to the whole core team and all contributors
> for making this happen! Let's open a virtual (or real)
> bottle of champagne !
And especially to the far-sighted individuals who contributed to the
original iterations!
Martin
>
> Yours,
> Steffen
>
> (who just created a slide on what BioC is,
> and looked on Wikipedia for the project's history ;-)
>
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
From vobencha at fhcrc.org Sat May 19 20:01:05 2012
From: vobencha at fhcrc.org (Valerie Obenchain)
Date: Sat, 19 May 2012 11:01:05 -0700
Subject: [Bioc-devel] BioC 2012
Message-ID: <4FB7DFE1.7040208@fhcrc.org>
Hello Bioconductors!
BioC2012 is fast approaching. We have a diverse line up of morning talks
and afternoon practicals. Check them out at
https://secure.bioconductor.org/BioC2012/
If you are interested in giving an afternoon practical (aka lab session)
you can submit your proposal here
https://secure.bioconductor.org/BioC2012/labs.php
There is a poster session Tuesday night 5:30 - 7:00. This is a great
way to share your work or get feedback on in-progress ideas. Poster
abstracts are due by July 1.
Please direct questions or comments to biocworkshop at fhcrc.org
Valerie
From slzhao at sibs.ac.cn Tue May 22 15:14:18 2012
From: slzhao at sibs.ac.cn (slzhao)
Date: Tue, 22 May 2012 21:14:18 +0800
Subject: [Bioc-devel] A question about using proxy when developing R package
Message-ID:
Hello,
I am developing an R package. Because I have to use a proxy to
access the internet, I used the function "setInternet2()" in R to
download CRAN packages. Now I am writing a Sweave-based help file
in LyX, and an example in it calls "download.file", so I added
"setInternet2()" to the help file as well. Of course this is not
ideal, since end users do not need a proxy. Does
anyone know how to resolve this problem?
Thanks for the reply.
--
Shilin Zhao
Key Laboratory of Systems Biology
Shanghai Institute for Biological Sciences
Chinese Academy of Sciences
320 Yue-Yang Road
Shanghai,China,200031
Tel: 86-21-54920083
From mtmorgan at fhcrc.org Tue May 22 15:51:35 2012
From: mtmorgan at fhcrc.org (Martin Morgan)
Date: Tue, 22 May 2012 06:51:35 -0700
Subject: [Bioc-devel] A question about using proxy when developing R
package
In-Reply-To:
References:
Message-ID: <4FBB99E7.2050903@fhcrc.org>
On 05/22/2012 06:14 AM, slzhao wrote:
> Hello,
>
> I am developing an R package. Because I have to use a proxy to
> access the internet, I used the function "setInternet2()" in R to
> download CRAN packages. I am now writing a Sweave-based help file
> in LyX. The example code in this help file calls "download.file",
> so I also called "setInternet2()" in the help file. Of course this is
> not good, because the end user does not need a proxy. Does
> anyone know how to resolve this problem?
I think you want to use a different approach to configuring your own
computer to use the proxy. Arrange to start R with the --internet2
option, or (perhaps only R-devel?) set the environment variable
R_WIN_INTERNET2 on your system. See the R Windows FAQ
http://cran.r-project.org/bin/windows/base/rw-FAQ.html
Martin
> Thanks for the reply.
>
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
From thomas.girke at ucr.edu Tue May 22 19:58:47 2012
From: thomas.girke at ucr.edu (Thomas Girke)
Date: Tue, 22 May 2012 10:58:47 -0700
Subject: [Bioc-devel] read.XStringSet with spaces in or at end of sequence
Message-ID: <20120522175847.GA730@genomics-59-108.bulk.ucr.edu>
Currently, spaces in sequences are handled inconsistently by the FASTA
read functions in Biostrings. This applies to spaces in or at the end of
sequence strings. Because of this users often think Biostrings cannot
handle their sequence data and give up using it which I find
unfortunate.
For instance, given this sequence stored in "test.fasta":
>123
AATTTAAA GGGG
read.DNAStringSet fails to import this sequence which is the
least desirable outcome.
> read.DNAStringSet("test.fasta")
Error in .Call2("read_fasta_in_XStringSet", efp_list, nrec, skip, use.names, :
key 32 (char ' ') not in lookup table
however, read.AAStringSet imports it but maintains the space
> read.AAStringSet("test.fasta")
A AAStringSet instance of length 1
width seq names
[1] 13 AATTTAAA GGGG 123
Wouldn't it make most sense to remove/ignore spaces during the import?
Thomas
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Biostrings_2.24.1 IRanges_1.14.2 BiocGenerics_0.2.0
loaded via a namespace (and not attached):
[1] stats4_2.15.0 tools_2.15.0
From hpages at fhcrc.org Tue May 22 21:39:14 2012
From: hpages at fhcrc.org (Hervé Pagès)
Date: Tue, 22 May 2012 12:39:14 -0700
Subject: [Bioc-devel] read.XStringSet with spaces in or at end of
sequence
In-Reply-To: <20120522175847.GA730@genomics-59-108.bulk.ucr.edu>
References: <20120522175847.GA730@genomics-59-108.bulk.ucr.edu>
Message-ID: <4FBBEB62.1050102@fhcrc.org>
Hi Thomas,
On 05/22/2012 10:58 AM, Thomas Girke wrote:
> Currently, spaces in sequences are handled inconsistently by the FASTA
> read functions in Biostrings. This applies to spaces in or at the end of
> sequence strings. Because of this users often think Biostrings cannot
> handle their sequence data and give up using it which I find
> unfortunate.
>
> For instance, given this sequence stored in "test.fasta":
>> 123
> AATTTAAA GGGG
>
> read.DNAStringSet fails to import this sequence which is the
> least desirable outcome.
>
>> read.DNAStringSet("test.fasta")
> Error in .Call2("read_fasta_in_XStringSet", efp_list, nrec, skip, use.names, :
> key 32 (char ' ') not in lookup table
>
> however, read.AAStringSet imports it but maintains the space
>
>> read.AAStringSet("test.fasta")
> A AAStringSet instance of length 1
> width seq names
> [1] 13 AATTTAAA GGGG 123
Note that this doesn't fail because the letters in an AAStringSet
object can be anything right now, but it's on my TODO list to change
this i.e. it will become an error to try to store a letter in an
AAStringSet that doesn't belong to the Amino Acid alphabet (stored
in predefined constant AA_ALPHABET).
So the import function to use when one doesn't want to enforce a
particular alphabet is read.BStringSet():
> read.BStringSet("test.fasta")
A BStringSet instance of length 1
width seq names
[1] 13 AATTTAAA GGGG 123
The other functions in the family (i.e. read.DNAStringSet,
read.RNAStringSet, and read.AAStringSet) will fail if the FASTA file
contains letters that are not in DNA_ALPHABET, RNA_ALPHABET, or
AA_ALPHABET, respectively.
>
> Wouldn't it make most sense to remove/ignore spaces during the import?
According to Wikipedia
http://en.wikipedia.org/wiki/FASTA_format
yes the spaces and any other invalid code should be ignored. My concern
with this behavior though is that removing/ignoring letters in the input
will shift the positions of all the remaining letters, which for
some use cases is not desirable (maybe everything is fine because all
the letters end up at the right position anyway, but maybe not, hard
to tell without knowing why a space was inserted in the file in the
first place).
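To make the position-shift concern concrete, here is a small base-R
sketch (no Biostrings needed) using the sequence line from the example
above. It only illustrates the two options under discussion, dropping
the space versus substituting a placeholder letter; it is not how the
read functions work internally.

```r
## Sequence line from test.fasta, with the embedded space.
raw <- "AATTTAAA GGGG"

## Option 1: drop the space; every letter after it moves one position left.
dropped <- gsub(" ", "", raw, fixed = TRUE)

## Option 2: substitute the gap letter "-"; positions are preserved.
gapped <- gsub(" ", "-", raw, fixed = TRUE)

## Position of the first "G" in each version.
pos <- c(raw     = as.integer(regexpr("G", raw, fixed = TRUE)),
         dropped = as.integer(regexpr("G", dropped, fixed = TRUE)),
         gapped  = as.integer(regexpr("G", gapped, fixed = TRUE)))
print(pos)

## Widths of the three versions.
print(nchar(c(raw, dropped, gapped)))
```

The first G sits at position 10 in the raw and gapped strings but at
position 9 once the space is dropped, which is the kind of shift that
could matter if coordinates were computed against the original file.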
Note that we have special letters in the DNA/RNA/AA alphabets that
could be used as a replacement for invalid chars:
> DNA_ALPHABET
[1] "A" "C" "G" "T" "M" "R" "W" "S" "Y" "K" "V" "H" "D" "B" "N" "-" "+"
> RNA_ALPHABET
[1] "A" "C" "G" "U" "M" "R" "W" "S" "Y" "K" "V" "H" "D" "B" "N" "-" "+"
> AA_ALPHABET
 [1] "A" "R" "N" "D" "C" "Q" "E" "G" "H" "I" "L" "K" "M" "F" "P" "S" "T" "W" "Y"
[20] "V" "U" "B" "Z" "X" "*" "-" "+"
"-" stands for "gap" and "+" is used for hard masking. IMO they are
both reasonable candidates. I propose to add an extra arg (e.g.
if.invalid.char) to read.DNAStringSet, read.RNAStringSet, and
read.AAStringSet to let the user choose what the substitution letter
should be, e.g. if.invalid.char="+", or if.invalid.char="" (for
removing the invalid letters).
Now should we set its default to "" (and strictly follow the FASTA
spec), or should we set it to NA so by default an error would still
be raised if the file contains invalid chars? I prefer the latter
because I think it's good to let the user know that there is something
uncommon (at best) or potentially wrong with the file.
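Until something like this exists, the proposed semantics can be
approximated by cleaning the file before import. The sketch below is
plain base R; clean_fasta_lines() is a hypothetical helper (not part of
Biostrings) that borrows the argument name if.invalid.char from the
proposal above, and the alphabet is copied by hand from the
DNA_ALPHABET output shown earlier. It is case-sensitive for simplicity.

```r
## Hypothetical helper, NOT part of Biostrings: mimics the proposed
## 'if.invalid.char' semantics on the lines of a FASTA file.
##   NA           -> stop with an error if an invalid character is found
##   ""           -> remove invalid characters
##   "-", "+" ... -> substitute that letter for each invalid character
clean_fasta_lines <- function(lines, alphabet, if.invalid.char = NA) {
    is_header <- startsWith(lines, ">")
    cleaned <- vapply(lines[!is_header], function(s) {
        chars <- strsplit(s, "", fixed = TRUE)[[1]]
        bad <- !(chars %in% alphabet)
        if (any(bad)) {
            if (is.na(if.invalid.char))
                stop("invalid character(s) in sequence: ",
                     paste(sQuote(unique(chars[bad])), collapse = ", "))
            chars[bad] <- if.invalid.char
        }
        paste(chars, collapse = "")
    }, character(1), USE.NAMES = FALSE)
    lines[!is_header] <- cleaned
    lines
}

## Hand-copied from the DNA_ALPHABET output above (an assumption of this sketch).
dna_alphabet <- c("A", "C", "G", "T", "M", "R", "W", "S", "Y", "K",
                  "V", "H", "D", "B", "N", "-", "+")

fasta <- c(">123", "AATTTAAA GGGG")

removed <- clean_fasta_lines(fasta, dna_alphabet, if.invalid.char = "")
masked  <- clean_fasta_lines(fasta, dna_alphabet, if.invalid.char = "+")
## Default (NA): the error is caught here only to show the message.
strict  <- tryCatch(clean_fasta_lines(fasta, dna_alphabet),
                    error = function(e) conditionMessage(e))

print(removed)
print(masked)
print(strict)
```

The cleaned lines could then be written to a temporary file with
writeLines() and passed to read.DNAStringSet(). Keeping NA as the
default mirrors the preference stated above: nothing is silently
altered unless the user asks for it.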
Thanks for your feedback,
H.
>
> Thomas
>
>> sessionInfo()
> R version 2.15.0 (2012-03-30)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] Biostrings_2.24.1 IRanges_1.14.2 BiocGenerics_0.2.0
>
> loaded via a namespace (and not attached):
> [1] stats4_2.15.0 tools_2.15.0
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
From thomas.girke at ucr.edu Tue May 22 22:35:41 2012
From: thomas.girke at ucr.edu (Thomas Girke)
Date: Tue, 22 May 2012 13:35:41 -0700
Subject: [Bioc-devel] read.XStringSet with spaces in or at end of
sequence
In-Reply-To: <486d7f5ce8ed4931aaf8d6319acae968@EXCH-HT-2.exch.ucr.edu>
References: <20120522175847.GA730@genomics-59-108.bulk.ucr.edu>
<486d7f5ce8ed4931aaf8d6319acae968@EXCH-HT-2.exch.ucr.edu>
Message-ID: <20120522203541.GA1069@genomics-59-108.bulk.ucr.edu>
Hervé,
I agree, an argument where the user has to explicitly decide how to handle
unusual characters (e.g. if.invalid.char="") would solve this in the most
sensible manner.
Thomas
On Tue, May 22, 2012 at 07:39:14PM +0000, Hervé Pagès wrote:
> Hi Thomas,
>
> On 05/22/2012 10:58 AM, Thomas Girke wrote:
> > Currently, spaces in sequences are handled inconsistently by the FASTA
> > read functions in Biostrings. This applies to spaces in or at the end of
> > sequence strings. Because of this users often think Biostrings cannot
> > handle their sequence data and give up using it which I find
> > unfortunate.
> >
> > For instance, given this sequence stored in "test.fasta":
> >> 123
> > AATTTAAA GGGG
> >
> > read.DNAStringSet fails to import this sequence which is the
> > least desirable outcome.
> >
> >> read.DNAStringSet("test.fasta")
> > Error in .Call2("read_fasta_in_XStringSet", efp_list, nrec, skip, use.names, :
> > key 32 (char ' ') not in lookup table
> >
> > however, read.AAStringSet imports it but maintains the space
> >
> >> read.AAStringSet("test.fasta")
> > A AAStringSet instance of length 1
> > width seq names
> > [1] 13 AATTTAAA GGGG 123
>
> Note that this doesn't fail because the letters in an AAStringSet
> object can be anything right now, but it's on my TODO list to change
> this i.e. it will become an error to try to store a letter in an
> AAStringSet that doesn't belong to the Amino Acid alphabet (stored
> in predefined constant AA_ALPHABET).
>
> So the import function to use when one doesn't want to enforce a
> particular alphabet is read.BStringSet():
>
> > read.BStringSet("test.fasta")
> A BStringSet instance of length 1
> width seq names
>
> [1] 13 AATTTAAA GGGG 123
>
> The other functions in the family (i.e. read.DNAStringSet,
> read.RNAStringSet, and read.AAStringSet) will fail if the FASTA file
> contains letters that are not in DNA_ALPHABET, RNA_ALPHABET, or
> AA_ALPHABET, respectively.
>
> >
> > Wouldn't it make most sense to remove/ignore spaces during the import?
>
> According to Wikipedia
>
> http://en.wikipedia.org/wiki/FASTA_format
>
> yes the spaces and any other invalid code should be ignored. My concern
> with this behavior though is that removing/ignoring letters in the input
> will shift the positions of all the remaining letters, which for
> some use cases is not desirable (maybe everything is fine because all
> the letters end up at the right position anyway, but maybe not, hard
> to tell without knowing why a space was inserted in the file in the
> first place).
>
> Note that we have special letters in the DNA/RNA/AA alphabets that
> could be used as a replacement for invalid chars:
>
> > DNA_ALPHABET
> [1] "A" "C" "G" "T" "M" "R" "W" "S" "Y" "K" "V" "H" "D" "B" "N" "-" "+"
> > RNA_ALPHABET
> [1] "A" "C" "G" "U" "M" "R" "W" "S" "Y" "K" "V" "H" "D" "B" "N" "-" "+"
> > AA_ALPHABET
>  [1] "A" "R" "N" "D" "C" "Q" "E" "G" "H" "I" "L" "K" "M" "F" "P" "S" "T" "W" "Y"
> [20] "V" "U" "B" "Z" "X" "*" "-" "+"
>
> "-" stands for "gap" and "+" is used for hard masking. IMO they are
> both reasonable candidates. I propose to add an extra arg (e.g.
> if.invalid.char) to read.DNAStringSet, read.RNAStringSet, and
> read.AAStringSet to let the user choose what the substitution letter
> should be, e.g. if.invalid.char="+", or if.invalid.char="" (for
> removing the invalid letters).
>
> Now should we set its default to "" (and strictly follow the FASTA
> spec), or should we set it to NA so by default an error would still
> be raised if the file contains invalid chars? I prefer the latter
> because I think it's good to let the user know that there is something
> uncommon (at best) or potentially wrong with the file.
>
> Thanks for your feedback,
> H.
>
>
> >
> > Thomas
> >
> >> sessionInfo()
> > R version 2.15.0 (2012-03-30)
> > Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> >
> > locale:
> > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> >
> > attached base packages:
> > [1] stats graphics grDevices utils datasets methods base
> >
> > other attached packages:
> > [1] Biostrings_2.24.1 IRanges_1.14.2 BiocGenerics_0.2.0
> >
> > loaded via a namespace (and not attached):
> > [1] stats4_2.15.0 tools_2.15.0
> >
>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fhcrc.org
> Phone: (206) 667-5791
> Fax: (206) 667-1319
From lgoff at csail.mit.edu Wed May 23 23:54:13 2012
From: lgoff at csail.mit.edu (lgoff at csail.mit.edu)
Date: Wed, 23 May 2012 17:54:13 -0400
Subject: [Bioc-devel] Package download stats inflated? (specifically
cummeRbund)
Message-ID: <20120523175413.15365d9suiqna64l@webmail.csail.mit.edu>
Hi Bioc-devel,
I am the package maintainer for the cummeRbund package and since I'm
not exactly sure to whom I should ask this question, I decided to post
to the bioc-devel list.
Since this is my first Bioc package I have been keenly interested in
the download stats that are tracked and visible on the Bioconductor
website, here:
http://bioconductor.org/packages/stats/index.html
Specifically, I'm noticing that the number of downloads for the
cummeRbund package seems to far outpace the number of unique IP
addresses downloading the package:
http://bioconductor.org/packages/stats/bioc/cummeRbund.html
For a few months there was a mean of between 10-20 downloads per
unique IP address, and for the current month this is on track to be
about 36 downloads/IP (and looks to be about 8.7% of the total BioC
packages downloaded this month so far). Looking around at several
other packages, this does not seem to be the case as most of the
packages in the top 30 list have a ratio of about 1.8-3 downloads / IP.
As ecstatic as these numbers make me, I'm certain that there is some
underlying reason for this inflation that is not being appropriately
represented here, but without anything else to go on, I'm not really
sure where this is coming from. I would obviously like to have an
honest representation of the number of downloads for my package, and I
was hoping that someone with access to these data could help me track
down the cause of this download inflation (unless these numbers are a
true representation of the downloads, and then I would also very much
like to find out more demographics if possible as well).
Any and all advice or information is appreciated! Thanks to all, and
a special thanks to everyone that helps to keep BioC such an amazing
project. I have enjoyed the benefits of bioconductor for the past 5+
years and I'm very happy that I can finally start to contribute back
to this wonderful project. (Also, I look forward to meeting some of
you at BioC 2012 this year!)
Thanks in advance!
Cheers,
Loyal Goff
(lgoff at csail.mit.edu)
NSF Postdoctoral Fellow
Computer Science and Artificial Intelligence Laboratory, MIT &
Stem Cells and Regenerative Biology Department, Harvard University &
The Broad Institute
From julian.gehring at embl.de Thu May 24 14:32:44 2012
From: julian.gehring at embl.de (Julian Gehring)
Date: Thu, 24 May 2012 14:32:44 +0200
Subject: [Bioc-devel] ShortRead: 'qa' fails for single read alignments
Message-ID: <4FBE2A6C.7080700@embl.de>
Hi,
while using the 'ShortRead' package for some quality assessment of
aligned reads (see example below), I observed the following behavior:
## Example code ##
library(ShortRead)
qa1 <- qa(dirPath="tmp/", pattern="*sub.bam", type="BAM")
report_html(qa1, dest="out")
##
1. For R-2.14.0, the report is built as expected (see
http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/perCycleBaseCall-R-2.14.0.pdf
for a comparison).
2. For R-2.15.0, the cycle-specific base calls and read quality plot
looks mixed up (see
http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/perCycleBaseCall-R-2.15.0.pdf).
3. For R-2.16.0devel (2012-05-24 r59439), the 'qa' command fails with
the error message:
""
Error: ValueUnavailable
0 elements returned; expected >=1
In addition: Warning message:
UnspecifiedWarning
elements: 1 2 3 4
UnspecifiedError: bamFlagTest(flag, "isValidVendorRead")
'is' must be character(1) in 'isPaired' 'isProperPair'
'isUnmappedQuery' 'hasUnmappedMate' 'isMinusStrand' 'isMateMinusStrand'
'isFirstMateRead' 'isSecondMateRead' 'isNotPrimaryRead'
'isNotPassingQualityControls' 'isDuplicate'
UnspecifiedError: bamFlagTest(flag, "isValidVendorRead")
'is' must be character(1) in 'isPaired' 'isProperPair'
'isUnmappedQuery' 'hasUnmappedMate' 'isMinusStrand' 'isMateMinusStrand'
'isFirstMateRead' 'isSecondMateRead' 'isNotPrimaryRead'
'isNotPassingQualityControls' 'isDuplicate'
UnspecifiedError: bamFlagTest(flag, "isValidVendorRead")
'is' must be character(1) in 'isPaired' 'isProperPair'
'isUnmappedQuery' 'hasUnmappedMate' 'isMinusStrand' 'isMateMinusStrand'
'isFirstMateRead' 'isSecondMateRead' 'isNotPrimaryRead'
'isNotPassingQualityControls' 'isDuplicate'
UnspecifiedError: bamFlagTest(flag, "isValidVendorRead")
'is' must be character(1) in 'isPaired' [... truncated]
""
See also
- http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/session-info-2.14.txt
- http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/session-info-2.15.txt
- http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/session-info-2.16.txt
for the corresponding session infos.
Can this be caused by having BAM files with single-read alignments?
Also, I'm not sure if the different behavior for R-2.15 and R-2.16 is
directly related.
Best
Julian
From mtmorgan at fhcrc.org Thu May 24 17:30:22 2012
From: mtmorgan at fhcrc.org (Martin Morgan)
Date: Thu, 24 May 2012 08:30:22 -0700
Subject: [Bioc-devel] ShortRead: 'qa' fails for single read alignments
In-Reply-To: <4FBE2A6C.7080700@embl.de>
References: <4FBE2A6C.7080700@embl.de>
Message-ID: <4FBE540E.1020900@fhcrc.org>
Thanks Julian -- these were separate issues (3 was from a recent change
in Rsamtools, the other is more long-standing). Corrected in svn and
1.14.4 / 1.15.6 when these become available.
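A quick way to tell whether an installed copy already includes the
fix is to compare version numbers. The sketch below assumes that
1.14.4 / 1.15.6 are ShortRead versions on the release (1.14.x) and
devel (1.15.x) branches; the message does not name the package
explicitly, and has_fix() is a hypothetical helper written for
illustration only.

```r
## Hypothetical helper: TRUE if 'installed' is at or beyond the fixed
## version on its own branch (assumed: 1.14.x = release, 1.15.x = devel).
has_fix <- function(installed) {
    v <- package_version(installed)
    if (v >= "1.15.0") v >= "1.15.6" else v >= "1.14.4"
}

print(has_fix("1.14.3"))
print(has_fix("1.14.4"))
print(has_fix("1.15.5"))
print(has_fix("1.15.6"))

## With ShortRead installed, the check would be:
## has_fix(packageVersion("ShortRead"))
```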
Martin
On 05/24/2012 05:32 AM, Julian Gehring wrote:
> Hi,
>
> while using the 'ShortRead' package for some quality assessment of
> aligned reads (see example below), I observed the following behavior:
>
> ## Example code ##
>
> library(ShortRead)
> qa1 <- qa(dirPath="tmp/", pattern="*sub.bam", type="BAM")
> report_html(qa1, dest="out")
>
> ##
>
> 1. For R-2.14.0, the report is built as expected (see
> http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/perCycleBaseCall-R-2.14.0.pdf
> for a comparison).
>
> 2. For R-2.15.0, the cycle-specific base calls and read quality plot
> looks mixed up (see
> http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/perCycleBaseCall-R-2.15.0.pdf).
>
>
> 3. For R-2.16.0devel (2012-05-24 r59439), the 'qa' command fails with
> the error message:
> ""
> Error: ValueUnavailable
> 0 elements returned; expected >=1
> In addition: Warning message:
> UnspecifiedWarning
> elements: 1 2 3 4
> UnspecifiedError: bamFlagTest(flag, "isValidVendorRead")
> 'is' must be character(1) in 'isPaired' 'isProperPair' 'isUnmappedQuery'
> 'hasUnmappedMate' 'isMinusStrand' 'isMateMinusStrand' 'isFirstMateRead'
> 'isSecondMateRead' 'isNotPrimaryRead' 'isNotPassingQualityControls'
> 'isDuplicate'
> UnspecifiedError: bamFlagTest(flag, "isValidVendorRead")
> 'is' must be character(1) in 'isPaired' 'isProperPair' 'isUnmappedQuery'
> 'hasUnmappedMate' 'isMinusStrand' 'isMateMinusStrand' 'isFirstMateRead'
> 'isSecondMateRead' 'isNotPrimaryRead' 'isNotPassingQualityControls'
> 'isDuplicate'
> UnspecifiedError: bamFlagTest(flag, "isValidVendorRead")
> 'is' must be character(1) in 'isPaired' 'isProperPair' 'isUnmappedQuery'
> 'hasUnmappedMate' 'isMinusStrand' 'isMateMinusStrand' 'isFirstMateRead'
> 'isSecondMateRead' 'isNotPrimaryRead' 'isNotPassingQualityControls'
> 'isDuplicate'
> UnspecifiedError: bamFlagTest(flag, "isValidVendorRead")
> 'is' must be character(1) in 'isPaired' [... truncated]
> ""
>
> See also
> - http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/session-info-2.14.txt
> - http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/session-info-2.15.txt
> - http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/session-info-2.16.txt
> for the corresponding session infos.
>
> Can this be caused by having BAM files with single-read alignments?
> Also, I'm not sure if the different behavior for R-2.15 and R-2.16 is
> directly related.
>
>
> Best
> Julian
>
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
From hpages at fhcrc.org Thu May 24 22:15:56 2012
From: hpages at fhcrc.org (Hervé Pagès)
Date: Thu, 24 May 2012 13:15:56 -0700
Subject: [Bioc-devel] Package download stats inflated? (specifically
cummeRbund)
In-Reply-To: <20120523175413.15365d9suiqna64l@webmail.csail.mit.edu>
References: <20120523175413.15365d9suiqna64l@webmail.csail.mit.edu>
Message-ID: <4FBE96FC.103@fhcrc.org>
Hi Loyal,
The high ratio between the number of downloads and the number of unique IPs should
not be a reason to doubt that these numbers are a true representation
of the downloads. We've already seen this before. See for example the
stats for the ChIPpeakAnno package:
http://bioconductor.org/packages/stats/bioc/ChIPpeakAnno.html
The package got downloaded 67k times in Oct/Nov 2011 from only 573
distinct IPs, so here the ratio is 117 downloads / IP.
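For reference, the arithmetic behind that figure, as a two-line R
check. The download count is the rounded "67k" quoted above, so the
result is approximate.

```r
downloads    <- 67000   # "67k", rounded in the message above
distinct_ips <- 573

ratio <- downloads / distinct_ips
print(round(ratio))
```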
The first time we saw this kind of massive repetitive downloads was
for the biomaRt package more than 1 year ago. We investigated it and
discovered that most downloads (> 95%) were coming from a single IP
(the IP itself was from a University somewhere in the US). We don't
know for sure why they needed to download the same package again and
again thousands of times every day for more than 20 days in a row, but
one explanation could be that they were using some kind of dumb script
to install biomaRt on each node of a big cluster. What's strange though
is that we saw the deluge of downloads for a single package (biomaRt)
and not for a subset of Bioconductor packages (it sounds to me that
the people in charge of a cluster would typically install more than
1 BioC package). But maybe they were testing a script on 1 package,
then realized they could improve it (to download each package only
once), and then used the improved script to actually deploy Bioconductor
on their cluster. Hard to know...
Anyway, because those massive repetitive downloads are possible, maybe
we should put more emphasis on the number of distinct IPs. This number is
probably more representative of the number of users and therefore is
a better indicator of how much a package is actually used.
Cheers,
H.
On 05/23/2012 02:54 PM, lgoff at csail.mit.edu wrote:
> Hi Bioc-devel,
> I am the package maintainer for the cummeRbund package and since I'm not
> exactly sure to whom I should ask this question, I decided to post to
> the bioc-devel list.
>
> Since this is my first Bioc package I have been keenly interested in the
> download stats that are tracked and visible on the Bioconductor website,
> here:
>
> http://bioconductor.org/packages/stats/index.html
>
> Specifically, I'm noticing that the number of downloads for the
> cummeRbund package seems to far outpace the number of unique IP
> addresses downloading the package:
>
> http://bioconductor.org/packages/stats/bioc/cummeRbund.html
>
> For a few months there was a mean of between 10-20 downloads per unique
> IP address, and for the current month this is on track to be about 36
> downloads/IP (and looks to be about 8.7% of the total BioC packages
> downloaded this month so far). Looking around at several other packages,
> this does not seem to be the case as most of the packages in the top 30
> list have a ratio of about 1.8-3 downloads / IP.
>
> As ecstatic as these numbers make me, I'm certain that there is some
> underlying reason for this inflation that is not being appropriately
> represented here, but without anything else to go on, I'm not really
> sure where this is coming from. I would obviously like to have an honest
> representation of the number of downloads for my package, and I was
> hoping that someone with access to these data could help me track down
> the cause of this download inflation (unless these numbers are a true
> representation of the downloads, and then I would also very much like to
> find out more demographics if possible as well).
>
> Any and all advice or information is appreciated! Thanks to all, and a
> special thanks to everyone that helps to keep BioC such an amazing
> project. I have enjoyed the benefits of bioconductor for the past 5+
> years and I'm very happy that I can finally start to contribute back to
> this wonderful project. (Also, I look forward to meeting some of you at
> BioC 2012 this year!)
>
> Thanks in advance!
>
> Cheers,
>
> Loyal Goff
>
> (lgoff at csail.mit.edu)
> NSF Postdoctoral Fellow
> Computer Science and Artificial Intelligence Laboratory, MIT &
> Stem Cells and Regenerative Biology Department, Harvard University &
> The Broad Institute
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
From julian.gehring at embl.de Mon May 28 00:39:46 2012
From: julian.gehring at embl.de (Julian Gehring)
Date: Mon, 28 May 2012 00:39:46 +0200
Subject: [Bioc-devel] ShortRead: 'qa' fails for single read alignments
In-Reply-To: <4FBE540E.1020900@fhcrc.org>
References: <4FBE2A6C.7080700@embl.de> <4FBE540E.1020900@fhcrc.org>
Message-ID: <4FC2AD32.3010600@embl.de>
Hi Martin,
thanks for fixing these issues so quickly - the reports are now built
without problems.
Best
Julian
On 05/24/2012 05:30 PM, Martin Morgan wrote:
> Thanks Julian -- these were separate issues (3 was from a recent change
> in Rsamtools, the other is more long-standing). Corrected in svn and
> 1.14.4 / 1.15.6 when these become available.
>
> Martin
>
> On 05/24/2012 05:32 AM, Julian Gehring wrote:
>> Hi,
>>
>> while using the 'ShortRead' package for some quality assessment of
>> aligned reads (see example below), I observed the following behavior:
>>
>> ## Example code ##
>>
>> library(ShortRead)
>> qa1 <- qa(dirPath="tmp/", pattern="*sub.bam", type="BAM")
>> report_html(qa1, dest="out")
>>
>> ##
>>
>> 1. For R-2.14.0, the report is built as expected (see
>> http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/perCycleBaseCall-R-2.14.0.pdf
>>
>> for a comparison).
>>
>> 2. For R-2.15.0, the cycle-specific base calls and read quality plot
>> looks mixed up (see
>> http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/perCycleBaseCall-R-2.15.0.pdf).
>>
>>
>>
>> 3. For R-2.16.0devel (2012-05-24 r59439), the 'qa' command fails with
>> the error message:
>> ""
>> Error: ValueUnavailable
>> 0 elements returned; expected >=1
>> In addition: Warning message:
>> UnspecifiedWarning
>> elements: 1 2 3 4
>> UnspecifiedError: bamFlagTest(flag, "isValidVendorRead")
>> 'is' must be character(1) in 'isPaired' 'isProperPair' 'isUnmappedQuery'
>> 'hasUnmappedMate' 'isMinusStrand' 'isMateMinusStrand' 'isFirstMateRead'
>> 'isSecondMateRead' 'isNotPrimaryRead' 'isNotPassingQualityControls'
>> 'isDuplicate'
>> UnspecifiedError: bamFlagTest(flag, "isValidVendorRead")
>> 'is' must be character(1) in 'isPaired' 'isProperPair' 'isUnmappedQuery'
>> 'hasUnmappedMate' 'isMinusStrand' 'isMateMinusStrand' 'isFirstMateRead'
>> 'isSecondMateRead' 'isNotPrimaryRead' 'isNotPassingQualityControls'
>> 'isDuplicate'
>> UnspecifiedError: bamFlagTest(flag, "isValidVendorRead")
>> 'is' must be character(1) in 'isPaired' 'isProperPair' 'isUnmappedQuery'
>> 'hasUnmappedMate' 'isMinusStrand' 'isMateMinusStrand' 'isFirstMateRead'
>> 'isSecondMateRead' 'isNotPrimaryRead' 'isNotPassingQualityControls'
>> 'isDuplicate'
>> UnspecifiedError: bamFlagTest(flag, "isValidVendorRead")
>> 'is' must be character(1) in 'isPaired' [... truncated]
>> ""
>>
>> See also
>> -
>> http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/session-info-2.14.txt
>> -
>> http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/session-info-2.15.txt
>> -
>> http://www.ebi.ac.uk/~jgehring/share/shortRead-pkg/session-info-2.16.txt
>> for the corresponding session infos.
>>
>> Can this be caused by having BAM files with single-read alignments?
>> Also, I'm not sure if the different behavior for R-2.15 and R-2.16 is
>> directly related.
>>
>>
>> Best
>> Julian
>>
>
>
From michael.d.linderman at gmail.com Mon May 28 17:52:11 2012
From: michael.d.linderman at gmail.com (Michael Linderman)
Date: Mon, 28 May 2012 11:52:11 -0400
Subject: [Bioc-devel] Differences in packages installed on Windows vs. other
build system hosts?
Message-ID: <898829A0-3867-4865-A59F-73AC7E67198A@gmail.com>
Hi BioC,
A package of mine, Spade, is failing to build in the development release on Windows due to a missing dependency (the recently released igraph0 package), but it builds without issue on Linux and OSX. Are there different packages installed on the different hosts? The package of interest is available for Windows from CRAN. How often are the package sets on the build machines updated?
Thanks,
Michael Linderman
From dtenenba at fhcrc.org Mon May 28 19:27:54 2012
From: dtenenba at fhcrc.org (Dan Tenenbaum)
Date: Mon, 28 May 2012 10:27:54 -0700
Subject: [Bioc-devel] Differences in packages installed on Windows vs.
other build system hosts?
In-Reply-To: <15913_1338220365_4FC39F4D_15913_3124_1_898829A0-3867-4865-A59F-73AC7E67198A@gmail.com>
References: <15913_1338220365_4FC39F4D_15913_3124_1_898829A0-3867-4865-A59F-73AC7E67198A@gmail.com>
Message-ID:
Hi Michael,
On Mon, May 28, 2012 at 8:52 AM, Michael Linderman
wrote:
> Hi BioC,
>
> A package of mine, Spade, is failing to build in the development release on Windows due to a missing dependency (the recently released igraph0 package), but it builds without issue on Linux and OSX. Are there different packages installed on the different hosts? The package of interest is available for Windows from CRAN. How often are the package sets on the build machines updated?
>
This problem should be solved in the next build cycle (tomorrow
morning shortly after 9AM Seattle time).
Dependencies are updated on every build system prior to each build.
There was an issue building igraph0 from source on Windows (as there
was with igraph). I've instructed the build system to install a binary
version instead.
You can see which packages are installed on a given build system by
clicking on the link under "Installed pkgs" for that system. For
example:
http://bioconductor.org/checkResults/devel/bioc-LATEST/moscato1-R-instpkgs.html
This page shows what is installed on moscato1 (a Windows build system). It doesn't
show igraph0 yet but it will after the next build cycle.
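The same kind of check can be run on any local machine to see whether
a dependency is present in the R library. This is a minimal base-R
sketch; missing_pkgs() is a hypothetical helper name, and it inspects
the local library only, not the build system.

```r
## Hypothetical helper: report which of 'pkgs' are absent from the
## library paths of the current R session.
missing_pkgs <- function(pkgs) {
    installed <- rownames(installed.packages())
    setdiff(pkgs, installed)
}

## "stats" ships with R; "igraph0" is the dependency discussed above.
print(missing_pkgs(c("stats", "igraph0")))
```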
Thanks,
Dan
> Thanks,
>
> Michael Linderman