[BioC] ensemblVEP, variant_effect_predictor versions and release schedule

Valerie Obenchain vobencha at fhcrc.org
Mon Dec 30 21:32:50 CET 2013


Hi Thomas,

I've started to add support for versions 67 and 73+.

I assume you're still using version 67 and have the data cached. How are 
you calling the script right now? Do you use the --cache flag or 
--offline flag? Also, please remind me of (point me to) the plug-in 
you're using so I can test that.
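
For reference, here is a rough sketch of the two invocation styles I'm
asking about. It only prints the command lines rather than running them.
The script path, the file names and the vep_command helper are placeholders
made up for illustration, and the flags a given release accepts should be
checked against that release's documentation.

```shell
#!/bin/sh
# Placeholder path to the perl script; adjust to the local installation.
VEP_SCRIPT="./variant_effect_predictor.pl"

# vep_command MODE INPUT OUTPUT
# Prints (does not run) a command line for the requested mode.
#   cache   -> --cache   : read annotation from the local cache
#   offline -> --offline : use the cache and make no database connections
vep_command() {
    case "$1" in
        cache)   mode_flag="--cache" ;;
        offline) mode_flag="--offline" ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
    echo "perl $VEP_SCRIPT $mode_flag -i $2 -o $3"
}

# Hypothetical input and output file names.
vep_command cache   variants.vcf cache_output.txt
vep_command offline variants.vcf offline_output.txt
```

Knowing which of the two lines is closer to your current call tells me
which code path to test first.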

Thanks.
Valerie


On 10/10/2013 04:20 PM, Valerie Obenchain wrote:
> Hi all,
>
> As of this release ensemblVEP will support multiple versions of the
> script. This will start with version 73 and move forward. One old
> version (67) will be included as a backwards test case.
>
> The plan is for only the most current version to have the option to
> query the web. Past versions will query data the user has cached. If you
> are using an old version of the script hopefully you have saved the
> appropriate version of the data.
>
> Older versions of the data are available for download but they are not
> in the same format as the data downloaded when you 'create a cache'. This
> prevents the use of the variant_effect_predictor.pl script to query old
> versions. So, we'll plan to save the cached data from 73 forward.
>
> Val
>
>
>
> On 10/05/2013 07:14 AM, Vincent Carey wrote:
>> The basic situation seems reminiscent of MLInterfaces.  We want to bridge
>> between a familiar environment (R/bioc) and some useful utilities that
>> need not be in sync with bioc.
>>
>> I don't know much about the VEP perl utility.  It seems to me that it
>> could be easy to support multiple local versions of the _perl script_ by
>> exposing
>>
>> ensemblVEP:::.getVepPath()
>>
>> and allowing the user to define the specific script to be called.
>>
>> This part:
>>
>> 'VEPParam(basic=basicOpts(), input=inputOpts(), cache=cacheOpts(),
>>            output=outputOpts(), identifier=identifierOpts(),
>>            colocatedVariants=colocatedVariantsOpts(),
>>            dataformat=dataformatOpts(), filterqc=filterqcOpts(),
>>            database=databaseOpts(), advanced=advancedOpts(), ...)'
>>
>> is a nice way of organizing the extensive command line options, but as
>> the utility evolves, this may prove fragile. It could also be made more
>> flexible, but use cases and specific issues with multiple local versions
>> should be spelled out.
>>
>>
>>
>> On Sat, Oct 5, 2013 at 8:37 AM, Michael Lawrence
>> <lawrence.michael at gene.com> wrote:
>>
>>     On Fri, Oct 4, 2013 at 3:30 PM, Valerie Obenchain
>>     <vobencha at fhcrc.org> wrote:
>>
>>      > Hi Thomas,
>>      >
>>      > I agree the changes to the package have been rapid and I apologize
>>      > for causing you grief. The question you ask is a good one. It's
>>      > difficult to know how to best 'freeze' this package for a given
>>      > release given that both the data and the api are changing (and not
>>      > necessarily in sync).
>>      >
>>      > We are investigating other approaches and hope to have a solution
>>      > soon. ensemblVEP is not the only package that falls in this
>>      > category so the solution needs to have a wider scope. One idea is
>>      > to move ensemblVEP to AnnotationHub. The idea would be that the
>>      > user would no longer need the perl script locally. This would allow
>>      > us to create a more consistent layer between the user and the
>>      > backend and offer version control.
>>      >
>>      >
>>     But it sounds like this idea would simply defer the problem to the
>>     AnnotationHub side. That is, if someone has a customized Ensembl
>>     database, they will need to keep their local AnnotationHub in sync
>>     (are local hubs supported yet?).  One way forward would be for
>>     ensemblVEP to support multiple versions of the script (as Thomas
>>     suggested). There would be some policy that versions are only
>>     supported for some duration after release.
>>
>>
>>
>>      > Valerie
>>      >
>>      >
>>      > On 10/03/2013 12:52 PM, Thomas Sandmann [guest] wrote:
>>      >
>>      >> Dear Valerie,
>>      >>
>>      >> Currently, the documentation for your ensemblVEP package (BioC
>>      >> 2.13) indicates that variant_effect_predictor.pl from Ensembl
>>      >> release 73 is required.
>>      >>
>>      >> I was wondering if you had any plans to support different versions
>>      >> of Ensembl's variant_effect_predictor.pl script with your
>>      >> ensemblVEP package.
>>      >>
>>      >> As far as I know, Ensembl is on a pretty rapid release schedule,
>>      >> most likely faster than the 6 month schedule of BioC. This is
>>      >> often also too fast for many users (and our company), who update
>>      >> less frequently than Ensembl itself, relying on previous Ensembl
>>      >> and variant_effect_predictor versions. For example, we are
>>      >> currently still using the database schemas and tools of Ensembl 67.
>>      >>
>>      >> The variant_effect_predictor.pl script also evolves with the
>>      >> Ensembl releases. For example, new command line parameters have
>>      >> been added in recent versions (e.g. --database, --dir_cache,
>>      >> --dir_plugins) and others have been modified (e.g. the handling of
>>      >> plugins).
>>      >>
>>      >> Would you mind sharing your thoughts on how you are planning to
>>      >> support past and / or future versions of
>>      >> variant_effect_predictor.pl? For example, are you planning to
>>      >> explicitly support a single version with every ensemblVEP release?
>>      >> Or would it be possible to include a 'version' parameter in
>>      >> ensemblVEP to switch between parameters supported by different
>>      >> versions of the perl script?
>>      >>
>>      >> Thanks a lot for any insights,
>>      >> Thomas
>>      >>
>>      >>   -- output of sessionInfo():
>>      >>
>>      >> NA
>>      >>
>>      >> --
>>      >> Sent via the guest posting facility at bioconductor.org.
>>      >>
>>      >>
>>      > _______________________________________________
>>      > Bioconductor mailing list
>>      > Bioconductor at r-project.org
>>      > https://stat.ethz.ch/mailman/listinfo/bioconductor
>>      > Search the archives:
>>      > http://news.gmane.org/gmane.science.biology.informatics.conductor
>>      >
>>
>>
>


-- 
Valerie Obenchain

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B155
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: vobencha at fhcrc.org
Phone:  (206) 667-3158
Fax:    (206) 667-1319


