[R-pkgs] new version of seqinR
Simon Penel
penel at biomserv.univ-lyon1.fr
Tue Apr 24 10:28:06 CEST 2007
Dear useRs,
The seqinR package is a library of utilities to retrieve and analyse
biological sequences.
A new version of seqinR, seqinR 1.0-7, has been released on CRAN.
Here is a summary of changes:
o A new *experimental* function extractseqs() has been released to download
sequences through zlib-compressed sockets from an ACNUC server.
Preliminary tests suggest that working with about 100,000 CDS is possible
with a home ADSL connection. See chapter 3, page 44 of the manual at
http://pbil.univ-lyon1.fr/software/SeqinR/seqinr_1_0-7.pdf
for some system.time() examples.
o As pointed out by Emmanuel Prestat, the URL used in dia.bactgensize() was
no longer available. This has been fixed in the current version.
o As pointed out by Guy Perriere, the function oriloc() was no longer
compatible with glimmer 3.0 outputs. The function has gained a new argument,
glimmer.version, defaulting to 3; the value 2 is still supported for
backward compatibility with old glimmer outputs.
o As pointed out by Lionel Guy, there was no default value for the as.string
argument in getSequence.SeqFastadna(). A default value of FALSE is now
present for backward compatibility with older code.
o New vectorized utility function stresc() to escape LaTeX special
characters present in a string.
o New low level function readsmj() available.
o A new function readfirstrec() to get the record count of the specified
ACNUC index file is now available.
o Function getType() called without arguments will now use the default
ACNUC database to return available subsequence types.
o Function read.alignment() now also accepts file, in addition to File, as
an argument name.
o A new function rearranged.oriloc() is available. This method, based on
oriloc(), can be used to detect the effect of the replication mechanism on
DNA base composition asymmetry in prokaryotic chromosomes.
o New function extract.breakpoints(), used to extract breakpoints in
rearranged nucleotide skews. This function uses the segmented package to
define the position of the breakpoints.
o New function draw.rearranged.oriloc() available, to plot nucleotide skews
on artificially rearranged prokaryotic chromosomes.
o New function gbk2g2.euk() available. Similarly to gbk2g2(), this function
extracts the coding sequence annotations from a GenBank format file. This
function is specifically designed for eukaryotic sequences, i.e. with
introns. The output file will contain the coordinates of the exons, along
with the name of the CDS to which they belong.
o After an e-mail from Marcelo Bertalan on 26 Mar 2007, a bug in oriloc()
when the gbk argument was NULL was found and fixed by Anamaria Necsulea.
o Functions translate() and getTrans() have gained a new argument, NAstring,
to represent untranslatable amino acids, defaulting to the character "X".
o There was a typo in the total number of printed bases in the ACNUC books:
474,439 should be 526,506.
o Function invers() has been deleted.
o Functions translate(), getTrans() and comp() have gained a new argument,
ambiguous, defaulting to FALSE, to handle ambiguous bases. If TRUE,
ambiguous bases are taken into account so that, for instance, GGN is
translated to Gly in the standard genetic code.
o New function amb() to return the list of nucleotides matching a given
IUPAC nucleotide symbol.
o Function count() has gained a new argument, alphabet, so that oligopeptide
counts are now possible. Thanks to Gabriel Valiente for this suggestion.
The functions zscore(), rho() and summary.SeqFastadna() also have an
alphabet argument, which is forwarded to count().
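
As a rough illustration of the new arguments described above (NAstring,
ambiguous, alphabet) and of amb() and stresc(), here is a minimal sketch. It
assumes seqinR 1.0-7 or later is installed from CRAN. The sequences and
strings are made up for the example and do not come from the package, and
only the arguments named in this announcement are passed by name.

```r
# Assumption: seqinR 1.0-7 or later is installed.
library(seqinr)

# Made-up coding sequence; the second codon ends with the ambiguous base n.
cds <- s2c("atgggntaa")

# By default ambiguous bases are not handled, so the codon ggn is
# reported with the NAstring placeholder.
translate(cds)
translate(cds, NAstring = "?")

# With ambiguous = TRUE, ggn can be resolved because gga, ggc, ggg and ggt
# all code for glycine in the standard genetic code.
translate(cds, ambiguous = TRUE)

# Nucleotides matching an IUPAC symbol.
amb("r")
amb("n")

# Complement of a sequence containing an ambiguous base.
comp(s2c("acgn"), ambiguous = TRUE)

# Dipeptide counts on a made-up protein, using a 20-letter amino-acid
# alphabet instead of the default nucleotide alphabet.
aa <- s2c("ACDEFGHIKLMNPQRSTVWY")
prot <- s2c("MKVLAAGIVGLLLA")
dipep <- count(prot, 2, alphabet = aa)
dipep[dipep > 0]

# Escape LaTeX special characters in a vector of strings.
stresc(c("50% of gene_1", "A & B"))
```

The word size is passed to count() by position because only the alphabet
argument is named in this announcement.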
Best,
the seqinR team
http://pbil.univ-lyon1.fr/software/SeqinR/seqinr_accueil.php
--
Simon Penel
Laboratoire de Biometrie et Biologie Evolutive
Bat 711 - CNRS UMR 5558 - Universite Lyon 1
43 bd du 11 novembre 1918 69622 Villeurbanne Cedex
Tel: 04 72 43 29 04 Fax: 04 72 43 13 88
http://pbil.univ-lyon1.fr/members/penel