[R-pkgs] new version of seqinR

Simon Penel penel at biomserv.univ-lyon1.fr
Tue Apr 24 10:28:06 CEST 2007

Dear useRs,

The seqinR package is a library of utilities to retrieve and analyse 
biological sequences.

A new version of seqinR, seqinR 1.0-7, has been released on CRAN.

Here is a summary of changes:

o A new *experimental* function extractseqs() to download sequences
  through zlib-compressed sockets from an ACNUC server has been released.
  Preliminary tests suggest that working with about 100,000 CDS is
  possible with a home ADSL connection. See chapter 3, page 44 of the
  manual for some system.time() examples.

o As pointed out by Emmanuel Prestat, the URL used in dia.bactgensize() was
  no longer available; this has been fixed in the current version.

o As pointed out by Guy Perriere, the function oriloc() was no longer
  compatible with glimmer 3.0 outputs. The function has gained a new
  argument glimmer.version defaulting to 3, but the value 2 is still
  functional for backward compatibility with old glimmer outputs.

o As pointed out by Lionel Guy, there was no default value for the
  as.string argument of getSequence.SeqFastadna(). A default FALSE value
  is now present for backward compatibility with older code.

o New vectorized utility function stresc() to escape LaTeX special
  characters present in a string.

o New low level function readsmj() available.

o A new function readfirstrec() to get the record count of the specified 
  index file is now available.

o Function getType() called without arguments will now use the default 
  database to return available subsequence types.

o Function read.alignment() now also accepts file in addition to File as

o A new function rearranged.oriloc() is available. This method, based on
  oriloc(), can be used to detect the effect of the replication mechanism
  on DNA base composition asymmetry in prokaryotic chromosomes.

o New function extract.breakpoints(), used to extract breakpoints in 
  nucleotide skews. This function uses the segmented package to define the
  position of the breakpoints.

o New function draw.rearranged.oriloc() available, to plot nucleotide skews
  on artificially rearranged prokaryotic chromosomes.

o New function gbk2g2.euk() available. Similarly to gbk2g2(), this function
  extracts the coding sequence annotations from a GenBank format file. This
  function is specifically designed for eukaryotic sequences, i.e. with 
  The output file will contain the coordinates of the exons, along with the
  name of the CDS to which they belong.

o After an e-mail by Marcelo Bertalan on 26 Mar 2007, a bug in oriloc()
  when the gbk argument was NULL was found and fixed by Anamaria Necsulea.

o Functions translate() and getTrans() have gained a new argument NAstring
  to represent untranslatable amino acids, defaulting to the character "X".

o There was a typo for the total number of printed bases in the ACNUC 
  474,439 should be 526,506.

o Function invers() has been deleted.

o Functions translate(), getTrans() and comp() have gained a new argument
  ambiguous, defaulting to FALSE, to handle ambiguous bases. If TRUE,
  ambiguous bases are taken into account so that, for instance, GGN is
  translated to Gly in the standard genetic code.

o New function amb() to return the list of nucleotides matching a given
  nucleotide symbol.

o Function count() has gained a new argument alphabet so that oligopeptide
  counts are now possible. Thanks to Gabriel Valiente for this suggestion.
  The functions zscore(), rho() and summary.SeqFastadna() also have an
  argument alphabet which is forwarded to count().
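
As a minimal sketch of some of the new arguments listed above (ambiguous
and NAstring in translate(), the amb() function, and alphabet in count()),
assuming seqinR is installed. The toy codon and the short protein string
are made up for illustration and do not come from the package data.

```r
library(seqinr)

# A single codon with an ambiguous third base (toy example).
codon <- s2c("ggn")

# Default behaviour (ambiguous = FALSE): the codon is untranslatable.
aa_default <- translate(codon)

# With ambiguous = TRUE, ambiguous bases are taken into account.
aa_ambiguous <- translate(codon, ambiguous = TRUE)

# NAstring changes the symbol used for untranslatable codons.
aa_custom <- translate(codon, NAstring = "?")

# amb() lists the nucleotides matching a given IUPAC symbol.
purines <- amb("r")

# count() with a protein alphabet yields dipeptide counts.
aa_alphabet <- s2c("ACDEFGHIKLMNPQRSTVWY")
protein <- s2c("MKVLAAGIV")   # hypothetical sequence
dipeptides <- count(protein, wordsize = 2, alphabet = aa_alphabet)

print(c(aa_default, aa_ambiguous, aa_custom))
print(purines)
print(dipeptides[dipeptides > 0])
```

The same alphabet value could presumably be passed to zscore(), rho() or
summary.SeqFastadna(), since they forward it to count().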


the seqinR team


Simon Penel
Laboratoire de Biometrie et Biologie Evolutive           
Bat 711  -   CNRS UMR 5558  -    Universite Lyon 1              
43 bd du 11 novembre 1918 69622 Villeurbanne Cedex       
Tel:   04 72 43 29 04      Fax:  04 72 43 13 88
