[R] Importing data coming from Splus into R.

William Dunlap wdunlap at tibco.com
Fri Feb 5 18:37:25 CET 2010


For a data.frame with only numeric and factor
columns, using dump() on the S+ end and source()
on the R end ought to work.  If you have timeDate
columns, you will need to convert them to character
data before exporting and convert them to your
favorite R time/date class after importing them.
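
A rough sketch of that round trip, run entirely in R so it can be
tried without S+.  The data frame `dat`, its column names and the
timestamp format are made up for illustration, and R's own dump()
stands in for the S+ side; the text S+ writes may differ in detail.

```r
# Hypothetical stand-in for the data frame exported from S+.
# The date-time column is already character, as it should be before export.
dat <- data.frame(
  x    = c(1.5, 2.5, NA),
  g    = factor(c("a,b", "c;d", "a,b")),
  when = c("2010-02-05 18:37:25", "2010-02-06 09:00:00", NA),
  stringsAsFactors = FALSE
)

# "S+ end": write the object as source text (R's dump() used as a stand-in).
dump_file <- tempfile(fileext = ".R")
dump("dat", file = dump_file)

# "R end": source() the file into a fresh environment to re-create the object.
restored <- new.env()
source(dump_file, local = restored)
dat2 <- get("dat", envir = restored)

# Convert the character timestamps to a date-time class after importing.
dat2$when <- as.POSIXct(dat2$when, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Report the (first) class of each column.
sapply(dat2, function(col) class(col)[1])
```

Because the factor levels travel inside the dumped text, no delimiter
has to be chosen to avoid the "," and ";" in the levels.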

If you could send me a fairly small sample of your
data that shows the incompatibility between S+'s
write.table and R's read.table I could try to fix
things up so they were more compatible.
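
One thing that may be worth ruling out first: the class check in the
quoted message uses apply(), which converts a data.frame to a matrix
before looking at the columns.  With mixed column types that matrix
is character, so the check can report "character" everywhere even if
the import itself was fine.  A small illustration on a made-up data
frame (`df` and its columns are hypothetical, not the poster's data):

```r
# Toy data frame with one factor, one numeric and one character column.
df <- data.frame(
  f = factor(c("a", "b", "a")),
  n = c(1.5, 2.5, 3.5),
  s = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

# apply() coerces the data frame to a (character) matrix first,
# so each "column" it sees is a character vector.
classes_via_apply <- apply(df, 2, class)

# sapply() visits the columns of the data frame itself
# and so reports the classes they really have.
classes_via_sapply <- sapply(df, class)

print(classes_via_apply)
print(classes_via_sapply)
```

Whether this explains the result in the quoted message is only a
guess without the data, but sapply(FMD.CR.test, class) would be a
cheap check.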

Code that reads the S+ native binary format must
be 32/64 bit aware, since S+ integers are 32 bits
on 32-bit versions of S+ and 64 bits on 64-bit
versions.
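
To illustrate why the reader has to know the integer width, here is
a toy example using writeBin() and readBin() on raw vectors.  It is
NOT the S+ file layout (which is not described here); the helper
`read_ints`, the little-endian byte order and the sample values are
assumptions made only for the illustration.

```r
values <- c(1L, 42L, -7L)

# Hypothetical helper: decode n integers of a given width (in bytes).
read_ints <- function(bytes, n, int_size) {
  readBin(bytes, what = "integer", n = n, size = int_size, endian = "little")
}

# The same values written with 4-byte and with 8-byte integers.
raw32 <- writeBin(values, raw(), size = 4, endian = "little")
raw64 <- writeBin(values, raw(), size = 8, endian = "little")

# Reading with the matching width recovers the values.
ok32 <- read_ints(raw32, n = 3, int_size = 4)
ok64 <- read_ints(raw64, n = 3, int_size = 8)

# Reading 8-byte data as if it were 4-byte data yields twice as many
# numbers, with the high-order halves interleaved among the real values.
wrong <- read_ints(raw64, n = 6, int_size = 4)
```

A reader that assumes one width will silently produce garbage on
files written with the other, which is the sense in which the code
must be 32/64 bit aware.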

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Uwe Ligges
> Sent: Friday, February 05, 2010 8:05 AM
> To: Gerald Jean
> Cc: r-help at r-project.org
> Subject: Re: [R] Importing data coming from Splus into R.
> 
> 1. I am stuck with a copy of S-PLUS 4.x. At that time I used dump() in
> S-PLUS and source() to get things into R afterwards ...
> 
> 2. Why do you think that 32-bit vs. 64-bit issues matter? The file 
> format does not change (well, this is guessed since I do not have any 
> 64-bit S-PLUS version available).
> 
> Best,
> Uwe Ligges
> 
> 
> On 05.02.2010 16:35, gerald.jean at dgag.ca wrote:
> >
> > Hello there,
> >
> > I spent all day yesterday trying to get a small data set from Splus into R,
> > no luck!  Both Splus and R are run on a 64-bit RedHat Linux machine; both
> > are 64-bit versions, as follows:
> >
> > Splus:
> > TIBCO Software Inc. Confidential Information
> > Copyright (c) 1988-2008 TIBCO Software Inc. ALL RIGHTS RESERVED.
> > TIBCO Spotfire S+ Version 8.1.1 for Linux 2.6.9-34.EL, 64-bit : 2008
> >
> > R:
> > R version 2.8.0 (2008-10-20)
> > Copyright (C) 2008 The R Foundation for Statistical Computing
> > ISBN 3-900051-07-0
> >
> > I know that the "foreign" package has a function to directly import Splus
> > data sets into R, but I also know that it works only for 32-bit versions
> > of the software, hence I didn't try that route.  Here is what I have done:
> >
> > In Splus:
> >
> > ttt <- exportData(data = FMD.CR.test,
> >                   file = "/home/jeg002/splus/R/Exemples/R/FMD-CR-test.csv",
> >                   type = "ASCII", delimiter = "@", quote = T,
> >                   na.string = "NA")
> > ttt.class <- unlist(lapply(FMD.CR.test, class))
> >
> > ### I am using "@" as delimiter since some factor levels contain both the
> > ### "," and the ";".
> >
> > In R:
> >
> > FMD.CR.test.fields <- count.fields(file = "/home/jeg002/splus/R/Exemples/R/FMD-CR-test.csv",
> >                                    sep = "@", quote = "\"", comment.char = "")
> > all(FMD.CR.test.fields == 327)
> > [1] TRUE  ## Hence all observations have the same number of fields, so far, so good!
> >
> > FMD.CR.test.classes <- c("factor", "character", "factor", "factor", "factor",
> >                          "factor", "factor", "factor", "factor", "factor",
> >                          "factor", "numeric", "character", and so on)
> > names(FMD.CR.test.classes) <- c("RTA", "police", "mnt.rent.bnct",
> >                                 "mnt.rent.boni", "mnt.rent.cred.bnct",
> >                                 "mnt.rent.epar.bnct", "mnt.rent.snbn",
> >                                 "mnt.rent.trxl", "solde.eop", "solde.nenr.es",
> >                                 "solde.enr.es", "num.enreg", "trouve", and so on)
> > FMD.CR.test <-
> >     read.table(file = "/home/jeg002/splus/R/Exemples/R/FMD-CR-test.csv",
> >                header = TRUE, sep = "@", quote = "\"", as.is = FALSE,
> >                strip.white = FALSE, comment.char = "", na.strings = "NA",
> >                nrows = 65000, colClasses = FMD.CR.test.classes)
> > dim(FMD.CR.test)
> > [1] 64093   327  ## OK
> >
> > ### Testing if classes are the same as the Splus classes.
> >
> > FMD.CR.test.R.classes<- apply(FMD.CR.test, 2, FUN = class)
> > sum(FMD.CR.test.R.classes == FMD.CR.test.classes)
> > [1] 79  ## Not exactly what I was expecting!
> > all(FMD.CR.test.R.classes == "character")
> > [1] TRUE
> >
> > Hence all variables were imported as character, which I find very
> > inconvenient; since the data set has a few hundred factor variables,
> > recoding them is a lot of work, and this work has already been done in Splus;
> > furthermore, the numeric variables would need conversion as well.
> >
> > I tried all combinations of the arguments "as.is", "stringsAsFactors" and
> > "colClasses" to no avail.  I also tried to export the data set in SAS
> > transport format from Splus and read it through foreign's read.xport
> > function, always with the same result: everything is imported as character.
> > I searched the r-help archives and found several messages related to this
> > problem but no satisfactory solution!
> >
> > I am a long-time user of Splus and I am planning to use R more often,
> > mainly due to its wealth of packages and the convenience of installing
> > them.  I hope to find a reliable and user-friendly way of transferring data
> > between the two cousin pieces of software.
> >
> > Thanks for any insights,
> >
> > Gérald Jean
> > Senior Statistical Advisor,
> > VP Planification et Développement des Marchés,
> > Desjardins Groupe d'Assurances Générales
> > telephone : (418) 835-4900 ext. 7639
> > fax       : (418) 835-6657
> > e-mail    : gerald.jean at dgag.ca
> >
> > "In God we trust, all others must bring data"  W. Edwards Deming
> >
> > This communication (and/or the attachments) is intended for named
> > recipients only and may contain privileged or confidential information
> > which is not to be disclosed. If you received this communication by mistake
> > please destroy all copies.
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.