[R] Importing data coming from Splus into R.
William Dunlap
wdunlap at tibco.com
Fri Feb 5 18:37:25 CET 2010
For a data.frame with only numeric and factor
columns, using dump() on the S+ end and source()
on the R end ought to work. If you have timeDate
columns, you will need to convert them to character
data before exporting, and then convert them to your
favorite R time/date class after importing them.
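
A minimal sketch of that round trip, run entirely in R so that it is self-contained. The data frame `df` and its column names are made up for illustration, and R's own dump() stands in for the S+ export step, so any differences in what S+'s dump() writes are not exercised here. The `when` column plays the role of a timeDate column that was already converted to character before export.

```r
# Hypothetical example data: numeric, factor, and a date-time stored as text.
df <- data.frame(
  x    = c(1.5, 2.5, NA),
  g    = factor(c("a", "b", "a")),
  when = c("2010-02-05 18:37:25", "2010-02-04 08:05:00", NA),
  stringsAsFactors = FALSE
)

# "Export" side: write the object as R/S source text.
# (In practice this call would be made in S+; R's dump() is used as a stand-in.)
path <- tempfile(fileext = ".R")
dump("df", file = path)

# "Import" side: drop the original and recreate it from the dumped text.
rm(df)
source(path, local = TRUE)

# Convert the character column to an R date-time class after importing.
# The format string and time zone are assumptions about how the text was written.
df$when <- as.POSIXct(df$when, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

str(df)
```

The factor and numeric columns come back with their classes intact because dump() writes out the object's structure rather than a flat table; only the date-time column needs the extra conversion step.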
If you could send me a fairly small sample of your
data that shows the incompatibility between S+'s
write.table and R's read.table I could try to fix
things up so they were more compatible.
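
For what it is worth, a small sample of that kind could be built and checked along these lines. This is only a sketch: the data are invented, R's write.table stands in for the S+ export, and the `amount` column is a hypothetical name (`RTA` and `police` are borrowed from the original post). It also checks the column classes with sapply() rather than apply(), because apply() first converts a data frame to a matrix, and a data frame with any factor or character column becomes a character matrix, so every column then reports "character" regardless of how it was read.

```r
# Hypothetical sample with factor levels that contain "," and ";".
sample_df <- data.frame(
  RTA    = factor(c("a,b", "c;d", "a,b")),
  police = c("P001", "P002", "P003"),
  amount = c(10.5, NA, 3),
  stringsAsFactors = FALSE
)

# Write with "@" as the delimiter, quoting strings (stand-in for the S+ export).
path <- tempfile(fileext = ".csv")
write.table(sample_df, file = path, sep = "@", quote = TRUE,
            na = "NA", row.names = FALSE)

# Read back, declaring the class wanted for each named column.
classes <- c(RTA = "factor", police = "character", amount = "numeric")
back <- read.table(path, header = TRUE, sep = "@", quote = "\"",
                   comment.char = "", na.strings = "NA",
                   colClasses = classes)

# Per-column classes: sapply() walks the data frame as a list of columns.
sapply(back, class)

# apply() coerces the data frame to a character matrix first,
# so this reports "character" for every column.
apply(back, 2, class)
```

If sapply() shows the expected classes on the real data, the import itself may already be fine and only the check needs changing.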
Code that reads the S+ native binary format must
be 32/64 bit aware, since S+ integers are 32 bits
on 32-bit versions of S+ and 64 bits on 64-bit
versions.
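
To illustrate why the integer width matters to a reader, here is a sketch using R's writeBin() and readBin() on plain raw vectors. It does not reproduce the S+ native format, which is not described here; it only shows that the same integers occupy a different number of bytes at each width, and that reading with the wrong width yields different values. It assumes a platform where R supports 8-byte integers in these functions, and it fixes the byte order to little-endian for the demonstration.

```r
values <- 1:3

# The same integers serialized at 4 and at 8 bytes per element.
bytes4 <- writeBin(values, raw(), size = 4, endian = "little")
bytes8 <- writeBin(values, raw(), size = 8, endian = "little")
length(bytes4)  # 3 elements * 4 bytes
length(bytes8)  # 3 elements * 8 bytes

# Reading with the matching width recovers the values.
read4 <- readBin(bytes4, what = "integer", n = 3, size = 4, endian = "little")
read8 <- readBin(bytes8, what = "integer", n = 3, size = 8, endian = "little")

# Reading the 8-byte stream as if it held 4-byte integers splits each
# value into two elements, so the result no longer matches the input.
misread <- readBin(bytes8, what = "integer", n = 6, size = 4, endian = "little")
misread
```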
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Uwe Ligges
> Sent: Friday, February 05, 2010 8:05 AM
> To: Gerald Jean
> Cc: r-help at r-project.org
> Subject: Re: [R] Importing data coming from Splus into R.
>
> 1. I am stuck with a copy of S-PLUS 4.x. At that time I used dump() in
> S-PLUS and source() to get things into R afterwards ...
>
> 2. Why do you think that 32-bit vs. 64-bit issues matter? The file
> format does not change (well, this is a guess, since I do not have any
> 64-bit S-PLUS version available).
>
> Best,
> Uwe Ligges
>
>
> On 05.02.2010 16:35, gerald.jean at dgag.ca wrote:
> >
> > Hello there,
> >
> > I spent all day yesterday trying to get a small data set from Splus into R,
> > with no luck! Both Splus and R run on a 64-bit RedHat Linux machine; both
> > are 64-bit builds, and the versions are as follows:
> >
> > Splus:
> > TIBCO Software Inc. Confidential Information
> > Copyright (c) 1988-2008 TIBCO Software Inc. ALL RIGHTS RESERVED.
> > TIBCO Spotfire S+ Version 8.1.1 for Linux 2.6.9-34.EL, 64-bit : 2008
> >
> > R:
> > R version 2.8.0 (2008-10-20)
> > Copyright (C) 2008 The R Foundation for Statistical Computing
> > ISBN 3-900051-07-0
> >
> > I know that the "foreign" package has a function to directly import Splus
> > data sets into R, but I also know that it works only for 32-bit
> > versions of the software, hence I didn't try that route. Here is what I
> > have done:
> >
> > In Splus:
> >
> > ttt <- exportData(data = FMD.CR.test,
> >                   file = "/home/jeg002/splus/R/Exemples/R/FMD-CR-test.csv",
> >                   type = "ASCII", delimiter = "@", quote = T,
> >                   na.string = "NA")
> > ttt.class <- unlist(lapply(FMD.CR.test, class))
> >
> > ### I am using "@" as delimiter since some factor levels contain both the
> > ### "," and the ";".
> >
> > In R:
> >
> > FMD.CR.test.fields <-
> >   count.fields(file = "/home/jeg002/splus/R/Exemples/R/FMD-CR-test.csv",
> >                sep = "@", quote = "\"", comment.char = "")
> > all(FMD.CR.test.fields == 327)
> > [1] TRUE ## Hence all observations have the same number of fields, so far,
> >          ## so good!
> >
> > FMD.CR.test.classes <- c("factor", "character", "factor", "factor", "factor",
> >                          "factor", "factor", "factor", "factor", "factor",
> >                          "factor", "numeric", "character", and so on)
> > names(FMD.CR.test.classes) <- c("RTA", "police", "mnt.rent.bnct",
> >                                 "mnt.rent.boni", "mnt.rent.cred.bnct",
> >                                 "mnt.rent.epar.bnct", "mnt.rent.snbn",
> >                                 "mnt.rent.trxl", "solde.eop", "solde.nenr.es",
> >                                 "solde.enr.es", "num.enreg", "trouve", and so on)
> > FMD.CR.test <-
> >   read.table(file = "/home/jeg002/splus/R/Exemples/R/FMD-CR-test.csv",
> >              header = TRUE, sep = "@", quote = "\"", as.is = FALSE,
> >              strip.white = FALSE, comment.char = "", na.strings = "NA",
> >              nrows = 65000, colClasses = FMD.CR.test.classes)
> > dim(FMD.CR.test)
> > [1] 64093 327 ## OK
> >
> > ### Testing if classes are the same as the Splus classes.
> >
> > FMD.CR.test.R.classes<- apply(FMD.CR.test, 2, FUN = class)
> > sum(FMD.CR.test.R.classes == FMD.CR.test.classes)
> > [1] 79 ## Not exactly what I was expecting!
> > all(FMD.CR.test.R.classes == "character")
> > [1] TRUE
> >
> > Hence all variables were imported as character, which I find very
> > inconvenient: since the data set has a few hundred factor variables,
> > recoding them is a lot of work, and this work has already been done in Splus;
> > furthermore, the numeric variables would need conversion as well.
> >
> > I tried all combinations of the arguments "as.is", "stringsAsFactors" and
> > "colClasses" to no avail. I also tried to export the data set in SAS
> > transport format from Splus and read it through the foreign package's
> > read.xport function, always with the same result: everything is imported as
> > character. I searched the r-help archives and found several messages
> > relating to this problem, but no satisfactory solution!
> >
> > I am a long-time user of Splus and I am planning to use R more often,
> > mainly due to its wealth of packages and the convenience of installing
> > them. I hope to find a reliable and user-friendly way of transferring data
> > between the two cousin pieces of software.
> >
> > Thanks for any insights,
> >
> > Gérald Jean
> > Senior Statistical Advisor,
> > VP Planning and Market Development,
> > Desjardins Groupe d'Assurances Générales
> > telephone: (418) 835-4900 ext. 7639
> > fax: (418) 835-6657
> > email: gerald.jean at dgag.ca
> >
> > "In God we trust, all others must bring data" W. Edwards Deming
> >
> > This communication (and/or the attachments) is intended for named
> > recipients only and may contain privileged or confidential information
> > which is not to be disclosed. If you received this communication by mistake
> > please destroy all copies.
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.