[R] Problem Reading SPlus Dump Into R - Spaces Embedded in Data

Peter Dalgaard p.dalgaard at biostat.ku.dk
Fri Dec 30 11:01:58 CET 2005


allan miller <amiller at a2software.com> writes:

> Peter Dalgaard wrote:
> 
> >allan miller <amiller at a2software.com> writes:
> >
> >
> >>Hello,
> >>
> >> I'm trying to source() an SPlus 6.x file created using dump(...,
> >> oldStyle=T) into R (version 2.01), following these
> >> instructions:
> >>
> >>
> >>> *If you have access to S-PLUS, it is usually more reliable to
> >>> |dump| the object(s) in S-PLUS and |source| the dumpfile in R. For
> >>> S-PLUS 5.x and 6.x you may need to use |dump(..., oldStyle=T)|,
> >>> and to read in very large objects it may be preferable to use the
> >>> dumpfile as a batch script rather than use the |source| function.*
> >>>
> >>(from "R Data Import/Export," pg. 15)
> >>
> >>An example:
> >>
> >> > source("testdump")
> >>Error in parse(file, n, text, prompt) : syntax error on line 1895
> >>
> >> where the data on line 1895 - and other lines causing this - have
> >> embedded spaces, such as the following:
> >>
> >>
> >>[line 1895]  Johnson Partners LLC
> >>
> >>
> >> I can't seem to find any options for either the SPlus dump, or R
> >> source(), that relate to this problem.  Any suggestions for how to
> >> either dump or source files containing data with embedded spaces?
> >>
> >
> >A bit more context might be helpful. What's in lines surrounding 1895?
> >Can you show a simple S-PLUS object displaying the behaviour? What
> > happens if you dput() the object? Will S-PLUS itself restore the
> > file?
> Unfortunately, I don't have access to S-PLUS :'( , the S-PLUS file
> dump was provided to me to load in R.  Here are the lines in the
> S-PLUS dump surrounding 1895:
> 
> >    1892 .Label
> >    1893 character
> >    1894 1
> >    1895 Johnson Partners LLC
> >    1896 class
> >    1897 character
> >    1898 1
> >    1899 factor
> >    1900 Protocol
> 
> The problem is with the embedded space (whitespace?) characters in
> line 1895.  If I remove the spaces, i.e., change it to:
> 
> JohnsonPartnersLLC
> 
> the line is successfully loaded, and the next error that comes up is
> another Label with embedded spaces.
> 
> Thanks for your help.

As I suspected, your data are not in the format that you thought they
were. 

turmalin:~/>Splus
S-PLUS : Copyright (c) 1988, 2003 Insightful Corp.
S : Copyright Lucent Technologies, Inc.
Version 6.2.1  for Linux 2.4.18 : 2003
Working data will be in /home/bs/pd/MySwork
> x <- "Johnson Partners LLC"
> dump("x",file="testfile",oldStyle=TRUE)
[1] "testfile"
>
[1]+  Stopped                 Splus
turmalin:~/>cat testfile
"x" <-
"Johnson Partners LLC"

turmalin:~/>fg
Splus

> data.dump("x",file="test2")
>
[1]+  Stopped                 Splus
turmalin:~/>cat test2
## Dump S Version 4 Dump ##
x
character
character
1
Johnson Partners LLC


....

> data.dump("x",oldStyle=TRUE)
>
[2]+  Stopped                 Splus
turmalin:~/>cat dumpdata
x
character
1
Johnson Partners LLC


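So what you have looks like the data.dump() format, probably the
oldStyle=TRUE variant (check line 1: the newer format starts with a
"## Dump S Version 4 Dump ##" header line). Either way, it is quite
different from the dump() format.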

data.restore() from the foreign package can read those if they contain
only basic data objects. For the oldStyle=F format, you seem to be out
of luck.
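
As a rough sketch of the "check line 1" step, going only by the three
transcripts above: dump() output starts with an assignment line,
new-style data.dump() output starts with the "## Dump S Version 4 Dump ##"
header, and oldStyle data.dump() output starts with the bare object
name. The helper name guess_dump_format() is made up for illustration,
and the rules are a heuristic inferred from these examples, not a
specification of the formats.

```r
# Hypothetical helper: guess which S-PLUS dump format a file uses by
# inspecting its first line. Heuristic only, based on the transcripts above.
guess_dump_format <- function(path) {
  first <- readLines(path, n = 1L, warn = FALSE)
  if (length(first) == 0L) {
    return("empty")
  }
  if (startsWith(first, "## Dump S Version 4 Dump ##")) {
    return("data.dump")            # new-style data.dump() header line
  }
  if (grepl("<-[[:space:]]*$", first)) {
    return("dump")                 # e.g. "x" <-  (source()-able R/S code)
  }
  "data.dump oldStyle"             # bare object name on line 1 (a guess)
}

# Possible usage, assuming the file from the original post is "testdump"
# and the foreign package is installed:
# if (guess_dump_format("testdump") == "data.dump oldStyle") {
#   foreign::data.restore("testdump")
# }
```

If the guess comes back as "dump", source() is the right tool; for the
oldStyle data.dump() layout, data.restore() is the one to try.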

(And BTW, R will _parse_ almost any file consisting of lines with just a
single word or a numeric constant. That doesn't mean it can do
anything sensible with it...)
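
To illustrate that last point with a small sketch (the lines are taken
from the excerpt quoted above; the variable names are mine): each
one-word or numeric line parses as a separate expression, and only the
line with embedded spaces trips the parser.

```r
# Lines holding a single word or a numeric constant each parse as one
# expression (a symbol or a number), so no syntax error is raised.
ok_lines <- c(".Label", "character", "1", "JohnsonPartnersLLC")
parsed <- parse(text = ok_lines)
n_parsed <- length(parsed)   # one expression per line

# Three bare words on one line are not valid syntax, so parse() fails.
bad_line <- "Johnson Partners LLC"
parse_failed <- inherits(
  tryCatch(parse(text = bad_line), error = function(e) e),
  "error"
)
```

So source() getting as far as line 1895 says nothing about whether the
earlier lines were understood, only that they happened to be
syntactically valid.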


-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907



