[R] importing from Stata

John Fox jfox at mcmaster.ca
Tue Jan 17 02:42:41 CET 2006

Dear Dimitri,

I don't have a solution for your problem, but factor levels, which you
mention in your comment, aren't the source of it. A factor is stored as an
integer vector with a "levels" attribute (try, e.g., unclassing the factor),
so the level names are stored only once, not repeated for each observation.
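
As a minimal sketch of what unclassing shows (the labels and the variable
names f and codes below are made up for illustration; they are not taken
from your Stata file):

```r
# Hypothetical value labels, standing in for Stata labels
labels <- c("Strongly disagree", "Disagree", "Agree", "Strongly agree")

# A factor with 10,000 observations drawn from those four labels
set.seed(1)
f <- factor(sample(labels, 10000, replace = TRUE), levels = labels)

# unclass() exposes the underlying storage: integer codes plus one
# "levels" attribute that holds each label a single time
codes <- unclass(f)
typeof(f)                # "integer"
attr(codes, "levels")    # the four labels, stored once
head(codes)              # integer codes, one per observation

# Compare the memory used by the factor and by the bare integer codes;
# the difference should be roughly the size of the levels attribute
object.size(f)
object.size(as.integer(f))
```

If the two sizes are close, as I would expect, the labels themselves add
very little to the size of the object.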


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Dimitri Joe
> Sent: Monday, January 16, 2006 4:30 PM
> To: R-Help
> Subject: [R] importing from Stata
> Hi,
> I have a new job, and everyone here uses Stata. I won't give 
> up on R, but I must learn better how to exchange data between 
> the two programs. 
> I am now focusing on importing data from Stata to R, and I 
> must confess that I am a bit disappointed with the read.dta 
> function from the foreign package because it typically happens that
> (i) I get a big R file (for example, a 15Mb Stata file became 
> a 42Mb R file; after cleanup.import() from the Hmisc package, 
> it dropped to 35Mb, but that's still more than 2x the 
> original Stata file) which, in turn, I suspect is due to the fact that
> (ii) factors are created using Stata labels as levels.
> I wonder if
> (i) there isn't a way of forcing each variable to be numeric 
> or integer, maintaining its original values (instead of 
> "Stata labels" as "R levels"). Or,
> (ii) someone has written another function to carry out this task.
> I'd appreciate any suggestions on how to import from Stata to 
> R more efficiently.
> Thanks in advance,
> Dimitri