Variable labels (was Re: [R] Reading SAS version 8 data into

fharrell@virginia.edu fharrell at virginia.edu
Fri Aug 24 13:22:13 CEST 2001


Martyn Plummer wrote:
> 
> On 24-Aug-2001 Prof Brian D Ripley wrote:
> > On Fri, 24 Aug 2001 pauljohn at ukans.edu wrote:
> >
> >> I will try this method to export a sas file, but reading it made
> >> me wonder about "variable labels" and "value labels" in R.  In
> >> SAS and SPSS, the labels are a huge chunk of code and people
> >> want to hang onto them.  In case you have not used data from the
> >> University of Michigan's ICPSR, you might not have seen how
> >> elaborate this can get.  Here's a link to a SAS program that
> >> reads in an ascii dataset. It has thousands of labels:
> >>
> >> http://lark.cc.ukans.edu/~pauljohn/sa2684.gz
> >>
> >> (This is a famous one, the American National Election Study)
> >>
> >> Netscape unzips this and shows it as text on the screen.
> >>
> >> A program like SAS or SPSS will use these labels to beautify
> >> frequencies and such, and I've not heard much in the R group
> >> about it, and I just wondered if you do ever talk about it.
> >
> > Because it's no big deal. Those are factor levels.  R has factors.
> > Whether they get exported from SAS and converted by read.xport I can't say.
> 
> Preserving value labels from SAS datasets is not as easy as it should be.
> 
> SAS value labels are not part of the dataset, but are kept in a separate
> file called a format catalogue. The XPORT engine does not work with SAS
> catalogues, so you need to convert the format catalogue to a SAS database.
> You can do this with the cntlout option in PROC FORMAT. [Conversely the
> cntlin option creates a format catalogue from a database.]

Soon, when a beta of the Hmisc library is available, you will be
able to use sas.get to handle both variable labels and value
labels (i.e., those created by PROC FORMAT that are NOT ranges).
Another option will be the cleanup.import function, which lets
you specify a data frame created by importing the PROC FORMAT
CNTLOUT= dataset into S.  This will link variable labels
with variables in the primary data frame.
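
Until then, the same idea can be sketched in base R.  This is only
a rough illustration and not how sas.get or cleanup.import work
internally.  It assumes the CNTLOUT= dataset has already been
imported as a data frame with its FMTNAME, START and LABEL columns;
the helper name apply_format and all the data values are made up
for the example, and it only covers formats that are not ranges.

```r
# Toy stand-in for an imported PROC FORMAT CNTLOUT= dataset.
# FMTNAME, START and LABEL are CNTLOUT columns; the values are invented.
fmt <- data.frame(
  FMTNAME = c("SEXF", "SEXF", "YESNO", "YESNO"),
  START   = c("1", "2", "0", "1"),
  LABEL   = c("Male", "Female", "No", "Yes"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: recode a vector of codes as a factor,
# using the rows of 'fmt' that belong to one named format.
apply_format <- function(x, fmt, fmtname) {
  rows <- fmt[fmt$FMTNAME == fmtname, ]
  # START is character in CNTLOUT and may be blank-padded, so trim it
  factor(as.character(x), levels = trimws(rows$START), labels = rows$LABEL)
}

# Toy primary data frame holding the raw codes
d <- data.frame(sex = c(1, 2, 2, 1), voted = c(0, 1, 1, NA))
d$sex   <- apply_format(d$sex,   fmt, "SEXF")
d$voted <- apply_format(d$voted, fmt, "YESNO")

table(d$sex)
```

Codes that do not appear in the format (including missing values)
come out as NA, which is worth checking before trusting the result.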

I store variable labels as "label" attributes of vectors
and use them in various plotting functions as well as the
describe() function.
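
A minimal base R sketch of the "label" attribute idea follows.  The
variable, the label text and the helper get_label are invented for
illustration; the helper is not an Hmisc function.

```r
# Attach a descriptive label to a vector via the "label" attribute
age <- c(34, 51, 27)
attr(age, "label") <- "Age at interview (years)"

# Hypothetical helper: return the label, or a default if none is set
get_label <- function(x, default = "") {
  lab <- attr(x, "label")
  if (is.null(lab)) default else lab
}

get_label(age)            # the stored label
get_label(1:3, "none")    # falls back to the default

# A plotting call could use it as an axis title, e.g.
#   plot(age, ylab = get_label(age, "age"))

# Note: plain subsetting drops the attribute on an ordinary vector
is.null(attr(age[1:2], "label"))
```

Because ordinary subsetting discards such attributes, code that
relies on them has to re-attach the label after subsetting or wrap
the vector in a class whose subsetting method preserves it.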

> 
> We use a program called Stat/Transfer to convert between different
> file formats.  Recent versions of Stat/Transfer will preserve SAS value
> labels if you supply a format dataset.  [It  doesn't support R, but you
> can get from SAS to R via Stata]. I suppose that you could get read.xport
> to work the same way ...
> 
> SAS value labels are not quite the same as S factor labels since
> the mapping from values to labels may be many-to-one.  For example, you
> can categorize a continuous variable by supplying ranges of values to be
> given the same label.  The variable is then treated like a categorical
> variable in tabulations, etc. but the underlying values are preserved
> in the dataset and may be recovered by changing the format.
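
In S the closest analogue to a range format is probably to keep the
original variable and derive a factor from it with cut().  A small
sketch, with breaks, labels and data invented for the example:

```r
# Underlying continuous values stay in the data frame
d <- data.frame(age = c(19, 34, 51, 67, 45))

# Hypothetical breaks and labels, playing the role of a SAS range format
age_breaks <- c(0, 30, 50, Inf)
age_labels <- c("Under 30", "30-49", "50 and over")

# right = FALSE gives the intervals [0,30), [30,50), [50,Inf)
d$age_group <- cut(d$age, breaks = age_breaks, labels = age_labels,
                   right = FALSE)

# Tabulate by group; d$age itself is untouched
table(d$age_group)
```

Changing the grouping then means calling cut() again with different
breaks, much as one would swap the format in SAS.
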
> 
> Martyn
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

-- 
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat