[R] Value Labels: SPSS Dataset to R

Yawo Kokuvi yawo1964 using gmail.com
Sat Feb 8 15:35:04 CET 2020


Thanks so much for all your assistance.  I admit R's learning curve is a
bit steep, but I am eager to learn ... and hopefully teach with it.

With regard to my problem, I can now see two options: either declare each
categorical variable as a factor, specifying the needed levels and labels,

OR

use a different function (read_spss), as John has suggested, to import the
file.

I will experiment with both.
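
For the first option, a minimal sketch in base R might look like the
following. The vector `dance` is a made-up stand-in for one imported column
(numeric codes plus a "labels" attribute, as in the dput output quoted below),
not the actual data, and it deliberately carries no haven class so that no
packages are needed.

```r
# Hypothetical stand-in for one imported SPSS column: numeric codes with a
# "labels" attribute mapping label names to codes.
dance <- structure(c(1, 1, 0, 1, 0),
                   labels = c(No = 0, Yes = 1))

# Declare the variable as a factor, giving the levels (codes) and the
# labels (names) explicitly.
lab <- attr(dance, "labels")
dance_f <- factor(as.vector(dance),
                  levels = unname(lab),
                  labels = names(lab))

# The frequency table is now indexed by "No"/"Yes" rather than 0/1.
freq <- table(dance_f)
print(freq)
```

Passing `freq` to barplot() should then label the bars with the value labels
instead of the numeric codes.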

With much appreciation, cY

On Sat, Feb 8, 2020 at 9:25 AM John Kane <jrkrideau using gmail.com> wrote:

> Hi Yawo Kokuvi;
> As an R newbie transitioning from SPSS to R, expect culture shock and the
> possible feeling that your brain is twisting within your skull, but it is
> well worth it.
>
> Try something like this:
> ##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> dat1 <- structure(
>   list(
>     Animal = structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
>       label = "Animal", labels = c(Cat = 0, Dog = 1),
>       class = "haven_labelled"),
>     Training = structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
>       label = "Type of Training",
>       labels = c(`Food as Reward` = 0, `Affection as Reward` = 1),
>       class = "haven_labelled"),
>     Dance = structure(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
>       label = "Did they dance?", labels = c(No = 0, Yes = 1),
>       class = "haven_labelled")),
>   row.names = c(NA, -10L),
>   class = c("tbl_df", "tbl", "data.frame"))
>
>
> library(sjlabelled)
> str(dat1)
> get_labels(dat1)
> barplot(table(as_label(dat1$Dance)))
> ##==================================================================
> Your problem seems to be omitting the as_label().
>
> You do not need to load "haven";
> read_spss() in sjlabelled should do the trick.
>
>
> On Sat, 8 Feb 2020 at 05:44, Rui Barradas <ruipbarradas using sapo.pt> wrote:
>
>> Hello,
>>
>> Try
>>
>> aux_fun <- function(x){
>>    levels <- attr(x, "labels")
>>    factor(x, labels = names(levels), levels = levels)
>> }
>>
>> newCatsDogs <- as.data.frame(lapply(CatsDogs, aux_fun))
>>
>> str(newCatsDogs)
>> #'data.frame':  10 obs. of  3 variables:
>> # $ Animal  : Factor w/ 2 levels "Cat","Dog": 1 1 1 1 1 1 1 1 1 1
>> # $ Training: Factor w/ 2 levels "Food as Reward",..: 1 1 1 1 1 1 1 1 1 1
>> # $ Dance   : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2
>>
>>
>> As for the
>>   - frequencies: ?table, ?tapply, ?aggregate,
>>   - barplots: ?barplot
>>
>> You can find lots and lots of examples online of both, covering what
>> seem to be simple use cases.
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>> On 08/02/20 at 06:03, Yawo Kokuvi wrote:
>> > Thanks to all. Here is the output from dput.  I used a different dataset
>> > containing categorical variables since the previous one is on a different
>> > computer.
>> >
>> > In the following dataset, my interest is in getting frequencies and
>> > barplots for the two variables: Training and Dance, with value labels
>> > displayed.
>> >
>> > thanks again - cY
>> >
>> >
>> > =========
>> > dput(head(CatsDogs, n = 10))
>> > structure(
>> >    list(
>> >      Animal = structure(
>> >        c(0, 0, 0, 0, 0, 0, 0, 0, 0,
>> >          0),
>> >        label = "Animal",
>> >        labels = c(Cat = 0, Dog = 1),
>> >        class = "haven_labelled"
>> >      ),
>> >      Training = structure(
>> >        c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
>> >        label = "Type of Training",
>> >        labels = c(`Food as Reward` = 0,
>> >                   `Affection as Reward` = 1),
>> >        class = "haven_labelled"
>> >      ),
>> >      Dance = structure(
>> >        c(1,
>> >          1, 1, 1, 1, 1, 1, 1, 1, 1),
>> >        label = "Did they dance?",
>> >        labels = c(No = 0,
>> >                   Yes = 1),
>> >        class = "haven_labelled"
>> >      )
>> >    ),
>> >    row.names = c(NA,-10L),
>> >    class = c("tbl_df", "tbl", "data.frame")
>> > )
>> >
>> >
>> > On Fri, Feb 7, 2020 at 10:14 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>> >
>> >> Yes. Most attachments are stripped by the server.
>> >>
>> >> Bert Gunter
>> >>
>> >> "The trouble with having an open mind is that people keep coming along and
>> >> sticking things into it."
>> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>> >>
>> >>
>> >> On Fri, Feb 7, 2020 at 5:34 PM John Kane <jrkrideau using gmail.com> wrote:
>> >>
>> >>> Hi,
>> >>> Could you upload some sample data in dput form?  Something like
>> >>> dput(head(Scratch, n=13)) will give us some real data to examine. Just copy
>> >>> and paste the output of dput(head(Scratch, n=13)) into the email. This is
>> >>> the best way to ensure that R-help denizens are getting the data in the
>> >>> exact format that you have.
>> >>>
>> >>> On Fri, 7 Feb 2020 at 15:32, Yawo Kokuvi <yawo1964 using gmail.com> wrote:
>> >>>
>> >>>> Thanks for all your assistance
>> >>>>
>> >>>> Attached, please find the Rdata file (Scratch) I have been using.
>> >>>>
>> >>>> -----------------------------------------------------
>> >>>>
>> >>>>> head(Scratch, n=13)
>> >>>> # A tibble: 13 x 6
>> >>>>        ID           marital        sex      race    paeduc    speduc
>> >>>>     <dbl>         <dbl+lbl>  <dbl+lbl> <dbl+lbl> <dbl+lbl> <dbl+lbl>
>> >>>>   1     1 3 [DIVORCED]      1 [MALE]   1 [WHITE]        NA        NA
>> >>>>   2     2 1 [MARRIED]       1 [MALE]   1 [WHITE]        NA        NA
>> >>>>   3     3 3 [DIVORCED]      1 [MALE]   1 [WHITE]         4        NA
>> >>>>   4     4 4 [SEPARATED]     1 [MALE]   1 [WHITE]        16        NA
>> >>>>   5     5 3 [DIVORCED]      1 [MALE]   1 [WHITE]        18        NA
>> >>>>   6     6 1 [MARRIED]       2 [FEMALE] 1 [WHITE]        14        20
>> >>>>   7     7 1 [MARRIED]       2 [FEMALE] 2 [BLACK]        NA        12
>> >>>>   8     8 1 [MARRIED]       2 [FEMALE] 1 [WHITE]        NA        12
>> >>>>   9     9 3 [DIVORCED]      2 [FEMALE] 1 [WHITE]        11        NA
>> >>>> 10    10 1 [MARRIED]       2 [FEMALE] 1 [WHITE]        16        12
>> >>>> 11    11 5 [NEVER MARRIED] 2 [FEMALE] 2 [BLACK]        NA        NA
>> >>>> 12    12 3 [DIVORCED]      2 [FEMALE] 2 [BLACK]        NA        NA
>> >>>> 13    13 3 [DIVORCED]      2 [FEMALE] 2 [BLACK]        16        NA
>> >>>>
>> >>>> -----------------------------------------------------
>> >>>>
>> >>>> and below is my script/command file.
>> >>>>
>> >>>> *#1: Load library and import SPSS dataset*
>> >>>> library(haven)
>> >>>> Scratch <- read_sav("~/Desktop/Scratch.sav")
>> >>>>
>> >>>> *#2: save the dataset with a name*
>> >>>> save(Scratch, file="Scratch.Rdata")
>> >>>>
>> >>>> *#3: install & load necessary packages for descriptive statistics*
>> >>>> install.packages ("freqdist")
>> >>>> library (freqdist)
>> >>>>
>> >>>> install.packages ("sjlabelled")
>> >>>> library (sjlabelled)
>> >>>>
>> >>>> install.packages ("labelled")
>> >>>> library (labelled)
>> >>>>
>> >>>> install.packages ("surveytoolbox")
>> >>>> library (surveytoolbox)
>> >>>>
>> >>>> *#4: Check the value labels of gender and marital status*
>> >>>> Scratch$sex %>% attr('labels')
>> >>>> Scratch$marital %>% attr('labels')
>> >>>>
>> >>>> *#5:  Frequency Distribution and BarChart for Categorical/Ordinal Level
>> >>>> Variables such as Gender - SEX*
>> >>>> freqdist(Scratch$sex)
>> >>>> barplot(table(Scratch$marital))
>> >>>>
>> >>>> -----------------------------------------------------
>> >>>>
>> >>>> As you can see from above, I use the <haven> package to import the data
>> >>>> from SPSS.  Apparently, the haven function keeps the value labels, as the
>> >>>> attr() calls in section #4 of my script show.
>> >>>> The problem is that when I run a frequency distribution for any of the
>> >>>> categorical variables like sex or marital status, only the numbers (1, 2)
>> >>>> are displayed in the output.  The labels (male, female), for example, are
>> >>>> not.
>> >>>>
>> >>>> Is there any way to force these to be shown in the output?  Is there a
>> >>>> global property that I have to set so that these value labels are reliably
>> >>>> displayed with every output?  I read I can declare them as factors using
>> >>>> <as_factor()>, but once I do so, how do I invoke them in my commands so
>> >>>> that the value labels show...
>> >>>>
>> >>>> Sorry about all the noob questions, but hopefully I will be able to get this
>> >>>> working.
>> >>>>
>> >>>> Thanks in advance.
>> >>>>
>> >>>>
>> >>>> Thanks - cY
>> >>>>
>> >>>>
>> >>>> On Fri, Feb 7, 2020 at 1:14 PM <cpolwart using chemo.org.uk> wrote:
>> >>>>
>> >>>>> I've never used it, but there is a labels function in haven...
>> >>>>>
>> >>>>> On 7 Feb 2020 17:05, Bert Gunter <bgunter.4567 using gmail.com> wrote:
>> >>>>>
>> >>>>> What does your data look like after importing? -- see ?head and ?str to
>> >>>>> tell us. Show us the code that failed to provide "labels." See the posting
>> >>>>> guide below for how to post questions that are likely to elicit helpful
>> >>>>> responses.
>> >>>>>
>> >>>>> I know nothing about the haven package, but see ?factor or go through an R
>> >>>>> tutorial or two to learn about factors, which may be part of the issue
>> >>>>> here. R *generally* obtains whatever "label" info it needs from the object
>> >>>>> being tabled -- see ?tabulate, ?table etc. -- if that's what you're doing.
>> >>>>>
>> >>>>> Bert Gunter
>> >>>>>
>> >>>>> "The trouble with having an open mind is that people keep coming along and
>> >>>>> sticking things into it."
>> >>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>> >>>>>
>> >>>>>
>> >>>>> On Fri, Feb 7, 2020 at 8:28 AM Yawo Kokuvi <yawo1964 using gmail.com> wrote:
>> >>>>>
>> >>>>>> Hello,
>> >>>>>>
>> >>>>>> I am just transitioning from SPSS to R.
>> >>>>>>
>> >>>>>> I used the haven library to import some of my spss data files to R.
>> >>>>>>
>> >>>>>> However, when I run procedures such as frequencies or crosstabs, value
>> >>>>>> labels for categorical variables such as gender (1=male, 2=female) are not
>> >>>>>> shown. The same applies to many other outputs.
>> >>>>>>
>> >>>>>> I am confused.
>> >>>>>>
>> >>>>>> 1. Is there a global setting that I can use to force all categorical
>> >>>>>> variables to display labels?
>> >>>>>>
>> >>>>>> 2. Or, are these labels to be set for each function or package?
>> >>>>>>
>> >>>>>> 3. How can I request the value labels for each function I run?
>> >>>>>>
>> >>>>>> Thanks in advance for your help..
>> >>>>>>
>> >>>>>> Best, Yawo
>> >>>>>>
>> >>>>>>          [[alternative HTML version deleted]]
>> >>>>>>
>> >>>>>> ______________________________________________
>> >>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>>>>> PLEASE do read the posting guide
>> >>>>>> http://www.R-project.org/posting-guide.html
>> >>>>>> and provide commented, minimal, self-contained, reproducible code.
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>>
>> >>> --
>> >>> John Kane
>> >>> Kingston ON Canada
>> >>
>> >
>>
>
>
> --
> John Kane
> Kingston ON Canada
>
