[R] creating a scale (factor) based on a continuous variable nested within levels of factor
hind lazrak
hindstata at gmail.com
Sun Nov 7 17:29:50 CET 2010
Hello Dennis and r-helpers
Thank you very much for your reply.
The problem is solved now, even though I don't see why the command that I
had posted as an alternative solution did not work...
hDatPretty$liking <- by(hDatPretty$rating, hDatPretty$songId, function (z) {
    cut(hDatPretty$z, c(-10, -4, 4, 10),
        labels = c('dislike', 'neutral', 'like'))})
Hind
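
A likely explanation for why the by() call fails, sketched on a small made-up data frame (the name toy and its values are assumptions for illustration, not the poster's data). Inside the anonymous function, z already holds the ratings for one song, whereas hDatPretty$z asks for a column named z. No such column exists, so the expression evaluates to NULL, and cut() then complains that 'x' must be numeric. Even with that fixed, by() returns one result per song rather than one value per row, so it is not a natural fit for a new column. Since cut() is vectorised, no grouping appears to be needed here.

```r
# Toy stand-in for hDatPretty (values invented for illustration only)
toy <- data.frame(
  rating = c(4, 4, 3, 3, -7, -7, 10, 10),
  songId = factor(c(1, 1, 2, 2, 3, 3, 4, 4))
)

# The data frame has no column called 'z', so toy$z is NULL; a NULL like
# this is what cut() received in the failing call.
is.null(toy$z)

# Using the function argument itself works, but the result is a 'by'
# object holding one factor per song, not one value per row.
perSong <- by(toy$rating, toy$songId, function(z) {
  cut(z, breaks = c(-11, -4, 4, 11),
      labels = c("dislike", "neutral", "like"))
})
length(perSong)  # one element per song, not one per observation

# cut() is vectorised, so it can be applied to the whole column at once.
toy$liking <- cut(toy$rating, breaks = c(-11, -4, 4, 11),
                  labels = c("dislike", "neutral", "like"))
table(toy$songId, toy$liking)
```

The last call mirrors the cut() suggestion quoted below; because each song has a single rating, every row of a song receives the same label.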
On Sun, Nov 7, 2010 at 1:45 AM, Dennis Murphy <djmuser at gmail.com> wrote:
> Hi:
>
> If I get your meaning, the cut() function would appear to be your friend in
> this problem.
>
> hDatPretty$liking <- cut(hDatPretty$rating, breaks = c(-11, -4, 4, 11),
>                          labels = c('dislike', 'neutral', 'like'))
>
> HTH,
> Dennis
>
> On Sat, Nov 6, 2010 at 11:15 PM, hind lazrak <hindstata at gmail.com> wrote:
>>
>> Hello R-helpers
>>
>>
>> I hope that my subject line is not deterring anyone from helping me out :)
>> I have been stuck for a few hours now, and I can't seem to pinpoint
>> where the problem is.
>>
>>
>> I have a data.frame which is structured as follows:
>> str(hDatPretty)
>> 'data.frame': 1665 obs. of 8 variables:
>> $ time : num 0 1.02 2.05 3.07 4.09 ...
>> $ hr : num 62.4 63.6 64.6 65.5 66.2 ...
>> $ emg : num 3.3 3.42 3.52 3.57 3.6 ...
>> $ respRate: num 50.4 50.6 50.7 50.8 50.9 ...
>> $ scr : num 1.7 1.72 1.73 1.74 1.75 ...
>> $ skinTemp: num 28.1 28.2 28.2 28.2 28.2 ...
>> $ rating : num 4 4 4 4 4 4 4 4 4 4 ...
>> $ songId : Factor w/ 37 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
>>
>> It consists of ratings ($rating) given by people (here the id variable
>> is not indicated as this is a subset with only one person) for each of
>> the 37 songs ($songId) they listen to.
>> While they are listening we measure physiological responses (emg,
>> hr,...) every second over a period of 45 seconds.
>> Here's a quick peek at the data
>> head(hDatPretty)
>>
>>         time       hr      emg respRate      scr skinTemp rating songId
>> 1.1 0.000000 62.42135 3.300562 50.40538 1.703105 28.14489      4      1
>> 1.2 1.022727 63.59057 3.424884 50.59292 1.718110 28.16189      4      1
>> 1.3 2.045455 64.59840 3.515219 50.73523 1.730594 28.17836      4      1
>> 1.4 3.068182 65.47707 3.573151 50.83909 1.740594 28.19422      4      1
>> 1.5 4.090909 66.22192 3.597183 50.90466 1.748086 28.20948      4      1
>> 1.6 5.113636 66.89209 3.588530 50.91911 1.753385 28.22414      4      1
>>
>> So, every study participant gives one rating (from -10 to 10) for each
>> song.
>> If we tabulate the data, this is what we have (for the first 10 songs):
>> table(hDatPretty$songId, hDatPretty$rating)
>>
>>
>>     -10  -9  -7  -3   0   1   3   4   5   7   8   9  10
>>  1    0   0   0   0   0   0   0  45   0   0   0   0   0  # song 1 gets a score of 4
>>  2    0   0   0   0   0   0  45   0   0   0   0   0   0  # song 2 gets a score of 3
>>  3    0   0  45   0   0   0   0   0   0   0   0   0   0  #.
>>  4    0  45   0   0   0   0   0   0   0   0   0   0   0
>>  5    0   0   0   0   0   0   0   0   0  45   0   0   0
>>  6    0   0   0   0   0   0   0   0   0   0   0   0  45
>>  7    0   0   0   0   0   0   0   0   0   0  45   0   0  # song 7 gets a score of 8
>>  8    0   0   0  45   0   0   0   0   0   0   0   0   0
>>  9    0   0   0   0   0   0   0  45   0   0   0   0   0
>> 10    0   0   0   0   0  45   0   0   0   0   0   0   0
>>
>> What I would like to do is to create another scale (a factor) based
>> on the ratings, with the following levels:
>> -10;-4 == dislike where -4 is included
>> -4;4 == neutral where -4 is excluded
>> 4;10 == like where 4 is excluded
>>
>> My code to obtain this new variable
>>
>> liking <- numeric(length(hDatPretty$rating))
>> liking[hDatPretty$rating <= -4] <- 'dislike'
>> liking[hDatPretty$rating > -4 & hDatPretty$rating <= 4] <- 'neutral'
>> liking[hDatPretty$rating > 4] <- 'like'
>>
>> hDatPretty['liking']<- factor(liking)
>>
>> The problem that I have is that, for some reason, it assigns
>> different values to the same rating for some songs but not all (?)
>> See for example
>>
>>    dislike like neutral
>>  1       0    8      37  ## Here is one problem, where song #1 gets two 'liking' scores while the rating is constant
>>  2       0    0      45
>>  3      45    0       0
>>  4      45    0       0
>>  5       0   45       0
>>  6       0   45       0
>>  7       0   45       0
>>  8       0    0      45
>>  9       0   10      35  ## here is a similar problem
>>
>> Could you PLEASE help me with the proper code to obtain my 'liking'
>> variable for each of the songs, based on the rating each song gets?
>>
>> Many thanks.
>>
>>
>> Hind
>> p.s.: I have also tried cut() in the code as follows... unsuccessfully
>>
>> hDatPretty$liking <- by(hDatPretty$rating, hDatPretty$songId,
>>     function (z) { cut(hDatPretty$z, c(-10, -4, 4, 10),
>>                        labels = c('dislike', 'neutral', 'like'))})
>>
>> Error in cut.default(hDatPretty$z, c(-10, -4, 4, 10), labels = c("dislike",  :
>>   'x' must be numeric
>>
>> again thank you.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
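
On the interval boundaries asked for in the original question (dislike includes -4, neutral includes 4): by default cut() builds intervals that are open on the left and closed on the right, which matches those rules. One consequence is that with breaks starting at exactly -10, a rating of -10 falls outside the first interval and becomes NA, which is presumably why the suggested breaks run from -11 to 11. The sketch below checks this on a few boundary values; the vector ratings and the helper name lab are assumptions introduced for illustration.

```r
# Boundary values chosen to probe the interval edges (illustrative only)
ratings <- c(-10, -4, -3, 4, 5, 10)
lab <- c("dislike", "neutral", "like")

# Breaks at -10 and 10: intervals are (-10,-4], (-4,4], (4,10],
# so -10 itself is not covered and comes back as NA.
strict <- cut(ratings, breaks = c(-10, -4, 4, 10), labels = lab)
is.na(strict)

# Two ways to cover the lowest value:
# 1. close the first interval on the left
closed <- cut(ratings, breaks = c(-10, -4, 4, 10), labels = lab,
              include.lowest = TRUE)
# 2. widen the outer breaks, as in the reply quoted above
wide <- cut(ratings, breaks = c(-11, -4, 4, 11), labels = lab)

# Both variants should give the same labels for ratings in [-10, 10]
identical(closed, wide)
data.frame(ratings, strict, closed, wide)
```

Either variant labels -4 as dislike and 4 as neutral, in line with the rules stated in the question.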