[R] creating variable that codes for the match/mismatch between two other variables

PIKAL Petr petr.pikal at precheza.cz
Mon Feb 25 16:30:11 CET 2013


Hi

> -----Original Message-----
> From: Jonas Walter [mailto:jonas.walter at student.uni-tuebingen.de]
> Sent: Monday, February 25, 2013 4:25 PM
> To: PIKAL Petr
> Cc: r-help at r-project.org
> Subject: RE: [R] creating variable that codes for the match/mismatch
> between two other variables
> 
> Hi Petr,
> 
> oh, that's really much easier than the way I did it. Thanks for the
> hint!
> 
> The problem with "no prediction" is that these cases are already coded
> within the "Prediction" variable.
> 
> 0 codes "no prediction required" while 1 and 2 code for different
> predictions. Therefore, there are no NAs within this variable.
> 
> By applying the procedure suggested by you, I would receive 0-coding
> for both trials with wrong predictions and trials without any
> predictions.
> 
> But probably I can change coding within the Prediction-variable prior
> to applying your procedure.

Recode the zeros to NA first:

mydat$Prediction[mydat$Prediction==0] <- NA

After that you will get 0 for a wrong prediction and NA where no prediction was required.
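
A minimal sketch of the recoding combined with the vectorized comparison. The toy values below are made up for illustration and are not taken from the real mydat:

```r
# Toy data: illustrative values only, not the real mydat
mydat <- data.frame(
  Stimulus   = c(2L, 1L, 2L, 1L, 1L),
  Prediction = c(0L, 1L, 1L, 0L, 1L)   # 0 = no prediction required
)

# Recode "no prediction required" from 0 to NA
mydat$Prediction[mydat$Prediction == 0] <- NA

# Vectorized comparison: TRUE/FALSE times 1 gives 1/0, and NA stays NA
mydat$match <- (mydat$Stimulus == mydat$Prediction) * 1

print(mydat$match)
```

If the code 7 from the original legend is still wanted for trials without a prediction, the NA entries can be replaced afterwards with mydat$match[is.na(mydat$match)] <- 7.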

Regards
Petr

> 
> Thanks again!
> 
> Best,
> Jonas
> 
> 
> 
> 
> 
> Zitat von PIKAL Petr <petr.pikal at precheza.cz>:
> 
> > Hi
> >
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> >> project.org] On Behalf Of Jonas Walter
> >> Sent: Monday, February 25, 2013 2:38 PM
> >> To: r-help at r-project.org
> >> Subject: [R] creating variable that codes for the match/mismatch
> >> between two other variables
> >>
> >>
> >>
> >> Dear all,
> >>
> >> I have two vectors coding the stimulus presented in the current
> >> trial (mydat$Stimulus) and the prediction made in the same trial
> >> (mydat$Prediction), respectively.
> >> Using an if-conditional, I want to create a new vector that
> >> indicates whether the two vectors match in a given trial,
> >> that is, whether the prediction equals the stimulus.
> >>
> >> When I pick out some trials at random, I get some trials with no
> >> match (mydat$Stimulus[1] != mydat$Prediction[1]) as well as some
> >> trials with a match (mydat$Stimulus[1] == mydat$Prediction[1]).
> >>
> >> However, if I apply the following code, every trial is coded as a
> >> match. Why? What am I doing wrong?
> >>
> >> In some blocks, there was no prediction recorded. Therefore, I want
> >> those trials to be labeled differently [that is, match = 7].
> >>
> >> Coding-legend:
> >>
> >> 1 = match
> >> 0 = no match
> >> 7 = no prediction recorded
> >>
> >> The code:
> >>
> >> # create variable that codes match/mismatch of prediction vs. stimulus
> >>
> >> mydat$match <- 0
> >>
> >> for (i in seq_along(1:nrow(mydat))) {
> >>         # if there is a match, mydat$match[i] = 1
> >>         if (mydat$Stimulus[i] == mydat$Prediction[i]) {
> >>                 mydat$match = 1
> >>         # the next two conditions refer to blocks without prediction recording.
> >>         # Therefore, the corresponding trials are coded with mydat$match[i] = 7.
> >>         } else if (mydat$BlockOrder[i] == 1 & mydat$Block_nr[i] == 1) {
> >>                 mydat$match = 7
> >>         } else if (mydat$BlockOrder[i] == 2 & mydat$Block_nr[i] == 4) {
> >>                 mydat$match == 7
> >>         }
> >> }
> >
> > Well, why so complicated?
> >
> > (mydat$Stimulus == mydat$Prediction)*1
> >
> > gives you a vector with 1 where there is a match and 0 where there is
> > no match.
> >
> > I do not understand your "no prediction" case, though. How is "no
> > prediction" coded? If it is NA, the resulting vector will have NA in
> > the corresponding positions too.
> >
> > Regards
> > Petr
> >
> >
> >>
> >> # The corresponding dataframe structure:
> >>
> >> str(mydat)
> >> 'data.frame':   9302 obs. of  18 variables:
> >> $ BlockOrder       : int  1 1 1 1 1 1 1 1 1 1 ...
> >> $ Block_nr         : num  1 1 1 1 1 1 1 1 1 1 ...
> >> $ Trial_nr         : int  1 2 3 4 5 6 7 8 9 10 ...
> >> $ PreSeq.Length    : int  1 2 2 1 1 2 0 2 2 2 ...
> >> $ PreSeq           : int  21 12 21 20 20 12 0 21 22 11 ...
> >> $ Sequence         : int  121111 121212 121111 121111 112212 121221 121111 121111 122112 121111 ...
> >> $ Category         : int  2 1 3 2 1 1 3 3 1 3 ...
> >> $ FixCross.Latency : int  1429 1043 1093 1297 1155 1449 1140 1396 1341 1427 ...
> >> $ Stimulus         : int  2 1 2 2 1 1 1 1 2 1 ...
> >> $ RT               : int  333 275 378 428 442 388 340 394 414 542 ...
> >> $ RT.Button_pressed: int  2 1 2 2 1 1 1 1 2 1 ...
> >> $ RT.Accuracy      : int  1 1 1 1 1 1 1 1 1 1 ...
> >> $ Prediction       : int  0 0 0 0 0 0 0 0 0 0 ...
> >> $ Confidence       : int  0 0 0 0 0 0 0 0 0 0 ...
> >> $ ITI              : int  1053 1182 1467 1431 1103 1170 1232 1393 1356 1495 ...
> >> $ Subject          : num  4 4 4 4 4 4 4 4 4 4 ...
> >> $ ITruns           : num  0 0 0 1 0 1 2 3 0 0 ...
> >> $ match            : num  1 1 1 1 1 1 1 1 1 1 ...
> >>
> >> # mydat$match, the new variable, only contains ones.
> >>
> >> min(mydat$match)
> >> [1] 1
> >> > max(mydat$match)
> >> [1] 1
> >>
> >> # example: row 1699: no match Stimulus - Prediction
> >>
> >> mydat$Stimulus[1699] == mydat$Prediction[1699] # [1] FALSE
> >>
> >> # but:
> >>
> >> mydat$match[1699]
> >> # [1] 1
> >>
> >> How can I get the right coding? Where is the mistake?
> >>
> >> Thanks!
> >>
> >> Best,
> >> Jonas Walter
> >>
> >>
> >
> 
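
A note on why the loop quoted above can code every trial as 1: an assignment such as mydat$match = 1 without [i] replaces the entire column, so the column ends up holding whatever value was assigned last, and mydat$match == 7 is a comparison whose result is discarded, not an assignment. A small sketch on made-up data (the toy values are assumptions for illustration only):

```r
# Toy data: illustrative values only, not the real mydat
mydat <- data.frame(
  Stimulus   = c(2L, 1L, 2L, 1L),
  Prediction = c(1L, 1L, 2L, 2L)
)

# Without [i] the assignment replaces the entire column on every match
mydat$match <- 0
for (i in seq_len(nrow(mydat))) {
  if (mydat$Stimulus[i] == mydat$Prediction[i]) {
    mydat$match <- 1
  }
}
whole_column <- mydat$match

# With [i] only the current trial is changed
mydat$match <- 0
for (i in seq_len(nrow(mydat))) {
  if (mydat$Stimulus[i] == mydat$Prediction[i]) {
    mydat$match[i] <- 1
  }
}
per_row <- mydat$match

print(whole_column)
print(per_row)
```

The indexed loop gives the same result as the vectorized comparison, which remains the shorter way to write it.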


