[R] Conditional looping over a set of variables in R
David Herzberg
davidh at wpspublish.com
Wed Oct 27 15:21:50 CEST 2010
Peter, thanks for this elegant solution, which works well and handles the empty cases. However, the vector it returns includes both the row (case) numbers and the target result (the column number of the first "1"). How can I strip out the row numbers and leave only the target result?
Regards,
David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: davidh at wpspublish.com
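A minimal sketch of one way to drop the row labels, assuming the "row numbers" are the names that apply() copies from the row names of the data. The matrix `scores` and its row names are made up for illustration; unname() is base R.

```r
# Hypothetical data: three cases, four items, with explicit row names
scores <- matrix(c(0, 0, 1, 1,
                   NA, NA, NA, NA,
                   1, 0, 1, 0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("case1", "case2", "case3"), NULL))

# apply() carries the row names over as names of the result
first_one <- apply(scores, 1, match, x = 1)
names(first_one)   # "case1" "case2" "case3"

# unname() returns the same values without the names attribute
plain <- unname(first_one)
plain              # 3 NA 1
```

The values are untouched; only the labels that print above them are removed.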
-----Original Message-----
From: Peter Ehlers [mailto:ehlers at ucalgary.ca]
Sent: Tuesday, October 26, 2010 9:23 AM
To: David Herzberg
Cc: Petr PIKAL; r-help at r-project.org
Subject: Re: [R] Conditional looping over a set of variables in R
I would still recommend
vector_of_column_number <- apply(yourdata, 1, match, x=1)
as the simplest way if you only want the number of the column that has the first 1 or "1" (the call works as is for both numeric and character data). Rows which have no 1s will return a value of NA.
Anything wrong with it?
-Peter Ehlers
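The claim that the call works for both numeric and character data can be checked on a small example. This is only a sketch: `num_scores` and `chr_scores` are made-up matrices, and the "." code for missing values is taken from the original question further down the thread.

```r
# Hypothetical numeric data with NA for missing responses
num_scores <- rbind(c(0, 1, 1),
                    c(NA, NA, NA),
                    c(0, 0, 1))

# The same data stored as character, with "." for missing responses
chr_scores <- rbind(c("0", "1", "1"),
                    c(".", ".", "."),
                    c("0", "0", "1"))

# match() coerces x = 1 to the type of each row, so both calls agree
from_num <- apply(num_scores, 1, match, x = 1)
from_chr <- apply(chr_scores, 1, match, x = 1)

from_num   # 2 NA 3
from_chr   # 2 NA 3
```

Rows with no 1 come back as NA in both cases, so no special handling of empty cases is needed.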
On 2010-10-26 07:50, David Herzberg wrote:
>
> Thank you - I will try this solution as well.
>
> Sent via DROID X
>
>
> -----Original message-----
> From: Petr PIKAL <petr.pikal at precheza.cz>
> To: David Herzberg <davidh at wpspublish.com>
> Cc: Adrienne Wootten <amwootte at ncsu.edu>, "r-help at r-project.org" <r-help at r-project.org>
> Sent: Tue, Oct 26, 2010 06:43:09 GMT+00:00
> Subject: Re: [R] Conditional looping over a set of variables in R
>
> Hi
>
> r-help-bounces at r-project.org wrote on 25.10.2010 20:41:55:
>
>> Adrienne, there's one glitch when I implement your solution below. When
>> the loop encounters a case with no data at all (that is, all 140 item
>> responses are missing), it aborts and prints this error message:
>> "ERROR: argument is of length zero".
>>
>> I wonder if there's a logical condition I could add that would enable R
>> to skip these empty cases and continue executing on the next case that
>> contains data.
>>
>> Thanks, Dave
>>
>> David S. Herzberg, Ph.D.
>> Vice President, Research and Development
>> Western Psychological Services
>> 12031 Wilshire Blvd.
>> Los Angeles, CA 90025-1251
>> Phone: (310)478-2061 x144
>> FAX: (310)478-7838
>> email: davidh at wpspublish.com
>>
>>
>>
>> From: wootten.adrienne at gmail.com [mailto:wootten.adrienne at gmail.com] On Behalf Of Adrienne Wootten
>> Sent: Friday, October 22, 2010 9:09 AM
>> To: David Herzberg
>> Cc: r-help at r-project.org
>> Subject: Re: [R] Conditional looping over a set of variables in R
>>
>> David,
>>
>> Here I'm referring to your data as testmat, a matrix of 140 columns and
>> 1500 rows, but the same or similar notation can be applied to data
>> frames in R. If I understand correctly, you are looking for the first
>> response (column) where you got a value of 1. I'm assuming also that
>> since your missing values are characters then your two numeric values
>> are also characters. Keeping all this in mind, try something like this.
>
> If you really only want to know which column in each row has the first
> occurrence of 1 (or any other value), you can get rid of looping and
> use other R capabilities.
>
>> set.seed(111)
>> mat<-matrix(sample(1:3, 20, replace=T),5,4)
>> mat
>      [,1] [,2] [,3] [,4]
> [1,]    2    2    2    2
> [2,]    3    1    2    1
> [3,]    2    2    1    3
> [4,]    2    2    1    1
> [5,]    2    1    1    2
>> mat.w<-which(mat==1, arr.ind=T)
>> tapply(mat.w[,2], mat.w[,1], min)
> 2 3 4 5
> 2 3 3 2
>> mat[2, ]<-NA
>> mat
>      [,1] [,2] [,3] [,4]
> [1,]    2    2    2    2
> [2,]   NA   NA   NA   NA
> [3,]    2    2    1    3
> [4,]    2    2    1    1
> [5,]    2    1    1    2
>
> This approach also works smoothly with NA values:
>
>> mat.w<-which(mat==1, arr.ind=T)
>> tapply(mat.w[,2], mat.w[,1], min)
> 3 4 5
> 3 3 2
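The tapply() result above only has entries for rows that contain a 1. A minimal sketch of mapping it back to one value per row, with NA for the rest, follows. The matrix is typed in from the printed output above rather than regenerated with sample(), and the names `hits`, `first_by_row` and `first_one` are made up for illustration.

```r
# Matrix copied from the printed output above (row 2 all NA, row 1 has no 1s)
mat <- rbind(c(2, 2, 2, 2),
             c(NA, NA, NA, NA),
             c(2, 2, 1, 3),
             c(2, 2, 1, 1),
             c(2, 1, 1, 2))

# Row and column index of every cell equal to 1; NA cells are ignored
hits <- which(mat == 1, arr.ind = TRUE)

# Smallest column index per row, named by row number ("3", "4", "5")
first_by_row <- tapply(hits[, "col"], hits[, "row"], min)

# Start from all NA and fill in only the rows that had a 1
first_one <- rep(NA_integer_, nrow(mat))
first_one[as.integer(names(first_by_row))] <- first_by_row
first_one   # NA NA 3 3 2
```

The result has one element per case, so it can be added to the data as a new column.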
>
> You can then use or modify such output, since you have information
> about both columns and rows. I am sure there are other, maybe better,
> options, e.g.
>
>> lll<-as.list(as.data.frame(t(mat)))
>> unlist(lapply(lll, function(x) min(which(x==1))))
>  V1  V2  V3  V4  V5
> Inf Inf   3   3   2
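The Inf values above come from calling min() on an empty which() result, which also raises a warning. A small sketch of a variant that gives NA instead, assuming NA is the preferred marker for rows without a 1; `first_match` is a made-up helper name, and the matrix is again typed in from the printed output.

```r
# Matrix copied from the printed output above
mat <- rbind(c(2, 2, 2, 2),
             c(NA, NA, NA, NA),
             c(2, 2, 1, 3),
             c(2, 2, 1, 1),
             c(2, 1, 1, 2))

# Hypothetical helper: position of the first element equal to target.
# Indexing an empty which() result with [1] yields NA, not Inf.
first_match <- function(x, target = 1) which(x == target)[1]

res <- apply(mat, 1, first_match)
res   # NA NA 3 3 2
```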
>
> Regards
> Petr
>
>>
>> first = c() # your extra variable, which will eventually contain the
>> # first correct response for each case
>>
>> for(i in 1:nrow(testmat)){
>>
>>   c = 1
>>
>>   while( c<=ncol(testmat) ){ # stop after the last column
>>
>>     if( testmat[i,c] == "1"){
>>
>>       first[i] = c
>>       break # will exit the while loop once it finds the first correct
>>       # answer, and then jump to the next case
>>
>>     } else {
>>
>>       c=c+1 # proceed to the next column if not
>>
>>     }
>>
>>   }
>>
>> }
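For the empty cases raised earlier in the thread, one possible way to add a skip condition to a loop of this kind is sketched below. It assumes missing responses are stored either as "." or as NA; the small `testmat` here is made up, and the result vector is preallocated with NA so that skipped cases stay NA.

```r
# Hypothetical character data: "." and NA both mean "missing"
testmat <- rbind(c(".", "0", "1", "1"),
                 c(".", ".", ".", "."),   # empty case
                 c("0", "0", "0", "1"),
                 c(NA, NA, NA, NA))       # empty case

first <- rep(NA_integer_, nrow(testmat))  # one slot per case, NA by default

for (i in seq_len(nrow(testmat))) {
  row_i <- testmat[i, ]

  # Skip cases that contain no responses at all
  if (all(is.na(row_i) | row_i == ".")) next

  for (j in seq_along(row_i)) {
    # isTRUE() treats an NA comparison as "not a match"
    if (isTRUE(row_i[j] == "1")) {
      first[i] <- j
      break   # first correct response found; go to the next case
    }
  }
}

first   # 3 NA 4 NA
```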
>>
>>
>> Hope this helps you out a bit.
>>
>> Adrienne Wootten
>> NCSU
>>
>> On Fri, Oct 22, 2010 at 11:33 AM, David Herzberg <davidh at wpspublish.com> wrote:
>> Here's the problem I'm trying to solve in R: I have a data frame that
>> consists of about 1500 cases (rows) of data from kids who took a test
>> of listening comprehension. The columns are their scores (1 = correct,
>> 0 = incorrect, . = missing) on 140 test items. The items are numbered
>> sequentially and are ordered by increasing difficulty as you go from
>> left to right across the columns. I want R to go through the data and
>> find the first correct response for each case. Because of basal and
>> ceiling rules, many cases have missing data on many items before the
>> first correct response appears.
>>
>> For each case, I want R to evaluate the item responses sequentially
>> starting with item 1. If the score is 0 or missing, proceed to the next
>> item and evaluate it. If the score is 1, stop the operation for that
>> case, record the item number of that first correct response in a new
>> variable, proceed to the next case, and restart the operation.
>>
>> In SPSS, this operation would be carried out with LOOP, VECTOR, and DO
>> IF, as follows (assuming the data set is already loaded):
>>
>> * DECLARE A NEW VARIABLE TO HOLD THE ITEM NUMBER OF THE FIRST CORRECT
>> RESPONSE, SET IT EQUAL TO 0.
>> numeric LCfirst1.
>> comp LCfirst1 = 0
>>
>> * DECLARE A VECTOR TO HOLD THE 140 ITEM RESPONSE VARIABLES.
>> vector x=LC1a_score to LC140a_score.
>>
>> * SET UP A LOOP THAT WILL RUN FROM 1 TO 140, AS LONG AS LCfirst1 = 0.
>> "#i" IS AN INDEX VARIABLE THAT INCREASES BY 1 EACH TIME THE LOOP RUNS.
>> loop #i=1 to 140 if (LCfirst1 = 0).
>>
>> * SET UP A CONDITIONAL TRANSFORMATION THAT IS EVALUATED FOR EACH
>> ELEMENT OF THE VE
>