Here's the problem I'm trying to solve in R: I have a data frame that consists of about 1500 cases (rows) of data from kids who took a test of listening comprehension. The columns are their scores (1 = correct, 0 = incorrect, . = missing) on 140 test items. The items are numbered sequentially and are ordered by increasing difficulty as you go from left to right across the columns. I want R to go through the data and find the first correct response for each case. Because of basal and ceiling rules, many cases have missing data on many items before the first correct response appears.
For each case, I want R to evaluate the item responses sequentially, starting with item 1. If the score is 0 or missing, proceed to the next item and evaluate it. If the score is 1, stop the operation for that case, record the item number of that first correct response in a new variable, proceed to the next case, and restart the operation.

In SPSS, this operation would be carried out with LOOP, VECTOR, and DO IF, as follows (assuming the data set is already loaded):

* DECLARE A NEW VARIABLE TO HOLD THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE, SET IT EQUAL TO 0.
numeric LCfirst1.
comp LCfirst1 = 0.

* DECLARE A VECTOR TO HOLD THE 140 ITEM RESPONSE VARIABLES.
vector x=LC1a_score to LC140a_score.

* SET UP A LOOP THAT WILL RUN FROM 1 TO 140, AS LONG AS LCfirst1 = 0.
* "#i" IS AN INDEX VARIABLE THAT INCREASES BY 1 EACH TIME THE LOOP RUNS.
loop #i=1 to 140 if (LCfirst1 = 0).

* SET UP A CONDITIONAL TRANSFORMATION THAT IS EVALUATED FOR EACH ELEMENT OF THE VECTOR.
* THUS, WHEN #i = 1, THE EXPRESSION EVALUATES THE FIRST ELEMENT OF THE VECTOR (THAT IS, THE FIRST OF THE 140 ITEM RESPONSES).
* AS THE LOOP RUNS AND #i INCREASES, SUBSEQUENT VECTOR ELEMENTS ARE EVALUATED.
* THE do if STATEMENT RETAINS CONTROL AND KEEPS LOOPING THROUGH THE VECTOR UNTIL A '1' IS ENCOUNTERED.
+ do if x(#i) = 1.

* WHEN A '1' IS ENCOUNTERED, CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF THAT VECTOR ELEMENT TO '99'.
+ comp x(#i) = 99.

* AND THEN CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF LCfirst1 TO THE CURRENT INDEX VALUE, THUS CAPTURING THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE FOR THAT CASE.
* CHANGING THE VALUE OF LCfirst1 ALSO CAUSES THE LOOP TO STOP EXECUTING FOR THAT CASE, AND THE PROGRAM MOVES TO THE NEXT CASE AND RESTARTS THE LOOP.
+ comp LCfirst1 = #i.
+ end if.
end loop.
exe.

After several hours of trying to translate this procedure to R, I'm stumped.
I played around with creating a list to hold the item response variables (analogous to 'vector' in SPSS), but when I tried to use the list in an R procedure, I kept getting a warning along the lines of 'the list contains > 1 element, only the first element will be used'. So perhaps a list is not the appropriate class to 'hold' these variables?

It seems that some nested arrangement of 'for', 'while', and/or 'lapply' will allow me to recreate the operation described above. How do I set up the indexing operation analogous to 'loop #i' in SPSS?

Any help is appreciated, and I'm happy to provide more information if needed.

David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com
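
One possible translation of the procedure described above, offered as a minimal sketch and not a definitive answer. It assumes the "." missing code has been read in as NA (for example via na.strings = "." when importing) and that the item columns are already in order of difficulty. The toy data frame, the object names (scores, items, recoded), and the helper first_correct_loop are hypothetical and only there for illustration; the column names follow the LC1a_score pattern from the SPSS code. The first version mirrors LOOP / DO IF with an explicit for loop. Note that if() needs a single TRUE or FALSE, so each element is tested on its own and NA is checked first; passing a whole list or vector to if() is the usual source of "only the first element will be used" messages. The second version does the same job without an explicit loop over items.

```r
# Hypothetical toy data: 4 cases, 5 items, columns ordered easiest to hardest.
# NA stands in for the "." missing code.
scores <- data.frame(
  LC1a_score = c(NA, 0, NA, 0),
  LC2a_score = c(0, 1, NA, 0),
  LC3a_score = c(1, 1, NA, 0),
  LC4a_score = c(1, 0, 1, 0),
  LC5a_score = c(0, NA, 1, NA)
)

# A numeric matrix of the item columns plays the role of the SPSS VECTOR.
items <- as.matrix(scores)

# Version 1: explicit loop over one case's responses, mirroring LOOP / DO IF.
first_correct_loop <- function(x) {
  for (i in seq_along(x)) {
    # Test one element at a time; the NA check keeps if() from seeing NA.
    if (!is.na(x[i]) && x[i] == 1) {
      return(i)  # stop at the first correct response
    }
  }
  0L  # no correct response found (same default as LCfirst1 = 0)
}

# apply(..., 1, f) runs f once per row, i.e. once per case.
first_loop <- apply(items, 1, first_correct_loop)

# Version 2: mark correct responses, then take the position of the first TRUE.
hits <- !is.na(items) & items == 1
first_vec <- apply(hits, 1, function(h) match(TRUE, h, nomatch = 0L))

# Store the result as a new variable, as LCfirst1 does in the SPSS code.
scores$LCfirst1 <- first_vec

# Optional: recode the first correct response to 99, as the SPSS code does.
# A two-column matrix of (row, column) pairs indexes individual cells.
found <- which(first_vec > 0)
recoded <- items
recoded[cbind(found, first_vec[found])] <- 99
```

With real data, items would be built from just the 140 score columns (for instance by selecting the columns whose names end in "a_score"), so that any ID or demographic columns are not scanned.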