[R] generate list of variable names
Jon Erik Ween
jween at klaru-baycrest.on.ca
Wed Jun 9 19:02:51 CEST 2010
Thanks Erik
I can't figure out how to use the various *apply functions in this setting, nor how to post datasets to reproduce the problem. Anyhow, the table structure is something like this:
id (integer), handedness (R, L, A), gender (M, F), cat1 (patient, control), cat2 (stroke, MS, dement, control), accuracy (integer), reaction time (numeric)....
So I want to extract the factor levels from cat1, cat2, etc. and run, say, ANOVAs or ROCs on each of the response variables (accuracy, reaction_time, etc.), extracting F-values, AUCs, etc. and sticking them in a table of results. Here is an example script I wrote for ROCR:
#######
library(ROCR) # Load stats package to use if not standard
varslist<-scan("/Users/jween/Desktop/INCAS/INCASvars.txt","list") # Read variable list
results<-as.data.frame(array(,c(3,length(varslist)))) # Initialize results array, one type of stat at a time for now
for (i in 1:length(varslist)){ # Loop through the variables you want to process. Determined by varslist
j<-noquote(varslist[i])
vars<-c(varslist[i],"Issue_class") # Variables to be analyzed
temp<-na.omit(MSsmv[vars]) # Have to subset to get rid of NA values causing ROCR to choke
n<-nrow(temp) # Record how many cases the analysis is based on. Need to figure out how to calc cases/controls
#.table<-table(temp$SubjClass) # Maybe for later figure out cases/controls
results[1,i]<-j # Name particular results column
results[2,i]<-n # Number of subjects in analysis
test<-try(aucval(i,j),silent=TRUE) # Error handling in case the procedure fails, so the loop can continue. Suppress annoying error messages
if(inherits(test,"try-error")) next # Run procedure only if OK, otherwise skip
pred<-prediction(MSsmv[[j]], MSsmv$Issue_cat); # Procedure
perf<-performance(pred,"auc");
results[3,i]<-as.numeric(perf@y.values) # Enter result into appropriate row
}
write.table(results,"/Users/jween/Desktop/IncasRres_MSsmv.csv",sep=",",col.names=FALSE,row.names=FALSE) # Write results to table
rm(aucval,i,n,temp,vars,results,pred,perf,j,varslist) # Clean up test,
aucval<-function(i,j){ # Function to trap errors; it has to be defined before the loop above is run. Should be the same as real procedure above
pred<-prediction(MSsmv[[j]], MSsmv$Issue_cat); # Don't put any real results here, they don't seem to be passed back
perf<-performance(pred,"auc");
}
#######
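A possible way to avoid the separate trap function is to wrap the real call in tryCatch() and record NA when it fails. The sketch below is only an illustration under assumptions: `toy`, its columns `good` and `empty`, and the helper `auc_rank()` are made up for the example, and `auc_rank()` is a base-R rank-based (Mann-Whitney) AUC used as a stand-in for the ROCR calls so the example runs without extra packages.

```r
# Rank-based AUC (Mann-Whitney statistic); hypothetical base-R stand-in
# for prediction()/performance(). The second factor level is the positive class.
auc_rank <- function(score, label) {
  ok <- !is.na(score) & !is.na(label)
  score <- score[ok]
  label <- factor(label[ok])              # drop unused levels
  if (nlevels(label) != 2) stop("need exactly two classes")
  pos <- label == levels(label)[2]
  n1 <- sum(pos)
  n0 <- sum(!pos)
  (sum(rank(score)[pos]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Made-up data: one usable variable and one that is entirely NA
toy <- data.frame(
  good      = c(1, 2, 3, 4, 5, 6),
  empty     = rep(NA_real_, 6),
  Issue_cat = factor(rep(c("control", "patient"), each = 3))
)
vars <- c("good", "empty")

# One result row per variable; a failing variable gets NA instead of
# stopping the whole run
results <- do.call(rbind, lapply(vars, function(v) {
  auc <- tryCatch(auc_rank(toy[[v]], toy$Issue_cat),
                  error = function(e) NA_real_)
  data.frame(variable = v, n = sum(!is.na(toy[[v]])), auc = auc)
}))
print(results)
```

Here the result comes back directly from tryCatch(), so the procedure only has to be written once, and the table has one row per variable rather than one column.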
Cheers
Jon
Soli Deo Gloria
Jon Erik Ween, MD, MS
Scientist, Kunin-Lunenfeld Applied Research Unit
Director, Stroke Clinic, Brain Health Clinic, Baycrest Centre
Assistant Professor, Dept. of Medicine, Div. of Neurology
University of Toronto Faculty of Medicine
Kimel Family Building, 6th Floor, Room 644
Baycrest Centre
3560 Bathurst Street
Toronto, Ontario M6A 2E1
Canada
Phone: 416-785-2500 x3648
Fax: 416-785-2484
Email: jween at klaru-baycrest.on.ca
On 2010-06-09, at 12:20 PM, Erik Iverson wrote:
>
>
> Jon Erik Ween wrote:
>> Hi!
>> Would anyone know how to generate a list of variable names from a
>> data frame by the class of the variable?
>
> a start...
>
> df <- data.frame(f1 = factor(1:10),
> f2 = factor(1:10),
> n1 = 1:10,
> n2 = 1:10)
>
>
> sapply(df, class)
>
>> I have large tables with different numbers of columns and am trying
>> to script some rote analyses. There are several categorizing
>> variables (factors) and many response variables (integers and
>> numeric). I want to extract a list of classifier column names in one
>> list and response variable names in another list, then run for-loops
>> to calculate various statistics on the response variables in terms of
>> the classifier variables. I thought something like this might work
>> (but didn't):
>
> Reproducible example needed. All this can surely be done more elegantly with lapply/mapply instead of for-loops.
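A minimal sketch of how the two name lists and the loop-free analysis might look, using only base R. The data frame `dat` and its columns are invented for illustration (they loosely follow the table structure described in the reply), and `f_value()` is a hypothetical helper, not something from the original scripts.

```r
# Made-up data standing in for the real table
set.seed(1)
dat <- data.frame(
  id            = 1:40,
  gender        = factor(rep(c("M", "F"), 20)),
  cat1          = factor(rep(c("patient", "control"), each = 20)),
  accuracy      = rpois(40, 20),
  reaction_time = rnorm(40, 500, 50)
)

# Split the column names by class: factors are classifiers,
# numeric/integer columns (other than the id) are responses
classifiers <- names(dat)[sapply(dat, is.factor)]
responses   <- setdiff(names(dat)[sapply(dat, is.numeric)], "id")

# One-way ANOVA F-value for a single response/classifier pair
f_value <- function(resp, cls) {
  fit <- aov(reformulate(cls, response = resp), data = dat)
  summary(fit)[[1]][["F value"]][1]
}

# Every response crossed with every classifier, one row per pair
grid <- expand.grid(response = responses, classifier = classifiers,
                    stringsAsFactors = FALSE)
grid$F <- mapply(f_value, grid$response, grid$classifier, USE.NAMES = FALSE)
print(grid)
```

The same pattern should extend to other statistics by swapping out the body of `f_value()`.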