[R] create multiple categorical variables in a data frame using a loop
David Winsemius
dw|n@em|u@ @end|ng |rom comc@@t@net
Thu Apr 19 22:22:28 CEST 2018
> On Apr 19, 2018, at 11:20 AM, Ding, Yuan Chun <ycding using coh.org> wrote:
>
> Hi All,
>
> I want to create a categorical variable, cat.pfoa, in the file of pfas.pheno (a data frame) based on log2pfoa values. I can do it using the following code.
>
> pfas.pheno <-within(pfas.pheno, {cat.pfoa<-NA
> cat.pfoa[pfas.pheno$log2pfoa <=quantile(pfas.pheno$log2pfoa,0.25, na.rm =T)]<-0
> cat.pfoa[pfas.pheno$log2pfoa >=quantile(pfas.pheno$log2pfoa,0.75, na.rm =T)]<-2
> cat.pfoa[pfas.pheno$log2pfoa >=quantile(pfas.pheno$log2pfoa,0.25, na.rm =T)
> &pfas.pheno$log2pfoa <=quantile(pfas.pheno$log2pfoa,0.75, na.rm =T)]<-1
> })
This would be somewhat more compact and easier to maintain if you used findInterval (untested in the absence of a data object, which is your responsibility):
pfas.pheno <- within(pfas.pheno, {
  cat.pfoa <- findInterval(log2pfoa, c(-Inf, quantile(log2pfoa, c(.25, .75), na.rm = TRUE), Inf)) - 1
})
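`findInterval` numbers its intervals from 1, so to get a sequence starting at 0 just subtract 1. Its intervals are closed on the left by default, so a value exactly equal to the 75th percentile is coded 2, whereas your code would end up coding it 1.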
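To see how the coding behaves, here is a minimal sketch on made-up data. The vector `x` is hypothetical and only stands in for a column such as log2pfoa:

```r
# Toy vector standing in for log2pfoa (hypothetical values, one missing)
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, NA)

# Quartile cut points; na.rm = TRUE ignores the missing value
q <- quantile(x, c(.25, .75), na.rm = TRUE)

# Break points define [-Inf, q25), [q25, q75), [q75, Inf), numbered 1, 2, 3;
# subtracting 1 gives codes 0, 1, 2. A missing x stays NA.
cat.x <- findInterval(x, c(-Inf, q, Inf)) - 1

print(q)
print(cat.x)
```

With these values the cut points are 3 and 7, so 1 and 2 are coded 0, 3 through 6 are coded 1, and 7 through 9 are coded 2.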
> However, I have additional 7 similar variables, so I wrote the following code, but it does not work.
>
> for (i in c("log2pfoa","log2pfos", "log2pfna", "log2pfdea", "log2pfuda", "log2pfhxs", "log2et_pfosa_acoh", "log2me_pfosa_acoh")) {
> cat.var <- paste0("cat.",i)
> pfas.pheno <- within(pfas.pheno, {eval(parse(text= cat.var))<-NA
Nope. You cannot use R like a macro processor, at least not easily. R names are not the same as character values. They "live in different realities". The `get` and `assign` functions can be used to "promote" character values to real R names, so that you can read from and assign to objects whose names would otherwise be merely character values.
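A small sketch of what that "promotion" looks like. The object name `my.result` is made up for illustration:

```r
# A character value that holds the *name* of an object (hypothetical name)
var.name <- "my.result"

# assign() creates an object whose name is taken from the character value
assign(var.name, c(10, 20, 30))

# get() looks the object up again from the same character value
total <- sum(get(var.name))

print(total)
print(exists("my.result"))
```

Here `var.name` is only ever a character string; it is `assign` and `get` that turn it into a reference to an actual object.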
Perhaps this (also mostly untested). Inside the loop it is simpler to index the data frame with `[[` and the character value than to use `assign`, because `as.environment(pfas.pheno)` makes a new environment from a copy of the columns, so assigning into it would leave pfas.pheno itself unchanged:
for (i in c("log2pfoa", "log2pfos", "log2pfna", "log2pfdea", "log2pfuda", "log2pfhxs", "log2et_pfosa_acoh",
            "log2me_pfosa_acoh")) {
  cat.var <- paste0("cat.", i)
  x <- pfas.pheno[[i]]
  pfas.pheno[[cat.var]] <- findInterval(x, c(-Inf, quantile(x, c(.25, .75), na.rm = TRUE), Inf)) - 1
}
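A variant of the loop above, sketched on a toy data frame so that it can be run as is. The data frame `pfas.toy`, its values, and the helper `quartile_code` are all made up for illustration; only the two column names come from your post:

```r
# Toy data frame (hypothetical values; one missing value in the second column)
set.seed(1)
pfas.toy <- data.frame(log2pfoa = rnorm(20),
                       log2pfos = c(rnorm(19), NA))

# Hypothetical helper: 0 below the 25th percentile, 2 at or above the 75th,
# 1 in between; missing values stay NA
quartile_code <- function(x) {
  findInterval(x, c(-Inf, quantile(x, c(.25, .75), na.rm = TRUE), Inf)) - 1
}

# Apply the helper to each named column and store the results under "cat." names
vars <- c("log2pfoa", "log2pfos")
pfas.toy[paste0("cat.", vars)] <- lapply(pfas.toy[vars], quartile_code)

str(pfas.toy)
print(table(pfas.toy$cat.log2pfoa, useNA = "ifany"))
```

Putting the coding rule in one function means the cut points only have to be changed in one place if you later want, say, tertiles instead of quartiles.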
Best;
David.
> eval(parse(text=cat.var))[pfas.pheno[,i] <= quantile(pfas.pheno[,i],0.25, na.rm =T)] <- 0
> eval(parse(text=cat.var))[pfas.pheno[,i] >= quantile(pfas.pheno[,i],0.75, na.rm =T)] <- 2
> eval(parse(text=cat.var))[pfas.pheno[,i] >= quantile(pfas.pheno[,i],0.25, na.rm =T)
> &pfas.pheno[,i] <= quantile(pfas.pheno[,i],0.75, na.rm =T)] < -1
> })
> }
>
> Can you help me fix the problem?
>
> Thank you,
>
> Yuan Chun Ding
> City of Hope National Medical Center