[R] elegant way to check if 2 values are in 3 columns?

jim holtman jholtman at gmail.com
Fri Aug 26 16:35:29 CEST 2011


Your solution is not bad, since it tries to make use of vectorized
operations, but there are some problems in it.

1) You should be using "&" instead of "&&" and "|" instead of "||".
"&" and "|" work element by element and give one result per row; "&&"
and "||" evaluate a single condition only (see ?"&&" for the details).

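2) FAQ 7.31:  You are checking against floating point values (... %in%
c(56.36,59.81)), which tests for exact equality.  Compare with a
tolerance instead.  Note that all.equal() compares whole objects and
returns a single result, so it cannot be used element by element inside
the mask; test abs(x - value) against a small tolerance:


# TRUE where x is within 'tol' of 'value'; NA counts as no match
near <- function(x, value, tol = 1e-8) !is.na(x) & abs(x - value) < tol

mask <- with(vaslinks4, (yearsep < 1988) &
                   (near(proc1, 56.36)
                    | near(proc1, 59.81)
                    | near(proc2, 56.36)
                    | near(proc2, 59.81)
                    | near(proc3, 56.36)
                    | near(proc3, 59.81)
                    )
             )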
vaslinks4$treat <- 0 # initialize
vaslinks4$treat[mask] <- 1
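
A small illustration of both points, using base R only. The vectors and
the tolerance of 1e-8 are made up for the example. Recent versions of R
raise an error when "&&" is given a vector longer than one, while older
versions silently used only the first element, so the sketch applies
"&&" to single values only.

```r
# Element-wise operators return one result per position.
a <- c(TRUE, FALSE, TRUE)
b <- c(TRUE, TRUE, FALSE)
elementwise <- a & b    # one logical per position
either      <- a | b    # likewise

# "&&" and "||" are meant for a single condition, e.g. inside if().
single <- a[1] && b[1]

# FAQ 7.31: arithmetic on doubles is not exact.
x <- 0.1 + 0.2
exact_match <- x == 0.3                    # exact comparison fails here
tol_match   <- abs(x - 0.3) < 1e-8         # comparison with a tolerance
whole_match <- isTRUE(all.equal(x, 0.3))   # one answer for the whole object
```

all.equal() is fine for asking whether two whole objects are nearly
equal, but because it gives one answer per call it does not produce a
row-by-row mask.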

You can probably also come up with a solution using
"any(sapply(.....))", but it would not necessarily be any better than
the above.  The most important issue is FAQ 7.31.
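
One way to avoid naming each column separately is sketched below. It is
a variation on the sapply() idea, not a tested drop-in: the sample data
are invented, the helper near_any() and the tolerance of 1e-8 are
assumptions, and rowSums() is used in place of any() so that the result
stays one value per row.

```r
# Hypothetical sample data; column names follow the original question.
vaslinks4 <- data.frame(
  yearsep = c(1985, 1990, 1987, 1986),
  proc1   = c(56.36, 56.36, 10.00, NA),
  proc2   = c(NA,    59.81, 59.81, 12.50),
  proc3   = c(11.10, NA,    NA,    13.00)
)

targets <- c(56.36, 59.81)
tol     <- 1e-8                        # assumed tolerance
cols    <- c("proc1", "proc2", "proc3")

# TRUE where x is within tol of any target; NA counts as no match.
near_any <- function(x) {
  hits <- abs(outer(x, targets, "-")) < tol   # rows x targets matrix
  rowSums(hits, na.rm = TRUE) > 0
}

# One logical column per proc column, then combine across columns by row.
hit_matrix <- sapply(vaslinks4[cols], near_any)
mask <- (vaslinks4$yearsep < 1988) & (rowSums(hit_matrix) > 0)

vaslinks4$treat <- as.integer(mask)
```

Adding a fourth procedure column or another target code then only means
extending cols or targets.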

On Fri, Aug 26, 2011 at 10:16 AM, Joanne Demmler
<J.Demmler at swansea.ac.uk> wrote:
> Dear all,
>
> I'm trying to rerun some data linkage exercises in R (they are designed to
> be done in SPSS, SAS or STATA)
> The exercise in question is to relabel the column "treat" to "1", if
> "yearsep" is smaller than 1988 and columns "proc1"-"proc3" contain the
> values 56.36 or 59.81.
>
> My pathetic solution to do this in R currently looks like this:
>
> vaslinks4$treat <- 0
>
> vaslinks4$treat[vaslinks4$yearsep < 1988 && (vaslinks4$proc1 %in%
> c(56.36,59.81)
>            || vaslinks4$proc2 %in% c(56.36,59.81)
>            || vaslinks4$proc3 %in% c(56.36,59.81))] <- 1
>
> But I'm sure there is a more elegant solution for this, in which I would not
> have to call all three columns separately.
>
> Anyone?
> Yours Joanne
>
>
>
> Solution in SPSS:
>
> COMPUTE treat=0.
> FORMATS treat (F1).
> DO REPEAT proc=proc1 to proc3.
> DO IF (yearsep LT 1988).
> IF (proc EQ 56.36 OR proc EQ 59.81) treat = 1.
> END IF.
> END REPEAT.
>
> Solution in SAS:
>
> do i = 1 to 3 until (treat > 0);
> if yearsep < 1988 then do;
> if procs{i} in (56.36, 59.81) then treat = 1;
> else treat = 0;
> end;
>
> Solution in STATA:
>
> generate treat=0
> foreach x in proc1 proc2 proc3 {
> recode treat(0=1) if ((`x'==56.36 | `x'==59.81) & yearsep<1988)
> | ((`x'>=63.70 &`x'<=63.79) & yearsep>=1988)
> }
> tab treat
>
>
> --
> Joanne Demmler Ph.D.
> Research Assistant
> College of Medicine
> Swansea University
> Singleton Park
> Swansea SA2 8PP
> UK
> tel:            +44 (0)1792 295674
> fax:            +44 (0)1792 513430
> email:          j.demmler at swansea.ac.uk
> DECIPHer:       www.decipher.uk.net
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


