[R] Subsetting problem data

Rui Barradas ruipbarradas at sapo.pt
Thu Jul 19 18:56:57 CEST 2012


Hello,

Try the following.


d <- read.csv(text="
Patient, Cycle, Variable1, Variable2
A, 1, 4, 5
A, 2, 3, 3
A, 3, 4, NA
B, 1, 6, 6
B, 2, NA, 6
C, 1, 6, 5
C, 3, 2, 2
", header=TRUE)
d

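# patients whose cycles are consecutive (no gaps); the others return NULL
compl <- lapply(split(d, d$Patient), function(x) if(all(diff(x$Cycle) == 1)) x)
# patients with at least one gap in the cycle sequence; the others return NULL
holes <- lapply(split(d, d$Patient), function(x) if(any(diff(x$Cycle) != 1)) x)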

do.call(rbind, compl)
do.call(rbind, holes)
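
The same gap test can also be written without building lists that contain NULL entries. What follows is only a sketch of an alternative, using the same example data; the name has_gap is just made up for illustration.

```r
# same example data as above
d <- read.csv(text="
Patient, Cycle, Variable1, Variable2
A, 1, 4, 5
A, 2, 3, 3
A, 3, 4, NA
B, 1, 6, 6
B, 2, NA, 6
C, 1, 6, 5
C, 3, 2, 2
", header=TRUE)

# ave() repeats the per-patient result on every row of that patient;
# it returns 0/1 here, so convert back to logical
has_gap <- as.logical(ave(d$Cycle, d$Patient,
                          FUN = function(x) any(diff(x) != 1)))

d[!has_gap, ]   # patients with consecutive cycles
d[has_gap, ]    # patients with a gap in the cycles
```

This assumes, like the code above, that the rows of each patient are already sorted by Cycle.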

In the meantime, you have posted another question, similar but
apparently more complete. I'll look into it, but tell me something: is
this answer completely off? If you just want to know whether there are
holes, as TRUE/FALSE answers, this other version might do it.

aggregate(Cycle ~ Patient, data=d, function(x) any(diff(x) != 1))
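
If "holes" means NA values in the variables rather than skipped cycles, which is what your expected result (A and B, but not C) seems to suggest, then something like the sketch below might be closer. This is only my assumption about what you want; the names row_has_na, pat_na and d_na are made up for illustration.

```r
# same example data; strip.white=TRUE is added here only to be safe
# with the blanks after the commas
d <- read.csv(text="
Patient, Cycle, Variable1, Variable2
A, 1, 4, 5
A, 2, 3, 3
A, 3, 4, NA
B, 1, 6, 6
B, 2, NA, 6
C, 1, 6, 5
C, 3, 2, 2
", header=TRUE, strip.white=TRUE)

# TRUE for the rows where at least one of the variables is NA
row_has_na <- !complete.cases(d[, c("Variable1", "Variable2")])

# patients with at least one such row
pat_na <- unique(d$Patient[row_has_na])

# all rows of those patients
d_na <- d[d$Patient %in% pat_na, ]
d_na
```

With the example data, this keeps all rows of patients A and B and drops patient C.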

Hope this helps,

Rui Barradas
On 18-07-2012 20:30, Lib Gray wrote:
> Hello, I need to subset my data to only look at the parts that have "holes"
> in them. I already have a formula to get rid of inconsistencies, but now I
> need to look only at the problem data to reconfigure it. In my data set
> there are multiple "cycles" per "patient," and I want to highlight
> the patients who have a variable that was not measured every cycle.
>
> Here's a similar example of the data:
>
> Patient, Cycle, Variable1, Variable 2
> A, 1, 4, 5
> A, 2, 3, 3
> A, 3, 4, NA
> B, 1, 6, 6
> B, 2, NA, 6
> C, 1, 6, 5
> C, 3, 2, 2
>
> So in this case, I would want Patient A and Patient B, but not Patient C.
>
> Thanks!
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


