[R] How to subset my dataframe? (a bit tricky)
markleeds at verizon.net
Tue Jun 16 22:24:47 CEST 2009
Hi Bill: I was trying to do the below myself but was having problems, so
I took your solution and made another one. Yours was working a little
weirdly because I don't think the person wants to keep rows where there
are two "dnv"s in a row, and he/she also wanted to keep the row if the
second column has a "dnv". So, below is essentially plagiarism with a
minor fix. Thanks.
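d[unique(unlist(lapply(3:ncol(d), function(.col) {
   # rows where this column is "dnv" and the previous column is neither
   # "0" nor "dnv", plus rows whose second column is "dnv"
   which((d[,.col]=="dnv" & d[,.col-1]!="0" & d[,.col-1]!="dnv") |
         (d[,2]=="dnv"))
}))),]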
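As a quick check, here is a minimal sketch that applies the rule above to
the six sample rows from the original post. The data frame is retyped by
hand with character columns (an assumption; the real data may be factors
or come from a .csv file), and sort() is added only to keep the rows in
their original order.

```r
# Sample rows retyped from the original post. Values are stored as
# character strings because "dnv" is mixed in with the numbers.
d <- data.frame(
  POND_ID      = c(101, 102, 103, 104, 105, 106),
  `2009-05-07` = c("0.15", "0",   "0.15", "dnv",  "0.15", "0"),
  `2009-05-15` = c("0",    "dnv", "dnv",  "0.25", "dnv",  "dnv"),
  `2009-05-21` = c("dnv",  "dnv", "1",    "1",    "1",    "1"),
  `2009-05-28` = c("dnv",  "dnv", "1",    "1",    "0",    "0.15"),
  `2009-06-04` = c("dnv",  "dnv", "1",    "0.75", "dnv",  "dnv"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# For each column from the third onward, collect the rows where that
# column is "dnv" and the previous column is neither "0" nor "dnv",
# plus any row whose second column is "dnv".
rows <- unique(unlist(lapply(3:ncol(d), function(.col) {
  which((d[, .col] == "dnv" & d[, .col - 1] != "0" & d[, .col - 1] != "dnv") |
          d[, 2] == "dnv")
})))

kept <- d[sort(rows), ]
kept$POND_ID
```

On these rows the result holds POND_ID 103, 104, 105 and 106, which is
the set the original post asks for.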
On Jun 16, 2009, William Dunlap <wdunlap at tibco.com> wrote:
> -----Original Message-----
> From: [1]r-help-bounces at r-project.org
> [mailto:[2]r-help-bounces at r-project.org] On Behalf Of Mark Na
> Sent: Tuesday, June 16, 2009 11:27 AM
> To: [3]r-help at r-project.org
> Subject: [R] How to subset my dataframe? (a bit tricky)
>
> Hi R-helpers,
>
> I would like to subset my dataframe, keeping only those rows which
> satisfy the following conditions:
>
> 1) the string "dnv" is found in at least one column;
> 2) the value in the column previous to the one "dnv" is found
> in is not "0"
Suppose your data.frame is called 'd'. Then try looping over
its columns:
keep <- rep(FALSE, nrow(d))
if (ncol(d)>2) for(i in 3:ncol(d)) keep <- keep | ( d[,i]=="dnv" &
d[,i-1]!="0")
so
d[keep,]
is the subset you want.
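
For illustration only, the column loop above can also be written without
an explicit loop by comparing two shifted copies of the data. This is a
sketch under assumptions: the comparison string is taken to be "dnv" as
in the question, the four sample rows are retyped by hand as character
columns, and the names m, cur, prev, keep_loop and keep_vec are made up
for the example.

```r
# First four sample rows from the question, stored as character strings.
d <- data.frame(
  POND_ID      = c(101, 102, 103, 104),
  `2009-05-07` = c("0.15", "0",   "0.15", "dnv"),
  `2009-05-15` = c("0",    "dnv", "dnv",  "0.25"),
  `2009-05-21` = c("dnv",  "dnv", "1",    "1"),
  `2009-05-28` = c("dnv",  "dnv", "1",    "1"),
  `2009-06-04` = c("dnv",  "dnv", "1",    "0.75"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Loop form: flag a row if any column from the third onward is "dnv"
# while the column before it is not "0".
keep_loop <- rep(FALSE, nrow(d))
if (ncol(d) > 2) for (i in 3:ncol(d)) {
  keep_loop <- keep_loop | (d[, i] == "dnv" & d[, i - 1] != "0")
}

# Matrix form of the same rule: 'cur' holds columns 3..ncol(d) and
# 'prev' holds columns 2..(ncol(d) - 1), so they line up pairwise.
m    <- as.matrix(d[, -1])
cur  <- m[, -1, drop = FALSE]
prev <- m[, -ncol(m), drop = FALSE]
keep_vec <- unname(rowSums(cur == "dnv" & prev != "0") > 0)

d[keep_vec, "POND_ID"]
```

Both forms flag the same rows. On these four rows the rule keeps 101, 102
and 103 and drops 104, because a "dnv" that follows another "dnv" also
counts as "not preceded by 0" and the second column is never tested.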
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
>
> Here's what my data look like:
>
>     POND_ID 2009-05-07 2009-05-15 2009-05-21 2009-05-28 2009-06-04
>
> 4       101       0.15          0        dnv        dnv        dnv
> 7       102          0        dnv        dnv        dnv        dnv
> 87      103       0.15        dnv          1          1          1
> 99      104        dnv       0.25          1          1       0.75
>
> So, for above example, the new dataframe would not contain POND_ID 101
> or 102 (because there is a 0 before the dnv) but it WOULD contain
> POND_ID 103 (because there is a 0.15 before the dnv) and 104 (because
> dnv occurs in the first column, so cannot be preceded by a 0).
>
> One extra twist: I would like to retain rows in the new dataframe
> which satisfy the above conditions even if they also have a "0" then
> "dnv" sequence preceding or following the "problem" , e.g., the
> following rows would be retained in the new dataframe
>
>     POND_ID 2009-05-07 2009-05-15 2009-05-21 2009-05-28 2009-06-04
>
> 100     105       0.15        dnv          1          0        dnv
> 101     106          0        dnv          1       0.15        dnv
>
> Thanks in advance for any help you might provide.
>
> (I hope I've provided enough of an example; I could also provide a
> .csv file if that would help.)
>
> Mark Na
>
> ______________________________________________
> [4]R-help at r-project.org mailing list
> [5]https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> [6]http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
References
1. mailto:r-help-bounces at r-project.org
2. mailto:r-help-bounces at r-project.org
3. mailto:r-help at r-project.org
4. mailto:R-help at r-project.org
5. https://stat.ethz.ch/mailman/listinfo/r-help
6. http://www.R-project.org/posting-guide.html
7. mailto:R-help at r-project.org
8. https://stat.ethz.ch/mailman/listinfo/r-help
9. http://www.R-project.org/posting-guide.html