[R] combinations and table selection problem
Carlos Guerra
carlosguerra.amb at gmail.com
Mon Mar 8 16:38:04 CET 2010
Dear all,
I have a table like this:
a <- read.csv("test.csv", header = TRUE, sep = ";")
a
UTM pUrb pUrb_class pAgri pAgri_class pNatFor pNatFor_class
1 NF1885 20.160307 NA 79.921386 NA 0.000000 NA
2 NF1886 51.965649 NA 46.657713 NA 0.000000 NA
3 NF1893 26.009581 NA 40.269204 NA 0.000000 NA
4 NF1894 3.141484 NA 0.000000 NA 0.000000 NA
5 NF1895 64.296826 NA 0.440691 NA 0.000000 NA
6 NF1896 14.174068 NA 25.613839 NA 0.000000 NA
7 NF1897 40.985589 NA 37.680521 NA 0.000000 NA
8 NF1898 34.054325 NA 66.027334 NA 0.000000 NA
9 NF1899 20.657632 NA 79.424024 NA 0.000000 NA
10 NF1982 94.857605 NA 45.368606 NA 0.000000 NA
...
And I executed the following code:
#data classification#
a$pUrb_class<-cut(a$pUrb, c(-Inf,80,Inf), labels = c(0,1))
a$pAgri_class<-cut(a$pAgri, c(-Inf,80,Inf), labels = c(0,1))
a$pNatFor_class<-cut(a$pNatFor, c(-Inf,80,Inf), labels = c(0,1))
a
UTM pUrb pUrb_class pAgri pAgri_class pNatFor pNatFor_class
1 NF1885 20.160307 0 79.921386 0 0.000000 0
2 NF1886 51.965649 0 46.657713 0 0.000000 0
3 NF1893 26.009581 0 40.269204 0 0.000000 0
4 NF1894 3.141484 0 0.000000 0 0.000000 0
5 NF1895 64.296826 0 0.440691 0 0.000000 0
6 NF1896 14.174068 0 25.613839 0 0.000000 0
7 NF1897 40.985589 0 37.680521 0 0.000000 0
8 NF1898 34.054325 0 66.027334 0 0.000000 0
9 NF1899 20.657632 0 79.424024 0 0.000000 0
10 NF1982 94.857605 1 45.368606 0 0.000000 0
...
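A small sketch of how cut() treats the 80 threshold in the classification above. The values are made up for illustration and do not come from test.csv. With the default right = TRUE the intervals are (-Inf, 80] and (80, Inf], so a value of exactly 80 is placed in class 0, and the result is a factor rather than a number.

```r
# Toy values for illustration only; they are not taken from test.csv.
x <- c(79.9, 80, 80.1)

# Same breaks and labels as in the classification step.
cls <- cut(x, c(-Inf, 80, Inf), labels = c(0, 1))

# cut() returns a factor whose levels are the labels as character strings.
cls_chr <- as.character(cls)

# To get the numeric 0/1 values back, convert through character first;
# as.numeric(cls) alone would return the internal level codes (1 and 2).
cls_num <- as.numeric(as.character(cls))
```

If values of exactly 80 should count as class 1, passing right = FALSE to cut() would make the intervals [-Inf, 80) and [80, Inf) instead.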
#obtaining the number of combinations present in the data base#
library(survival)
b<-strata(a$pUrb_class,a$pAgri_class,a$pNatFor_class, sep=",")
table(b)
b
a$pUrb_class=0,a$pAgri_class=0,a$pNatFor_class=0
17698
a$pUrb_class=0,a$pAgri_class=0,a$pNatFor_class=1
112
a$pUrb_class=0,a$pAgri_class=1,a$pNatFor_class=0
4360
a$pUrb_class=1,a$pAgri_class=0,a$pNatFor_class=0
160
median(table(b))
[1] 2260
At this stage I have 3 questions:
1st:
how can I obtain the combinations which occur more often than the median (in this case the two combinations with counts 17698 and 4360)?
2nd:
how can I obtain the combinations which occur more often than the median and have at least one condition present (in this case only a$pUrb_class=0,a$pAgri_class=1,a$pNatFor_class=0, with count 4360)?
3rd:
how can I select/extract from the original table the rows which comply with the 2nd question, in this case:
UTM pUrb pUrb_class pAgri pAgri_class pNatFor pNatFor_class
10 NF1982 94.857605 1 45.368606 0 0.000000 0
...
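One possible way to approach the three questions, given only as a sketch. It assumes a small made-up data frame instead of test.csv, builds the combination key with base R's paste() instead of survival::strata() so that it runs without extra packages, and uses the illustrative names tab, over, any_present, over_present and sel, none of which appear in the code above.

```r
# Toy data for illustration only (hypothetical values, not from test.csv).
a <- data.frame(
  UTM           = paste0("X", 1:10),
  pUrb_class    = factor(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1), levels = c(0, 1)),
  pAgri_class   = factor(c(0, 0, 0, 0, 0, 1, 1, 1, 0, 0), levels = c(0, 1)),
  pNatFor_class = factor(c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0), levels = c(0, 1))
)

# One key per row describing its combination, e.g. "0,1,0".
b <- paste(a$pUrb_class, a$pAgri_class, a$pNatFor_class, sep = ",")

# Count how often each combination occurs and take the median count.
tab <- table(b)
med <- median(as.vector(tab))

# 1st question: combinations whose count is above the median.
over <- names(tab)[as.vector(tab) > med]

# Rows in which at least one of the three classes equals 1.
any_present <- a$pUrb_class == "1" |
  a$pAgri_class == "1" |
  a$pNatFor_class == "1"

# 2nd question: combinations above the median with at least one class set.
over_present <- unique(b[b %in% over & any_present])

# 3rd question: the rows of the original table for those combinations.
sel <- a[b %in% over_present, ]
```

Because strata() also returns one factor value per row of a, the same indexing (names(tab)[...] and a[b %in% ..., ]) should carry over to the b built with strata(); only the form of the combination labels differs.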
Thanks in advance,
Carlos Guerra