[R] subsetting tables
Petr PIKAL
petr.pikal at precheza.cz
Wed Sep 7 09:00:19 CEST 2011
Hi
>
> Hi Eik,
>
> greetings to Hamburg! :-) Thanks for the fast and helpful answer
>
>
> Eik Vettorazzi-2 wrote:
> >
> > #compare
> > str(red[,2])
> > str(red[2,])
> >
>
> I understand that the first is a real vector of nums in R and the second is
> a ?? matrix/list/data.frame ?? of single ? entries? Can I
> transpose/transform it into one vector? Tried 'as.vector' but did not help.
See
?"["
and the section on its data.frame method, in particular the drop argument:
drop
logical. If TRUE the result is coerced to the lowest possible dimension.
The default is to drop if only one column is left, but not to drop if only
one row is left.
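A minimal sketch of that default, assuming the built-in iris data (the variable names are only illustrative):

```r
# One column: the default drop = TRUE returns a plain numeric vector
col_default <- iris[, 2]
str(col_default)

# One row: the default keeps a one-row data frame (a list of columns)
row_default <- iris[2, ]
str(row_default)

# drop = FALSE keeps the data frame structure for a single column too
col_kept <- iris[, 2, drop = FALSE]
str(col_kept)
```

This is why str(red[,2]) and str(red[2,]) look so different: the first has been dropped to a vector, the second has not.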
iris[1,]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
as.vector(unlist(iris[1,]))
[1] 5.1 3.5 1.4 0.2 1.0
But if your columns are not all numeric, unlist() coerces them to a common
type. Here the factor Species is replaced by its integer code (the last
value, 1.0).
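If the factor code is not wanted, one possible approach is to flatten only the numeric columns, or to convert the factor to character first. This is a sketch on the iris data; row1, num_cols, row_num and row_chr are illustrative names, not something from your table:

```r
row1 <- iris[1, ]

# A data frame is a list of columns, so as.vector() still gives a list
is.list(as.vector(row1))

# Keep only the numeric columns before flattening the row
num_cols <- sapply(iris, is.numeric)
row_num <- unlist(row1[, num_cols])
row_num

# To keep the factor label instead of its code, convert it to character
# first; the whole result then becomes a character vector
row1_chr <- row1
row1_chr$Species <- as.character(row1_chr$Species)
row_chr <- unlist(row1_chr)
row_chr
```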
>
>
> Eik Vettorazzi-2 wrote:
> >
> > sum(red>.5)
> > length(which(red>.5))
> >
>
> Sorry for being unprecise. Yes, in this case it was mainly the sum (thanks!
> helpful function!), but in general I'd like to understand what happened with
> subset here...
>
>
> Eik Vettorazzi-2 wrote:
> >
> >
> > and the arr.ind option of which may be useful as well.
> >
>
> Thanks a lot, very helpful. For other newbies, here is the line:
>
> tableReduced[,-1][which(tableReduced[,-1]>0.5, arr.ind=TRUE)]
>
> I needed to exclude the first column (-1) since these were titles (factors)
> of my rows. In the first trial I forgot to add this information to the first
> notion of the table as well, i.e., I tried:
>
> tableReduced[which(tableReduced[,-1]>0.5, arr.ind=TRUE)]
>
> This will (of course, I have to admit) result in subsetting fields that are
> in one column to the left of the intended column. So, if there are any
> subsetting indices in the which-function, they also need to be put in front
> of it to make the indices match.
>
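The column shift described above can be reproduced on a toy table. This is only a sketch: tab is a made-up stand-in for tableReduced (which I do not have), with row titles in the first column.

```r
# Hypothetical stand-in for tableReduced: first column holds row titles
tab <- data.frame(id = c("a", "b", "c"),
                  x  = c(0.1, 0.9, 0.3),
                  y  = c(0.7, 0.2, 0.8))

# Row/column positions are relative to tab[, -1], not to tab
idx <- which(tab[, -1] > 0.5, arr.ind = TRUE)
idx

# Same table on both sides: the intended values
right <- tab[, -1][idx]
right

# Full table indexed with positions from the reduced one:
# every hit is taken one column too far to the left
wrong <- tab[idx]
wrong
```

Indexing a data frame with a two-column matrix goes through as.matrix(), so the mixed table in the last call comes back as character values.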
> Just for my understanding, do you know what R did with here? Where do the NA
> values come from, what is the row-title NA.1, why does it print the first
> two rows unchanged and then goes crazy?
>
> > subset(red[,], red[,] > 0.5)
> >      Allstar hsa.let.7a hsa.let.7a.1 hsa.let.7a.2
> > 2       0.87       0.79        -0.57         1.07
> > 3       0.67      -1.14        -0.78        -0.95
> > NA        NA         NA           NA           NA
> > NA.1      NA         NA           NA           NA
> > NA.2      NA         NA           NA           NA
>
That is a rather unusual use of "[". I did not follow the whole thread, but
with subsetting you need to consider what you want to get out of it.
> str(iris>6)
logi [1:150, 1:5] FALSE FALSE FALSE FALSE FALSE FALSE ...
Using a comparison operator on a data frame gives a logical matrix, which is
basically a logical vector with a dim attribute.
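A small sketch of what "vector with dimensions" means in practice. It uses only the four numeric iris columns, so the factor column does not get in the way; num and m are illustrative names:

```r
num <- iris[, 1:4]   # numeric columns only
m <- num > 6         # logical matrix

is.matrix(m)
dim(m)
length(m)            # 150 * 4 elements, stored column by column

# Linear position 151 is row 1 of column 2
m[151] == m[1, 2]
```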
> which(iris>6)
 [1]  51  52  53  55  57  59  64  66  69  72  73  74  75  76  77  78  87  88  92
[20]  98 101 103 104 105 106 108 109 110 111 112 113 116 117 118 119 121 123 124
[39] 125 126 127 128 129 130 131 132 133 134 135 136 137 138 140 141 142 144 145
[58] 146 147 148 149 406 408 410 418 419 423 431 432 436
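iris has only 150 rows. which() returns positions in the underlying
column-by-column vector, so the values are valid row numbers only for the
first column. For the other columns they exceed 150 (e.g. 406 is row 106 of
the third column).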
As you can see from
> tail(iris[which(iris>6),], 10)
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
149           6.2         3.4          5.4         2.3 virginica
NA             NA          NA           NA          NA      <NA>
NA.1           NA          NA           NA          NA      <NA>
NA.2           NA          NA           NA          NA      <NA>
NA.3           NA          NA           NA          NA      <NA>
NA.4           NA          NA           NA          NA      <NA>
NA.5           NA          NA           NA          NA      <NA>
NA.6           NA          NA           NA          NA      <NA>
NA.7           NA          NA           NA          NA      <NA>
NA.8           NA          NA           NA          NA      <NA>
you get NA rows for those indices which are over 150 (the number of rows in
iris).
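To see where the out-of-range indices come from, compare the linear positions with what arr.ind = TRUE reports. Again only a sketch on the numeric iris columns; lin, pos and rows_from_lin are illustrative names:

```r
num <- iris[, 1:4]

lin <- which(num > 6)                   # linear positions, may exceed nrow
pos <- which(num > 6, arr.ind = TRUE)   # matrix with columns "row" and "col"
tail(pos, 3)

# A linear position maps back to its row by modular arithmetic
rows_from_lin <- (lin - 1) %% nrow(num) + 1
all(rows_from_lin == pos[, "row"])

# Used directly as row numbers, the large positions point past the last row
max(lin) > nrow(iris)
```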
If you want, say, all rows of a data frame that contain a value bigger than
some threshold, you need a small workaround
iris1 <- iris[,-5]
iris[ rowSums(iris1 > 6) > 0, ]
or
iris[ rowSums(iris > 6, na.rm=TRUE) > 0, ]
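If the position of the non-numeric columns is not known in advance, the same idea could be wrapped up roughly like this. rows_above is a hypothetical helper name, not an existing R function:

```r
# Hypothetical helper: rows where any numeric column exceeds a threshold
rows_above <- function(df, threshold) {
  # Pick the numeric columns so factors are never compared
  num_cols <- vapply(df, is.numeric, logical(1))
  hits <- df[, num_cols, drop = FALSE] > threshold
  keep <- rowSums(hits, na.rm = TRUE) > 0
  df[keep, , drop = FALSE]
}

big <- rows_above(iris, 6)
nrow(big)
head(big, 3)
```

Selecting the numeric columns first also avoids the warning that comparing a factor with > produces.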
Regards
Petr
> Thanks for this community with fast and reliable help. Amazing to see!
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/subsetting-tables-tp3793509p3794527.html
> Sent from the R help mailing list archive at Nabble.com.
>