[R] Removing Rows/Records from a Table
Peter Lauren
peterdlauren at yahoo.com
Mon Apr 17 22:23:31 CEST 2006
--- Marc Schwartz <MSchwartz at mn.rr.com> wrote:
> On Sat, 2006-04-15 at 08:19 -0700, Peter Lauren wrote:
> > I would like to selectively remove rows from a table.
> > I had hoped that I could create a table and
> > selectively add rows with something like
> > > NewTable<-table(nrow=100, ncol=4)
> > > NewTable[1,]<-OldTable[10,]
> >
> > but that doesn't work. The former call gives
> > > NewTable
> >      ncol
> > nrow  4
> >   100 1
> > while the latter call gives a table the length of
> > OldTable. Making a matrix, m, with the desired
> > table entries and doing
> > > NewTable<-table(m)
> > also doesn't work.
> >
> > Can anyone suggest the best way for me to do what I
> > want to do?
> >
> > Many thanks in advance,
> > Peter Lauren.
>
> First, I think that we need to clarify terminology, as you seem to be
> mixing tables and matrices (or at least, the intention of the table()
> function). See ?table and ?matrix.
>
> The table() function creates and returns a contingency table based upon
> the [cross-]tabulation of one or more objects, such as vectors, factors
> or lists. The contingency table interprets these objects as factors,
> generating the frequency counts of each combination of the factor
> levels. See ?factor for more information.
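
A small sketch of the "interprets these objects as factors" point, using
made-up data (x and f are illustrative names only). Because table() counts
by factor level, a level that never occurs still shows up with a count of 0
when a factor is passed in:

```r
# Made-up data: "c" is declared as a level but never observed.
x <- c("a", "b", "a")
f <- factor(x, levels = c("a", "b", "c"))

# Character input: the levels are just the values that are present.
counts_chr <- table(x)

# Factor input: every declared level is counted, including the empty one.
counts_fac <- table(f)

print(counts_chr)
print(counts_fac)
```

This matters when removing rows later: dropping a row or column from a
table removes the level from the result, it does not recount anything.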
>
> So...for example, we can generate a table of the counts of the possibly
> repeating unique elements in a single vector:
>
> set.seed(1)
> vec <- sample(letters[1:4], 10, replace = TRUE)
>
> > vec
> [1] "b" "b" "c" "d" "a" "d" "d" "c" "c" "a"
>
> > table(vec)
> vec
> a b c d
> 2 2 3 3
>
>
> Or...we can generate a 2d contingency table of the cross-tabulation of
> two vectors:
>
> set.seed(2)
> vec2 <- sample(LETTERS[1:4], 10, replace = TRUE)
>
> > vec2
> [1] "A" "C" "C" "A" "D" "D" "A" "D" "B" "C"
>
> > table(vec, vec2)
>    vec2
> vec A B C D
>   a 0 0 1 1
>   b 1 0 1 0
>   c 0 1 1 1
>   d 2 0 0 1
>
> So, here we have the result of the combinations of letters found in the
> two vectors, based upon in effect pairing the two vectors in order. It
> may be easier to visualize them together in this fashion (see ?rbind):
>
> > rbind(vec, vec2)
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> vec  "b"  "b"  "c"  "d"  "a"  "d"  "d"  "c"  "c"  "a"
> vec2 "A"  "C"  "C"  "A"  "D"  "D"  "A"  "D"  "B"  "C"
>
> Note for example, that there are 2 occurrences of 'd' paired with 'A' in
> columns 4 and 7, which is reflected in the lower left hand corner of the
> table above.
>
> The table() function does not just create an n-dimensional matrix but
> actually manipulates the data passed to it to create the counts in the
> resultant contingency table.
>
> Note that in the case of the second example, the result is an object of
> class 'table', which is in essence, a 2d integer matrix of the counts,
> with additional attributes (see ?str for more information):
>
> str(table(vec, vec2))
>  int [1:4, 1:4] 0 1 0 2 0 0 1 0 1 1 ...  # This is the result matrix
>  - attr(*, "dimnames")=List of 2         # These are the row/col names
>   ..$ vec : chr [1:4] "a" "b" "c" "d"
>   ..$ vec2: chr [1:4] "A" "B" "C" "D"
>  - attr(*, "class")= chr "table"         # This shows the object class
>
>
>
> Now, let's contrast that process with the creation of an integer
> matrix.
>
> vec3 <- 1:10
>
> > vec3
> [1] 1 2 3 4 5 6 7 8 9 10
>
> > matrix(vec3, ncol = 2, nrow = 5)
>      [,1] [,2]
> [1,]    1    6
> [2,]    2    7
> [3,]    3    8
> [4,]    4    9
> [5,]    5   10
>
> Note that we have taken the 1d vector and converted it to a 2d matrix
> with 2 columns and 5 rows. There is no manipulation of the data, simply
> a restructuring of the object. By default, the matrix is created in
> column order. We can change the order of creation by using 'byrow':
>
> > matrix(vec3, ncol = 2, nrow = 5, byrow = TRUE)
>      [,1] [,2]
> [1,]    1    2
> [2,]    3    4
> [3,]    5    6
> [4,]    7    8
> [5,]    9   10
>
> And...as a quick short cut, we can also do this, which yields the same
> result as the first use of matrix() above:
>
> dim(vec3) <- c(5, 2)
>
> > vec3
>      [,1] [,2]
> [1,]    1    6
> [2,]    2    7
> [3,]    3    8
> [4,]    4    9
> [5,]    5   10
>
> This simply shows that a matrix is a vector with a 'dim' attribute.
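
A short sketch of that last point, assuming nothing beyond base R (v and m
are illustrative names). Attaching a 'dim' attribute turns a plain vector
into a matrix, and removing it turns the matrix back into the vector:

```r
v <- 1:10
was_matrix <- is.matrix(v)       # no 'dim' attribute yet

dim(v) <- c(5, 2)                # attach dimensions: 5 rows, 2 columns
now_matrix <- is.matrix(v)
v_attrs <- attributes(v)         # the only attribute present is 'dim'

# matrix() builds the same object in one step.
m <- matrix(1:10, nrow = 5, ncol = 2)
same_object <- identical(v, m)

dim(v) <- NULL                   # remove the attribute again
back_to_vector <- identical(v, 1:10)
```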
>
>
> Now, back to the original question which is the removal (or could be
> adding) of rows (or columns) to a matrix, whether the result of a
> matrix() type operation or the result of using the table() function.
>
> Let's take the result of the table operation in the second example:
>
> tab <- table(vec, vec2)
>
> > tab
>    vec2
> vec A B C D
>   a 0 0 1 1
>   b 1 0 1 0
>   c 0 1 1 1
>   d 2 0 0 1
>
> Now, we want to remove the third row:
>
> > tab[-3, ]
>    vec2
> vec A B C D
>   a 0 0 1 1
>   b 1 0 1 0
>   d 2 0 0 1
>
> The same syntax can be used on the integer matrix we created above:
>
> mat <- matrix(vec3, ncol = 2, nrow = 5)
>
> > mat
>      [,1] [,2]
> [1,]    1    6
> [2,]    2    7
> [3,]    3    8
> [4,]    4    9
> [5,]    5   10
>
> > mat[-3, ]
>      [,1] [,2]
> [1,]    1    6
> [2,]    2    7
> [3,]    4    9
> [4,]    5   10
>
>
> So, in both cases, we can manipulate the resultant object by using
> standard object indexing. See ?Extract for more information.
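
A few more indexing variations, sketched on the same 5 x 2 integer matrix
(the result names are illustrative only). These are the forms likely to be
useful for selective row removal: several rows at once, rows chosen by a
condition, and keeping the matrix shape when only one row is left:

```r
mat <- matrix(1:10, nrow = 5, ncol = 2)

# Drop several rows at once with a vector of negative indices.
mat_no_2_4 <- mat[-c(2, 4), ]          # keeps rows 1, 3 and 5

# Keep only the rows that satisfy a condition (logical indexing).
mat_big <- mat[mat[, 1] > 2, ]         # rows whose first column exceeds 2

# Selecting a single row returns a plain vector by default;
# drop = FALSE keeps the result as a 1-row matrix.
one_row_vec <- mat[1, ]
one_row_mat <- mat[1, , drop = FALSE]
```

None of these modify mat itself; assign the result back (mat <- mat[-3, ])
to make a removal permanent.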
>
> The key is that the table() function does not just create a matrix (in
> the case of two or more objects being passed), but that it actually
> manipulates those objects internally to create a contingency table.
>
> Thus, the result of your first example:
>
> NewTable <- table(nrow = 100, ncol = 4)
>
> is the creation of a table, based upon passing two objects:
>
> nrow <- 100
> ncol <- 4
>
> resulting in:
>
> > NewTable
>      ncol
> nrow  4
>   100 1
>
> showing that there is 1 occurrence of the combination of 100 with 4.
>
> The result is _not_ a matrix with 100 rows and 4 columns.
>
>
> The matrix() function restructures the object passed to it, without
> manipulating the object's elements.
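
For comparison, a minimal sketch of what the matrix() version of the two
lines from the original question might look like. The OldTable built here
is a hypothetical stand-in (a 20 x 4 integer matrix of made-up values), and
the row numbers in keep are made up as well:

```r
# Hypothetical stand-in for OldTable: 20 rows, 4 columns, filled by column.
OldTable <- matrix(seq_len(80), nrow = 20, ncol = 4)

# Preallocate a 100 x 4 matrix of NA, then copy a row across.
NewTable <- matrix(NA_integer_, nrow = 100, ncol = 4)
NewTable[1, ] <- OldTable[10, ]

# Often simpler: pick the wanted rows directly, with no preallocation.
keep <- c(10, 12, 15)                  # made-up row numbers
Selected <- OldTable[keep, ]           # a 3 x 4 matrix
```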
>
> You can also add rows and/or columns to a matrix by using the rbind()
> and cbind() functions, respectively. See ?rbind, which will bring up the
> help for both functions.
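
A brief sketch of rbind() and cbind() on the same 5 x 2 matrix, with
made-up values for the new row and column (result names are illustrative).
The last line shows one way to insert a row in the middle, by binding the
slices on either side of it:

```r
mat <- matrix(1:10, nrow = 5, ncol = 2)

# Append a row: supply one value per column.
mat_r <- rbind(mat, c(11L, 12L))                    # 6 x 2

# Append a column: supply one value per row.
mat_c <- cbind(mat, 101:105)                        # 5 x 3

# Insert a row after row 2 by binding the pieces around it.
mat_i <- rbind(mat[1:2, ], c(0L, 0L), mat[3:5, ])   # 6 x 2
```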
>
> HTH,
>
> Marc Schwartz
>
Dear Dr. Schwartz
Thank you very much for your very extensive help and
also for your very fast reply. I certainly learned a
lot from reading your message. I was, in fact, using
the table as if it were simply a 2D matrix. What I
really wanted to use was a matrix and I have
restructured my code accordingly.
Thanks again,
Peter Lauren.