\documentclass{article} % \usepackage{myVignette} \usepackage[authoryear,round]{natbib} \bibliographystyle{plainnat} \newcommand{\noFootnote}[1]{{\small (\textit{#1})}} \newcommand{\myOp}[1]{{$\left\langle\ensuremath{#1}\right\rangle$}} %% vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv %%\VignetteIndexEntry{Design Issues in Matrix package Development} %%\VignetteDepends{Matrix,utils} \SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,width=5,height=3,strip.white=true,keep.source=TRUE} % ^^^^^^^^^^^^^^^^ \title{Design Issues in Matrix package Development} \author{Martin Maechler and Douglas Bates\\R Core Development Team \\\email{maechler@stat.math.ethz.ch}, \email{bates@r-project.org}} \date{Spring 2008; Aug~2022 ({\tiny typeset on \tiny\today})} % \begin{document} \maketitle \begin{abstract} This is a (\textbf{currently very incomplete}) write-up of the many smaller and larger design decisions we have made in organizing functionalities in the Matrix package. Classes: There's a rich hierarchy of matrix classes, which you can visualize as a set of trees whose inner (and ``upper'') nodes are \emph{virtual} classes and only the leaves are non-virtual ``actual'' classes. Functions and Methods: - setAs() - others \end{abstract} %% Note: These are explained in '?RweaveLatex' : <>= options(width=75) library(utils) # for R_DEFAULT_PACKAGES=NULL library(Matrix) @ \section{The Matrix class structures} \label{sec:classes} Take Martin's DSC 2007 talk to depict the Matrix class hierarchy; available from {\small \url{https://stat.ethz.ch/~maechler/R/DSC-2007_MatrixClassHierarchies.pdf}} . % ~/R/Meetings-Kurse-etc/2007-DSC/talk.tex Matrix-classes.Rnw --- --- --- %% \hrule[1pt]{\textwidth} From far, there are \textbf{three} separate class hierarchies, and every \pkg{Matrix} package matrix has an actual (or ``factual'') class inside these three hierarchies: % ~/R/Meetings-Kurse-etc/2007-DSC/Matrix-classes.Rnw More formally, we have three (\ 3 \ ) main ``class classifications'' for our Matrices, i.e.,\\ three ``orthogonal'' partitions of ``Matrix space'', and every Matrix object's class corresponds to an \emph{intersection} of these three partitions; i.e., in R's S4 class system: We have three independent inheritance schemes for every Matrix, and each such Matrix class is simply defined to \texttt{contain} three \emph{virtual} classes (one from each partitioning scheme), e.g, The three partioning schemes are \begin{enumerate} \item Content \texttt{type}: Classes \code{dMatrix}, \code{lMatrix}, \code{nMatrix}, (\code{iMatrix}, \code{zMatrix}) for entries of type \textbf{d}ouble, \textbf{l}ogical, patter\textbf{n} (and not yet \textbf{i}nteger and complex) Matrices. \code{nMatrix} only stores the \emph{location} of non-zero matrix entries (where as logical Matrices can also have \code{NA} entries!) \item structure: general, triangular, symmetric, diagonal Matrices \item sparsity: \code{denseMatrix}, \code{sparseMatrix} \end{enumerate} For example in the most used sparseMatrix class, \code{"dgCMatrix"}, the three initial letters \code{dgC} each codes for one of the three hierarchies: \begin{description} \item{d: } \textbf{d}ouble \item{g: } \textbf{g}eneral \item{C: } \textbf{C}sparseMatrix, where \textbf{C} is for \textbf{C}olumn-compressed. \end{description} Part of this is visible from printing \code{getClass("\emph{}")}: <>= getClass("dgCMatrix") @ Another example is the \code{"nsTMatrix"} class, where \code{nsT} stands for \begin{description} \item{n: } \textbf{n} is for ``patter\textbf{n}'', boolean content where only the \emph{locations} of the non-zeros need to be stored. \item{t: } \textbf{t}riangular matrix; either \textbf{U}pper, or \textbf{L}ower. \item{T: } \textbf{T}sparseMatrix, where \textbf{T} is for \textbf{T}riplet, the simplest but least efficient way to store a sparse matrix. \end{description} From R itself, via \code{getClass(.)}: <>= getClass("ntTMatrix") @ \subsection{Diagonal Matrices} \label{ssec:diagMat} The class of diagonal matrices is worth mentioning for several reasons. First, we have wanted such a class, because \emph{multiplication} methods are particularly simple with diagonal matrices. The typical constructor is \Rfun{Diagonal} whereas the accessor (as for traditional matrices), \Rfun{diag} simply returns the \emph{vector} of diagonal entries: <>= (D4 <- Diagonal(4, 10*(1:4))) str(D4) diag(D4) @ We can \emph{modify} the diagonal in the traditional way (via method definition for \Rfun{diag<-}): <>= diag(D4) <- diag(D4) + 1:4 D4 @ Note that \textbf{unit-diagonal} matrices (the identity matrices of linear algebra) with slot \code{diag = "U"} can have an empty \code{x} slot, very analogously to the unit-diagonal triangular matrices: <>= str(I3 <- Diagonal(3)) ## empty 'x' slot getClass("diagonalMatrix") ## extending "sparseMatrix" @ Originally, we had implemented diagonal matrices as \emph{dense} rather than sparse matrices. After several years it became clear that this had not been helpful really both from a user and programmer point of view. So now, indeed the \code{"diagonalMatrix"} class does also extend \code{"sparseMatrix"}, i.e., is a subclass of it. However, we do \emph{not} store explicitly where the non-zero entries are, and the class does \emph{not} extend any of the typical sparse matrix classes, \code{"CsparseMatrix"}, \code{"TsparseMatrix"}, or \code{"RsparseMatrix"}. Rather, the \code{diag()}onal (vector) is the basic part of such a matrix, and this is simply the \code{x} slot unless the \code{diag} slot is \code{"U"}, the unit-diagonal case, which is the identity matrix. Further note, e.g., from the \code{?$\,$Diagonal} help page, that we provide (low level) utility function \code{.sparseDiagonal()} with wrappers \code{.symDiagonal()} and \code{.trDiagonal()} which will provide diagonal matrices inheriting from \code{"CsparseMatrix"} which may be advantageous in \emph{some cases}, but less efficient in others, see the help page. \section{Matrix Transformations} \label{sec:trafos} \subsection{Coercions between Matrix classes} \label{ssec:coerce} You may need to transform Matrix objects into specific shape (triangular, symmetric), content type (double, logical, \dots) or storage structure (dense or sparse). Every useR should use \code{as(x, )} to this end, where \code{} is a \emph{virtual} Matrix super class, such as \code{"triangularMatrix"} \code{"dMatrix"}, or \code{"sparseMatrix"}. In other words, the user should \emph{not} coerce directly to a specific desired class such as \code{"dtCMatrix"}, even though that may occasionally work as well. Here is a set of rules to which the Matrix developers and the users should typically adhere: \begin{description} \item[Rule~1]: \code{as(M, "matrix")} should work for \textbf{all} Matrix objects \code{M}. \item[Rule~2]: \code{Matrix(x)} should also work for matrix like objects \code{x} and always return a ``classed'' Matrix. Applied to a \code{"matrix"} object \code{m}, \code{M. <- Matrix(m)} can be considered a kind of inverse of \code{m <- as(M, "matrix")}. For sparse matrices however, \code{M.} well be a \code{CsparseMatrix}, and it is often ``more structured'' than \code{M}, e.g., <>= (M <- spMatrix(4,4, i=1:4, j=c(3:1,4), x=c(4,1,4,8))) # dgTMatrix m <- as(M, "matrix") (M. <- Matrix(m)) # dsCMatrix (i.e. *symmetric*) @ \item[Rule~3]: All the following coercions to \emph{virtual} matrix classes should work:\\ \begin{enumerate} \item \code{as(m, "dMatrix")} \item \code{as(m, "lMatrix")} \item \code{as(m, "nMatrix")} \item \code{as(m, "denseMatrix")} \item \code{as(m, "sparseMatrix")} \item \code{as(m, "generalMatrix")} \end{enumerate} whereas the next ones should work under some assumptions: \begin{enumerate} \item \code{as(m1, "triangularMatrix")} \\ should work when \code{m1} is a triangular matrix, i.e. the upper or lower triangle of \code{m1} contains only zeros. \item \code{as(m2, "symmetricMatrix")} should work when \code{m2} is a symmetric matrix in the sense of \code{isSymmetric(m2)} returning \code{TRUE}. Note that this is typically equivalent to something like \code{isTRUE(all.equal(m2, t(m2)))}, i.e., the lower and upper triangle of the matrix have to be equal \emph{up to small numeric fuzz}. \end{enumerate} \end{description} \section{Session Info} <>= toLatex(sessionInfo()) @ %not yet %\bibliography{Matrix} \end{document}