# [R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices

Douglas Bates bates at stat.wisc.edu
Fri May 15 17:57:39 CEST 2009

```On Wed, May 13, 2009 at 5:21 PM,  <Avraham.Adler at guycarp.com> wrote:
>
> Hello.
>
> I am trying to optimize a set of parameters using /optim/ in which the
> actual function to be minimized contains matrix multiplication and is of
> the form:
>
> SUM ((A%*%X - B)^2)
>
> where A is a matrix and X and B are vectors, with X as parameter vector.

As Spencer Graves pointed out, what you are describing here is a
linear least squares problem, which has a direct (i.e. non-iterative)
solution.  A comparison of the speed of various ways of solving such a
system is given in one of the vignettes in the Matrix package.

> This has worked well so far. Recently, I was given a data set A of size
> 360440 x 1173, which could not be handled as a normal matrix. I brought it
> into 'R' as a sparse matrix (dgCMatrix - using sparseMatrix from the Matrix
> package), and the formulæ and gradient work, but /optim/ returns an error
> of the form "no method for coercing this S4 class to a vector".

If you just want the least squares solution X then

X <- solve(crossprod(A), crossprod(A, B))

will likely be the fastest method where A is the sparse matrix.

I do feel obligated to point out that the least squares solution for
such large systems is rarely a sensible solution to the underlying
problem.  If you have over 1000 columns in A and it is very sparse
then likely at least parts of A are based on indicator columns for a
categorical variable.  In such situations a model with random effects
for the category is often preferable to the fixed-effects model you
are fitting.

> After briefly looking into methods and classes, I realize I am in way over
> my head. Is there any way I could use /optim/ or another optimization
> algorithm, on sparse matrices?
>
> Thank you very much,
>