---
title: "
Conic Representable"
date: "`r Sys.Date()`"
output:
bookdown::html_document2: default
bookdown::pdf_document2: default
bibliography: holiglm.bib
link-citations: yes
vignette: >
%\VignetteIndexEntry{Conic Representable}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.width = 6,
fig.height = 6,
eval = TRUE,
error = TRUE
)
```
In this section we introduce a simple decision tree, designed to provide guidance
in answering the question whether a given problem is solvable via conic optimization.
Note, due to the complexity of the topic and the manifold of special cases, the decision tree
can only provide guidance and not provide a definitive answer.
For a general introduction into conic optimization see e.g.,
@Boyd+Vandenberghe:2004 or @LectureNotes:Tal+Nemirovski:2001.
A general way to determine if a given optimization problem is solvable
via conic optimization is illustrated in Figure \@ref(fig:conic). More specifically,
an optimization problem is solvable via conic optimization if it is convex and
if the objective and constraints can be represented using convex cones.
In all other cases, the problem is in principle not solvable, however exceptions exist.
Additional explanations to the decision tree are given in the subsection below.
```{r conic, echo = FALSE, fig.cap = cap_1, figure = TRUE}
cap_1 <- "Simple decision tree to evaluate if a given optimization problem can be solved with a conic optimization solver."
library("DiagrammeR")
mermaid('
%%{init: { "theme": "neutral" } }%%
graph TB
A["Can my optimization problem be solved via conic optimization?"] --> B{"Convex?"};
B -- yes --> C{"Representable?"};
B -- no --> D["Not solvable (Exceptions A)."];
C -- yes --> E["Solvable via conic optimization."];
C -- no --> F["Not solvable (Exceptions B)."];
click D "#ExceptionsA" "For more information see Section `Exceptions A`"
click F "#ExceptionsB" "For more information see Section `Exceptions B`"
')
```
Another rather straight-forward way to verify that a problem is solvable via conic optimization
is to enter the problem into one of the **CVX** [@cvx] domain specific language implementations.
**CVX** uses disciplined convex programming (DCP) [@Grant:DCP:2004] to verify that
a given problem is convex and conic representable. Note DCP works with the
rule: "if DCP compliant than solvable via convex solvers relation".
However, we encountered several problems which are known to be convex and representable
via conic optimization but not DCP compliant, e.g., log-binomial regression.
Regarding the **CVX** implementations,
[**cvxpy**](https://www.cvxpy.org/) [@Diamond:cvxpy:2016] for Python is the main focus of the
Stanford University Convex Optimization Group ([cvxgrp](https://github.com/cvxgrp)).
For **R** there exists the [**CVXR**](https://cvxr.rbind.io/) package [@pkg:CVXR].
# Representable
A problem is called conic representable if there exists a cone that can
express the problem. State of the art solvers distinguish between up to
eight different types of cones including linear, second-order, positive semidefinite,
power and exponential cones.
Therefore, if your problem is convex and comprised of linear,
quadratic, power, exponential or log terms there is a good chance that it
can be expressed via conic optimization. Nevertheless, note that this need not
be the case for any problem containing linear,
quadratic, power, exponential or log terms (an example is presented in \@ref(subsec:cloglog)).
The definition of the cones can be found in e.g., @Diamond:2015 or @roi:theussl:2020.
To provide more guidance on this topic we will discuss the representability
of several GLMs in Section \@ref(sec:Examples).
# Exceptions A {#ExceptionsA}
In general conic optimization solvers are designed to solve convex optimization problems.
However, for some non-convex problems there exist convex relaxations which make it
possible again to solve these problems via conic optimization.
## Mixed integer optimization
One such exception are mixed integer optimization problems, which are non-convex.
However, if the problem without
the mixed integer constraint can be solved via conic optimization also
the problem with mixed integer constraint can be solved via conic optimization.
This is possible since the sub-problems solved within the
branch and bound algorithm are again convex.
# Exceptions B {#ExceptionsB}
If the problem can not be expressed directly by the cones supported by the solvers,
there sometimes exist equivalent problems or approximations
which can be solved via conic optimization. An example for such an exception is
the binomial GLM with probit link (see \@ref(subsec:probit)).
# Examples {#sec:Examples}
## Binomial with logit-link (convex and representable)
It is well known that logistic regression is a convex optimization
problem, where the MLE is finite and unique when the data is not
separated. More information about separation detection can be found in
e.g., @detectseparation:logit.
The MLE can be obtained maximizing the log-likelihood function,
$$
\underset{\beta}{\text{maximize}}
\sum_{i \in I} X_{i*} \beta - \sum_{k \in I \cup J} \log(1 + \exp(X_{k*} \beta))
~~ \text{where} ~~ I = \{i| y_i = 1\}, J = \{j| y_j = 0\}.
$$
Knowing that the problem is convex and looking at the log-likelihood function
one sees that the objective is comprised sums of linear terms,
and logarithms and exponential functions. From this information
one can infer that there is a good chance that the problem is
representable by making use of the exponential cone. More information
on modeling problems with the exponential cone can be found in
@roi:theussl:2020, [**ROI**-homepage](https://roi.r-forge.r-project.org/)
and the [MOSEK Modeling Cookbook](https://docs.mosek.com/modeling-cookbook/index.html).
## Binomial with log-link (convex and representable)
The log-binomial regression model (LBRM) is known to be convex,
$$
\begin{array}{rl}
\underset{\beta}{\textrm{maximize}} &
\displaystyle\sum_{i = 1}^n y_i ~ X_{i*} \beta + (1 - y_i) ~ \log(1 - \exp(X_{i*} \beta)) \\
\textrm{subject to} &
X \beta \leq 0.
\end{array}
$$
similar to logistic regression the finiteness of the MLE depends
on the separation of the data. However, due to the linear constraint
$X \beta \leq 0$ the MLE of the log-binomial regression model
exists also for cases where the MLE of the logistic regression model
does not exist. @log-bin:schwendinger:2021 point out the existence
of the MLE of the log-binomial regression model can be verified by
solving a linear optimization problem.
Given the MLE exists, the LBRM can be solved by making use of the linear and exponential cone.
The linear cone is needed for the $X \beta \leq 0$ constraint.
## Binomial with probit-link (convex and non-representable) {#subsec:probit}
The MLE of the probit model can be obtained by,
$$
\underset{\beta}{\text{maximize}} ~~
\sum_{i = 1}^n y_i ~ \log(\Phi(X_{i*} \beta)) + \sum_{i = 1}^n (1 - y_i) \log(1 - \Phi(X_{i*} \beta)).
$$
Since there exist no cone to express the cumulative normal $\Phi$,
the problem can not be solved with conic solvers.
@Page1977Approx gives a simple approximation
$$
\Phi(x) = \frac{\exp \left( 2 \sqrt{\frac{\pi}{2}} ~ x \right)}{1 + \exp \left( 2 \sqrt{\frac{\pi}{2}} ~ x \right)}.
$$
Using this approximation the probit model can be estimated via logistic regression
where either the data or the estimate are re-scaled by $2 \sqrt{\frac{\pi}{2}}$.
There exist many approximations for the cumulative normal and most of them are
more accurate than this approximation. However, this approximation has the
advantage that it can be expressed via conic optimization.
## Binomial with cloglog-link (convex and non-representable) {#subsec:cloglog}
The complementary log-log (cloglog) model is known to be convex and
the MLE is finite and unique if the data is not separated
(see e.g., @mleExistence:Silvapulle:1981 or @mleExistence:Kaufmann:1988).
$$
\underset{\beta}{\text{maximize}} ~~
\sum^n_{i = 1} y_i ~ \log(1 - \exp(-\exp(X_{i*} \beta))) - (1 - y_i) ~ \exp(X_{i*} \beta)
$$
Since the problem is convex and the log-likelihood is only comprised of
$\log$ and $\exp$ terms, the problem could be representable via the exponential
cone. However, we did not find a problem formulation that worked as expected.
We speculate
that since we have to put an exponential cone into another exponential cone
the problem is badly scaled and therefore the solution gets unreliable.
# References