---
title: "Least absolute deviation (LAD) problem"
---
The [least absolute deviation (LAD)](https://en.wikipedia.org/wiki/Least_absolute_deviations) problem ---
also known as $L_1$ regression --- is a statistical optimization technique
which seeks a function $f(x_i)$ that closely approximates a set of data
points $(x_i, y_i)$, $i = 1, \ldots, n$. It minimizes the sum of absolute
errors between the fitted values $\hat{y}_i = f(x_i)$ and the observed values $y_i$.
$$
\begin{eqnarray*}
\text{minimize} ~~ \sum_{i=1}^n | y_i - \hat{y}_i |
\end{eqnarray*}
$$
As such, it can be classified as a non-linear optimization problem.
However, following, e.g., [Brooks and Dula (2013)](#BD2013), the
problem can be reformulated as the linear program
$$
\begin{aligned}
\underset{\beta_0,\, \boldsymbol{\beta},\, \mathbf{e}^+,\, \mathbf{e}^-}{\text{minimize}} \quad
  & \sum_{i=1}^n (e_i^+ + e_i^-) \\
\text{subject to} \quad
  & \beta_0 + \boldsymbol{\beta}^\top \mathbf{x}_i + e_i^+ - e_i^- = 0, \quad i = 1, \ldots, n \\
  & \beta_j = -1 \\
  & e_i^+,\, e_i^- \geq 0, \quad i = 1, \ldots, n
\end{aligned}
$$
given a set of points $\mathbf{x}_i \in \mathbb{R}^m$, $i = 1, \ldots, n$,
where the $j^{th}$ column holds the dependent variable $y$. In
comparison to the original formulation, we distinguish between positive
and negative deviations/errors ($e_i^+$ and $e_i^-$, respectively) to
arrive at the linear programming problem shown above. This is exact
because the minimization drives at least one variable of each pair to
zero, so that $e_i^+ + e_i^-$ equals the absolute deviation (see the
sketch below).
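A minimal R sketch of this split (the residuals are made up for
illustration):
```{r}
r <- c(2.5, -1.2, 0)                 # hypothetical residuals
e_plus  <- pmax( r, 0)               # positive parts, max(r, 0)
e_minus <- pmax(-r, 0)               # negative parts, max(-r, 0)
all.equal(r, e_plus - e_minus)       # r   = e+ - e-
all.equal(abs(r), e_plus + e_minus)  # |r| = e+ + e-
```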
## LP Implementation
The linear programming formulation of the $L_1$ regression problem shown
above can be constructed using **ROI** methods via the following R function.
```{r, message = FALSE}
create_L1_problem <- function(x, j) {
  m <- ncol(x) + 1L   # number of coefficients: intercept + one per column
  n <- 2 * nrow(x)    # number of error variables: e+ and e- per observation
  ## selector for the coefficient of the dependent variable (column j)
  beta <- double(m + n)
  beta[j + 1] <- 1
  OP(objective = L_objective(c(rep(0, m), rep(1, n))),
     constraints = rbind(
       ## beta_0 + beta' x_i + e_i^+ - e_i^- = 0,  i = 1, ..., n
       L_constraint(L = cbind(1, x, diag(nrow(x)), -diag(nrow(x))),
                    dir = eq(nrow(x)), rhs = rep(0, nrow(x))),
       ## fix the coefficient of the dependent variable: beta_j = -1
       L_constraint(L = beta, dir = "==", rhs = -1)),
     ## the coefficients may take either sign, hence lower bound -Inf
     bounds = V_bound(li = seq_len(m), lb = rep(-Inf, m),
                      nobj = length(beta)))
}
```
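Note that **ROI** by default assumes all variables to be nonnegative;
the `V_bound()` call therefore relaxes the lower bounds of the first $m$
variables (the intercept and the coefficients $\boldsymbol{\beta}$) to
$-\infty$, while the error variables $e_i^+$ and $e_i^-$ keep the
default lower bound of zero.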
With **ROI** one can solve, e.g., Brownlee's stack loss plant data example from the **datasets**
package using the above OP and the GLPK solver as follows.
```{r use_case_LAD_glpk, message = FALSE}
Sys.setenv(ROI_LOAD_PLUGINS = FALSE)
library(ROI)
library(ROI.plugin.glpk)
data(stackloss)
## column 4 (stack.loss) is the dependent variable
l1p <- create_L1_problem(as.matrix(stackloss), 4)
L1_res <- ROI_solve(l1p, solver = "glpk")
```
The first value of the solution corresponds to the intercept, the
following values to the model coefficients; the remaining entries hold
the error variables $e_i^+$ and $e_i^-$.
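To inspect the fitted coefficients, one can extract the leading entries
of the solution vector (a small sketch; the coefficient of the dependent
variable itself is fixed to $-1$ by construction and omitted here):
```{r}
## intercept and coefficients of the explanatory variables
head(solution(L1_res), ncol(stackloss))
```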
## Comparison
For comparison we solve the same problem with the **R**
package [**quantreg**](https://cran.r-project.org/package=quantreg);
quantile regression with $\tau = 0.5$ (median regression) minimizes the
same sum of absolute deviations.
```{r use_case_LAD_quantreg, message = FALSE}
library("quantreg")
## median regression: tau = 0.5
fm <- rq(stack.loss ~ stack.x, tau = 0.5)
```
```{r}
## compare the first n entries of the ROI solution with the
## coefficients estimated by quantreg
n <- ncol(stackloss)
cbind(ROI = head(solution(L1_res), n),
      quantreg = coef(fm))
```
# References
* Brooks JP, Dula JH (2013). "The L1-Norm Best-Fit Hyperplane Problem." *Applied Mathematics Letters*, 26(1), 51–55. [doi:10.1016/j.aml.2012.03.031](https://doi.org/10.1016/j.aml.2012.03.031) <a name="BD2013"></a>