Anatomy of a POLO Program

Programs that use polo are written in a single C++ file, as shown in Listing 1.

Note

All the code samples presented in this documentation are provided under the docs/examples directory in the source tree. Listing captions give the path of the sample file relative to that directory. For example, getting-started/anatomy.cpp refers to the file anatomy.cpp under docs/examples/getting-started.

Listing 1 getting-started/anatomy.cpp
/* include system libraries */
#include <iostream>
#include <vector>
using namespace std;

/* include polo */
#include <polo/polo.hpp>
using namespace polo;

int main(int argc, char *argv[]) {
  /* define the problem data */
  matrix::dmatrix<double, int> A(3, 3, {1, 0, 0, 0, 1, 0, 0, 0, 1});
  vector<double> b{-1, 1, 1};
  loss::data<double, int> data(A, b);

  /* define the smooth loss */
  loss::logistic<double, int> loss(data);

  /* select and configure the desired solver */
  algorithm::gd<double, int> alg;
  alg.step_parameters(1.0);

  /* provide an initial vector to the solver, and solve the problem */
  const vector<double> x0{1, 1, 1};
  alg.initialize(x0);
  alg.solve(loss);

  /* print the result */
  cout << "Optimum: " << alg.getf() << '\n';
  cout << "Optimizer: [";
  for (const auto val : alg.getx())
    cout << val << ',';
  cout << "].\n";

  return 0;
}

As can be observed, after including the system libraries and polo in our C++ file, we proceed in five steps:

  1. Define the problem data,

  2. Define the smooth loss on the data,

  3. Select and configure the desired solver,

  4. Initialize the solver and run it to minimize the total loss, and,

  5. Print the result.

In Listing 1, we first define the problem data (data) using the following (dense) matrix (dmatrix) and vector

\[\begin{split}A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \,, \qquad b = \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} \,,\end{split}\]

both of which have values of type double and indices of type int. Then, we pick a predefined loss of type logistic on data, which describes the smooth loss

\[\operatorname{f}(x) = \sum_{n=1}^{N} \log \left( 1 + \exp \left( -b_{n} \left \langle a_{n}, x \right \rangle \right) \right) \,,\]

where \(a_{n}\) is the \(n^{\text{th}}\) row of \(A\) and \(b_{n}\) is the \(n^{\text{th}}\) entry of \(b\). Currently, polo supports several common smooth loss functions (see Loss Functions). In addition, we can define custom loss functions in just a few lines of code; we discuss this in more detail in Defining Custom Loss Functions.

After defining the smooth loss, we select a vanilla gradient descent (gd) algorithm and set its constant step size. In fact, gd is simply an alias for one well-known member of the proximal-gradient family of algorithms that polo supports. polo not only allows users to configure, modify, and extend these algorithms in different ways but also supports the creation of completely new ones. We describe this functionality in detail later in Proximal Gradient Methods.

After configuring the solver, we initialize it with a decision vector, \(x_{0} = {[1, 1, 1]}^{\top}\), and solve the optimization problem defined on loss. Finally, we retrieve the final decision vector (getx) produced by the gradient descent algorithm, together with the associated loss value (getf), and print the result.