Anatomy of a POLO Program
Programs that use polo
are written in a single C++ file, as shown in
Listing 1.
Note
All the code samples presented in this documentation are provided under the
docs/examples directory in the source tree. Listing captions give the
relative path to the sample file. For example, getting-started/anatomy.cpp
refers to the file anatomy.cpp under docs/examples/getting-started.
/* include system libraries */
#include <iostream>
#include <vector>
using namespace std;
/* include polo */
#include <polo/polo.hpp>
using namespace polo;
int main(int argc, char *argv[]) {
  /* define the problem data */
  matrix::dmatrix<double, int> A(3, 3, {1, 0, 0, 0, 1, 0, 0, 0, 1});
  vector<double> b{-1, 1, 1};
  loss::data<double, int> data(A, b);

  /* define the smooth loss */
  loss::logistic<double, int> loss(data);

  /* select and configure the desired solver */
  algorithm::gd<double, int> alg;
  alg.step_parameters(1.0);

  /* provide an initial vector to the solver, and solve the problem */
  const vector<double> x0{1, 1, 1};
  alg.initialize(x0);
  alg.solve(loss);

  /* print the result */
  cout << "Optimum: " << alg.getf() << '\n';
  cout << "Optimizer: [";
  for (const auto val : alg.getx())
    cout << val << ',';
  cout << "].\n";

  return 0;
}
As can be observed, after including the system libraries and polo in our
C++ file, we follow five steps:
1. Define the problem data,
2. Define the smooth loss on the data,
3. Select and configure the desired solver,
4. Initialize the solver and run it to minimize the total loss, and,
5. Print the result.
In Listing 1, we first define the problem data (data) using the following
dense matrix (dmatrix) and vector, both of which have values of type double
and indices of type int:

\[A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
\qquad b = \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}.\]

Then, we pick a predefined loss of type logistic on data, which describes the
smooth loss

\[f(x) = \sum_{n=1}^{N} \log\left(1 + \exp\left(-b_{n} \langle a_{n}, x
\rangle\right)\right),\]

where \(a_{n}\) and \(b_{n}\) are the \(n^{\text{th}}\) row of
\(A\) and \(b\), respectively. Currently, polo
supports several
common smooth loss functions (see Loss Functions). In addition, we can
define custom loss functions in just a few lines of code. We will discuss this
in more detail in Defining Custom Loss Functions. After defining the smooth loss, we select
a vanilla gradient descent (gd) algorithm and set its constant step size.
In fact, gd
is just an alias for a specific algorithm among a number of
well-known optimization algorithms that polo
supports from the proximal
gradient family. polo
not only allows users to configure, modify, and
extend these algorithms in different ways but also supports the creation of
completely new algorithms. We will describe this functionality in detail later
in Proximal Gradient Methods. After configuring the solver, we initialize
it with
a decision vector, \(x_{0} = {[1, 1, 1]}^{\top}\), and solve
the
optimization problem defined on loss
. Finally, we retrieve the decision
vector (getx) that the gradient descent algorithm produces, together with
the associated loss value (getf), and print the result.
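For reference, the iteration that a vanilla gradient descent solver with a constant step size performs is the textbook update (a standard formula, not quoted from polo's internals):

\[x_{k+1} = x_{k} - \gamma \nabla f(x_{k}), \qquad \gamma = 1,\]

where \(\gamma\) is the constant step size set via alg.step_parameters(1.0) and \(f\) is the smooth loss defined above.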