Loading Data
In Anatomy of a POLO Program, we have seen how to manually define problem data (cf. matrix \(A\) and vector \(b\) in Listing 1). This approach is appropriate for defining small data sets; however, it becomes impractical for the larger data sets that many machine-learning problems involve. To facilitate convenient experimentation on machine-learning problems, polo includes functionality [1] for reading data from common formats such as LIBSVM [2011-Chang].
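In the LIBSVM format, each line of the file holds a label followed by sparse index:value feature pairs (e.g., +1 3:0.5 7:-1, with illustrative values). As a quick sketch of the reader, assuming the same interface, file path, and sample/feature counts that appear in Listing 2 below:

/* a minimal sketch of the svm reader; the path and the sample/feature
   counts follow Listing 2 below */
#include <iostream>
#include <polo/polo.hpp>

int main() {
  /* read 690 samples with 14 features each from a LIBSVM-formatted file */
  auto data = polo::utility::reader<double, int>::svm(
      {"../data/australian_scale"}, 690, 14);
  /* query the dimensions of the loaded data */
  std::cout << data.nsamples() << " samples, " << data.nfeatures()
            << " features\n";
  return 0;
}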
Now, we briefly demonstrate how polo can be used to solve a logistic regression problem on data in the LIBSVM format. To this end, we revisit the example in Listing 1. Instead of working on the \(3\times3\) data matrix, we download the australian_scale file from the australian data set to the data folder under $HOME/examples. The data set contains 690 samples, each of which has 14 features and belongs to one of two classes: -1 or +1. Next, we change the data-related parts of Listing 1 to obtain Listing 2 (modifications are highlighted).
/* include system libraries */
#include <iostream>
#include <vector>

using namespace std;

/* include polo */
#include <polo/polo.hpp>

using namespace polo;

int main(int argc, char *argv[]) {
  /* define the problem data */
  auto data =
      utility::reader<double, int>::svm({"../data/australian_scale"}, 690, 14);

  /* define the smooth loss */
  loss::logistic<double, int> loss(data);

  /* estimate smoothness of the loss */
  double rowmax{0};
  for (int row = 0; row < data.nsamples(); row++) {
    double rowsquare{0};
    for (const auto val : data.matrix()->getrow(row))
      rowsquare += val * val;
    if (rowsquare > rowmax)
      rowmax = rowsquare;
  }
  const double L = 0.25 * data.nsamples() * rowmax;

  /* select and configure the desired solver */
  algorithm::gd<double, int> alg;
  alg.step_parameters(2 / L);

  /* provide an initial vector to the solver, and solve the problem */
  const vector<double> x0(data.nfeatures());
  alg.initialize(x0);
  alg.solve(loss);

  /* print the result */
  cout << "Optimum: " << alg.getf() << '\n';
  cout << "Optimizer: [";
  for (const auto val : alg.getx())
    cout << val << ',';
  cout << "].\n";

  return 0;
}
As can be observed, we first call the svm reader on the data file, providing the number of samples and features the file contains, and assign its output to the automatic variable data.
Then, we estimate the Lipschitz smoothness of the logistic loss defined on the data. This is because we are still using the vanilla gradient descent algorithm with a constant step size, and it is known that the iterates of this algorithm converge to the optimum (for convex functions) if the step size satisfies \(\gamma < 2/L\), where \(L\) is the smoothness parameter [2004-Nesterov]. Because the logistic loss on \(N\) samples \((a_{n}, b_{n})\) is defined as

\[\operatorname{f}(x) = \sum_{n=1}^{N} \operatorname{f_{n}}(x)\,, \qquad \operatorname{f_{n}}(x) = \log\left(1 + \exp\left(-b_{n} \left\langle a_{n}, x \right\rangle\right)\right)\,,\]

and each \(\operatorname{f_{n}}(x)\) is \(L_{n}\)-smooth with \(L_{n} = {\Vert a_{n} \Vert}_{2}^{2} / 4\) [2014-Xiao], a computationally cheap (albeit conservative) estimate for \(L\) is

\[L = N \max_{n} L_{n} = 0.25\, N \max_{n} {\Vert a_{n} \Vert}_{2}^{2} \geq \sum_{n=1}^{N} L_{n}\,,\]

which is exactly what the double loop in Listing 2 computes.
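Since \(\sum_{n} L_{n}\) itself already upper-bounds the smoothness of the sum, one can accumulate the squared row norms instead of taking \(N\) times their maximum, obtaining a tighter estimate at the same cost. Below is a minimal sketch of this variant, meant as a drop-in replacement for the estimation loop in Listing 2 (it assumes the same data interface):

/* a tighter but equally cheap smoothness estimate: accumulate the squared
   row norms instead of bounding their sum by N times the maximum */
double sumsquares{0};
for (int row = 0; row < data.nsamples(); row++) {
  double rowsquare{0};
  for (const auto val : data.matrix()->getrow(row))
    rowsquare += val * val;
  sumsquares += rowsquare;
}
const double L = 0.25 * sumsquares;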
Finally, we set the step size of the algorithm, initialize it with a zero-vector of appropriate dimension, and run the algorithm. To compile Listing 2, we add the following lines to our previous CMakeLists.txt:
add_executable(svmdata svmdata.cpp)
target_link_libraries(svmdata polo::polo)
Building the executable using CMake and running the resulting program give:
Optimum: 229.222
Optimizer: [0.0110083,0.162899,0.0832372,0.627515,0.968077,0.328978,0.257715,1.69923,0.556535,0.157199,-0.143509,0.328954,-0.358702,0.179352,].
- [2004-Nesterov] Yurii E. Nesterov. Introductory Lectures on Convex Optimization. Springer US, 2004. DOI: 10.1007/978-1-4419-8853-9.
- [2011-Chang] Chih-Chung Chang and Chih-Jen Lin. “LIBSVM: A Library for Support Vector Machines.” ACM Transactions on Intelligent Systems and Technology 2.3 (Apr. 2011), pp. 1–27. DOI: 10.1145/1961189.1961199.
- [2014-Xiao] Lin Xiao and Tong Zhang. “A Proximal Stochastic Gradient Method with Progressive Variance Reduction.” SIAM Journal on Optimization 24.4 (Jan. 2014), pp. 2057–2075. DOI: 10.1137/140961791.
Footnotes
[1] We will cover this functionality in more detail in Data Handling.