On this page, we provide a simple example of using the Gurobi Machine
Learning package.
The example is entirely abstract. Its aim is only to illustrate the
basic functionality of the package in the simplest way. For
more realistic applications, please refer to the notebooks in the
examples section.
Before proceeding to the example itself, we need to import a number of
packages. Here, we use Scikit-learn to train regression models. We
generate random data for the regression using the make_regression
function. For the regression model, we use a multi-layer perceptron
regressor neural network. We import the corresponding objects.
We also need gurobipy to build an optimization model, the
add_predictor_constr function from the gurobi_ml package, and numpy.
We start by building artificial data to train our regressions. To do so,
we use make_regression to obtain data with 10 features.
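As a minimal sketch of this step, the snippet below generates the data and trains the regressor. The noise level, the two hidden layers of 20 neurons, the iteration limit, and the random seeds are assumptions chosen for illustration; only the 10 features come from the text.

```python
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

# Synthetic regression data with 10 features (make_regression draws
# 100 samples by default). Noise and seed are illustrative assumptions.
X, y = make_regression(n_features=10, noise=1.0, random_state=0)

# A small multi-layer perceptron. The layer sizes and iteration limit
# are assumed values, not prescribed by the text.
nn = MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=10000, random_state=1)
nn.fit(X, y)

print(X.shape, y.shape)
```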
We now turn to the optimization model. In the spirit of adversarial
machine learning examples, we start from some of the training examples:
we pick \(n\) of them at random. For each example, we want to find an
input in a small neighborhood of it for which the output predicted by
the regression is as close to \(0\) as possible.
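Denoting by \(X^E\) our set of examples and by \(g\) the
prediction function of our regression model, our optimization problem
reads:
\[
\begin{aligned}
&\min \sum_{i=1}^n y_i^2 \\
&\text{s.t.} \quad y_i = g(X_i), \quad i = 1, \ldots, n, \\
&\phantom{\text{s.t.} \quad} X^E - \delta \leq X \leq X^E + \delta,
\end{aligned}
\]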
where \(X\) is a matrix of variables of dimension
\(n \times 10\) (the number of examples we consider and number of
features in the regression respectively), \(y\) is a vector of free
(unbounded) variables and \(\delta\) a small positive constant.
First, let’s randomly pick 2 training examples using numpy, and create
our Gurobi model.
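The sampling step might look as follows. To keep the snippet self-contained, `X` and `y` are random stand-ins for the training data (an assumption, not the data used on this page), and the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the training data: 100 samples with 10 features.
X = rng.normal(size=(100, 10))
y = rng.normal(size=100)

# Pick n = 2 distinct training examples at random.
n = 2
index = rng.choice(X.shape[0], size=n, replace=False)
X_examples = X[index, :]
y_examples = y[index]

print(X_examples.shape, y_examples.shape)
```

Sampling without replacement guarantees that the two examples are distinct. The Gurobi model itself is then created with gurobipy's `Model()` constructor.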
Our only decision variables in this case are the inputs and
outputs of the regression. We use MVar matrix variables,
which are the most convenient in this case.
The input variables have the same shape as X_examples. Their lower
bound is X_examples - delta and their upper bound is
X_examples + delta.
The output variables have the shape of y_examples and are unbounded.
By default, variables in Gurobi are non-negative, so we need to
set their lower bound to negative infinity.
Gurobi Optimizer version 13.0.2 build v13.0.2rc1 (linux64 - "Ubuntu 24.04 LTS")
CPU model: AMD EPYC 7R13 Processor, instruction set [SSE2|AVX|AVX2]
Thread count: 1 physical cores, 2 logical processors, using up to 2 threads
Optimize a model with 82 rows, 182 columns and 1322 nonzeros (Min)
Model fingerprint: 0xdd40d80a
Model has 0 linear objective coefficients
Model has 2 quadratic objective terms
Model has 80 simple general constraints
80 MAX
Variable types: 182 continuous, 0 integer (0 binary)
Coefficient statistics:
Matrix range [8e-04, 2e+00]
Objective range [0e+00, 0e+00]
QObjective range [2e+00, 2e+00]
Bounds range [2e-03, 3e+00]
RHS range [4e-03, 1e+00]
Presolve removed 12 rows and 102 columns
Presolve time: 0.01s
Presolved: 70 rows, 80 columns, 502 nonzeros
Presolved model has 1 quadratic objective terms
Variable types: 59 continuous, 21 integer (21 binary)
Root relaxation: objective 1.289870e+05, 93 iterations, 0.00 seconds (0.00 work units)
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 128986.971 0 13 - 128986.971 - - 0s
H 0 0 128986.97117 128986.971 0.00% - 0s
Explored 1 nodes (93 simplex iterations) in 0.01 seconds (0.01 work units)
Thread count was 2 (of 2 available processors)
Solution count 1: 128987
Optimal solution found (tolerance 1.00e-04)
Best objective 1.289869711748e+05, best bound 1.289869711748e+05, gap 0.0000%
The method get_error
is useful to check that the solution computed by Gurobi is correct with
respect to the regression model we use.
Let \((\bar X, \bar y)\) be the values of the input and output
variables in the computed solution. The method returns
\(g(\bar X) - \bar y\), computed using the original regression object.
Normally, all values should be small and below Gurobi’s tolerances in this example.