.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/example1_2DPeakFunction.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_example1_2DPeakFunction.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_example1_2DPeakFunction.py:


Surrogate Models
================

Some industrial applications require modeling complex processes that result
either in highly nonlinear functions or in functions defined by a simulation
process. In those contexts, optimization solvers often struggle. One reason may
be that the relaxations of the nonlinear functions are not strong enough for
the solver to prove an acceptable bound in a reasonable amount of time. Another
may be that the solver is not able to represent the functions at all.

An approach that has been proposed in the literature is to approximate the
problematic nonlinear functions via neural networks with ReLU activation and
use MIP technology to solve the constructed approximation (see e.g.
\ `Henao and Maravelias 2011 `__\ , `Schweidtmann et al. 2022 `__\ ). This use
of neural networks can be motivated by their universal approximation property
(see e.g. \ `Lu et al. 2017 `__\ ). Using ML models to replace complex
processes in this way is often referred to as building *surrogate models*.

In the following example, we approximate a nonlinear function with
``Scikit-learn``'s ``MLPRegressor`` and then solve, with Gurobi, an
optimization problem that uses the approximation of the nonlinear function.
The purpose of this example is solely illustrative and doesn't relate to any
particular application.

The function we approximate is the `2D peaks function `__.

The function is given as

.. math::

    \begin{aligned}
    f(x) = {} & 3 \cdot (1-x_1)^2 \cdot \exp(-x_1^2 - (x_2+1)^2) - {} \\
    & 10 \cdot \left(\frac{x_1}{5} - x_1^3 - x_2^5\right) \cdot \exp(-x_1^2 - x_2^2) - {} \\
    & \frac{1}{3} \cdot \exp(-(x_1+1)^2 - x_2^2).
    \end{aligned}

In this example, we want to find the minimum of :math:`f` over the interval
:math:`[-2, 2]^2`:

.. math::

    y = \min \{f(x) : x \in [-2,2]^2\}.

The `global minimum of this problem can be found numerically `__ to have value
:math:`-6.55113` at the point :math:`(0.2283, -1.6256)`.

Here, to find this minimum of :math:`f`, we approximate :math:`f(x)` by a
neural network function :math:`g(x)` to obtain a MIP and solve

.. math::

    \hat y = \min \{g(x) : x \in [-2,2]^2\} \approx y.

First import the necessary packages. Before applying the neural network, we do
a preprocessing step that extracts polynomial features of degree 2; this
should help us approximate the smooth function. Besides ``gurobipy``,
``numpy``, and the appropriate ``sklearn`` objects, we also use ``matplotlib``
to plot the function and its approximation.

.. GENERATED FROM PYTHON SOURCE LINES 66-79

.. code-block:: Python


    import gurobipy as gp
    import numpy as np
    from gurobipy import GRB
    from matplotlib import cm
    from matplotlib import pyplot as plt
    from sklearn import metrics
    from sklearn.neural_network import MLPRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    from gurobi_ml import add_predictor_constr
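To make the preprocessing step concrete, here is a small illustration (not
part of the original script) of what ``PolynomialFeatures`` with the default
degree 2 produces: a point :math:`(x_1, x_2)` is mapped to the feature vector
:math:`(1, x_1, x_2, x_1^2, x_1 x_2, x_2^2)`. The sample point is arbitrary.

.. code-block:: Python

    # Illustration only: degree-2 polynomial features of one arbitrary point.
    sample = np.array([[0.5, -1.0]])  # a single point (x1, x2)
    print(PolynomialFeatures(degree=2).fit_transform(sample))
    # -> [[ 1.    0.5  -1.    0.25 -0.5   1.  ]]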
.. GENERATED FROM PYTHON SOURCE LINES 80-85

Define the nonlinear function of interest
-----------------------------------------

We define the 2D peaks function as a python function.

.. GENERATED FROM PYTHON SOURCE LINES 85-95

.. code-block:: Python


    def peak2d(x1, x2):
        return (
            3 * (1 - x1) ** 2.0 * np.exp(-(x1**2) - (x2 + 1) ** 2)
            - 10 * (x1 / 5 - x1**3 - x2**5) * np.exp(-(x1**2) - x2**2)
            - 1 / 3 * np.exp(-((x1 + 1) ** 2) - x2**2)
        )
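As a quick sanity check (not part of the original script), evaluating
``peak2d`` at the minimizer quoted above reproduces the known optimal value:

.. code-block:: Python

    # Known global minimizer from the text above; the value printed
    # should be approximately -6.55113.
    print(peak2d(0.2283, -1.6256))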
.. GENERATED FROM PYTHON SOURCE LINES 96-102

To train the neural network, we make a uniform sample of the domain of the
function in the region of interest using ``numpy``\ 's ``arange`` function.

We then plot the function with ``matplotlib``.

.. GENERATED FROM PYTHON SOURCE LINES 102-114

.. code-block:: Python


    x1, x2 = np.meshgrid(np.arange(-2, 2, 0.01), np.arange(-2, 2, 0.01))
    y = peak2d(x1, x2)

    fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
    # Plot the surface.
    surf = ax.plot_surface(x1, x2, y, cmap=cm.coolwarm, linewidth=0.01, antialiased=False)

    # Add a color bar which maps values to colors.
    fig.colorbar(surf, shrink=0.5, aspect=5)

    plt.show()


.. image-sg:: /auto_examples/images/sphx_glr_example1_2DPeakFunction_001.png
   :alt: example1 2DPeakFunction
   :srcset: /auto_examples/images/sphx_glr_example1_2DPeakFunction_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 115-121

Approximate the function
------------------------

To fit a model, we need to reshape our data. We concatenate the values of
``x1`` and ``x2`` into an array ``X`` and make ``y`` one dimensional.

.. GENERATED FROM PYTHON SOURCE LINES 121-126

.. code-block:: Python


    X = np.concatenate([x1.ravel().reshape(-1, 1), x2.ravel().reshape(-1, 1)], axis=1)
    y = y.ravel()

.. GENERATED FROM PYTHON SOURCE LINES 127-131

To approximate the function, we use a ``Pipeline`` that combines polynomial
features with a neural-network regressor. We use a relatively small neural
network.

.. GENERATED FROM PYTHON SOURCE LINES 131-139

.. code-block:: Python


    # Run our regression
    layers = [30] * 2
    regression = MLPRegressor(hidden_layer_sizes=layers, activation="relu")
    pipe = make_pipeline(PolynomialFeatures(), regression)
    pipe.fit(X=X, y=y)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Pipeline(steps=[('polynomialfeatures', PolynomialFeatures()),
                    ('mlpregressor', MLPRegressor(hidden_layer_sizes=[30, 30]))])
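Because the hidden layers use ReLU activations and the output layer of
``MLPRegressor`` is linear, the fitted network is a piecewise-linear function
of the polynomial features; this is what makes it representable in a MIP. As a
small sketch (not part of the original script), we can recompute the
pipeline's prediction for one arbitrary point by hand from the fitted weights:

.. code-block:: Python

    # Sketch: manual forward pass through the fitted network for one point.
    point = np.array([[0.5, -1.0]])
    features = pipe.named_steps["polynomialfeatures"].transform(point)

    activation = features
    for W, b in zip(regression.coefs_[:-1], regression.intercepts_[:-1]):
        activation = np.maximum(activation @ W + b, 0)  # ReLU hidden layers
    manual = activation @ regression.coefs_[-1] + regression.intercepts_[-1]

    # Both values should agree (up to floating-point noise).
    print(manual.ravel(), pipe.predict(point))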
.. GENERATED FROM PYTHON SOURCE LINES 140-143

To test the accuracy of the approximation, we take a random sample of points,
and we print the :math:`R^2` value and the maximal error.

.. GENERATED FROM PYTHON SOURCE LINES 143-151

.. code-block:: Python


    X_test = np.random.random((100, 2)) * 4 - 2

    r2_score = metrics.r2_score(peak2d(X_test[:, 0], X_test[:, 1]), pipe.predict(X_test))
    max_error = metrics.max_error(peak2d(X_test[:, 0], X_test[:, 1]), pipe.predict(X_test))
    print("R2 error {}, maximal error {}".format(r2_score, max_error))

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    R2 error 0.9997993073819972, maximal error 0.08785106844209592

.. GENERATED FROM PYTHON SOURCE LINES 152-156

While the :math:`R^2` value is good, the maximal error is quite high. For the
purpose of this example we still deem it acceptable. We now plot the
approximation.

.. GENERATED FROM PYTHON SOURCE LINES 156-172

.. code-block:: Python


    fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
    # Plot the surface.
    surf = ax.plot_surface(
        x1,
        x2,
        pipe.predict(X).reshape(x1.shape),
        cmap=cm.coolwarm,
        linewidth=0.01,
        antialiased=False,
    )

    # Add a color bar which maps values to colors.
    fig.colorbar(surf, shrink=0.5, aspect=5)

    plt.show()


.. image-sg:: /auto_examples/images/sphx_glr_example1_2DPeakFunction_002.png
   :alt: example1 2DPeakFunction
   :srcset: /auto_examples/images/sphx_glr_example1_2DPeakFunction_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 173-185

Visually, the approximation looks close enough to the original function.

Build and Solve the Optimization Model
--------------------------------------

We now turn to the optimization model. For this model we want to find the
minimal value of ``y_approx``, the approximation given by our pipeline, over
the interval. Note that in this simple example, we don't use matrix variables
but regular Gurobi variables instead; a matrix-variable variant is sketched
after this section.

.. GENERATED FROM PYTHON SOURCE LINES 185-199

.. code-block:: Python


    m = gp.Model()

    x = m.addVars(2, lb=-2, ub=2, name="x")
    y_approx = m.addVar(lb=-GRB.INFINITY, name="y")
    m.setObjective(y_approx, gp.GRB.MINIMIZE)

    # add "surrogate constraint"
    pred_constr = add_predictor_constr(m, pipe, x, y_approx)

    pred_constr.print_stats()

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Restricted license - for non-production use only - expires 2026-11-23
    Warning for adding constraints: zero or small (< 1e-13) coefficients, ignored
    Model for pipe:
    126 variables
    61 constraints
    6 quadratic constraints
    60 general constraints
    Input has shape (1, 2)
    Output has shape (1, 1)

    Pipeline has 2 steps:

    --------------------------------------------------------------------------------
    Step            Output Shape    Variables            Constraints
                                               Linear    Quadratic    General
    ================================================================================
    poly_feat             (1, 6)            6       0            6          0
    dense                (1, 30)           60      30            0         30 (relu)
    dense0               (1, 30)           60      30            0         30 (relu)
    dense1                (1, 1)            0       1            0          0
    --------------------------------------------------------------------------------
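As mentioned above, the same model can also be written with Gurobi matrix
variables. A hypothetical equivalent sketch (not part of the original script),
using ``addMVar`` with the input/output shapes that ``add_predictor_constr``
reports above; the names ``m_alt``, ``x_alt``, ``y_alt`` are ours. We continue
with the ``Var``-based model ``m`` below.

.. code-block:: Python

    # Sketch: the same surrogate model built with matrix (MVar) variables.
    m_alt = gp.Model()

    x_alt = m_alt.addMVar((1, 2), lb=-2, ub=2, name="x")
    y_alt = m_alt.addMVar((1, 1), lb=-GRB.INFINITY, name="y")
    m_alt.setObjective(y_alt.sum(), gp.GRB.MINIMIZE)

    pred_constr_alt = add_predictor_constr(m_alt, pipe, x_alt, y_alt)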
.. GENERATED FROM PYTHON SOURCE LINES 200-204

Now call ``optimize``. Since we use polynomial features, the resulting model
is a non-convex quadratic problem. In Gurobi, we set the parameter
``NonConvex`` to 2 so that it can be solved.

.. GENERATED FROM PYTHON SOURCE LINES 204-212

.. code-block:: Python


    m.Params.TimeLimit = 20
    m.Params.MIPGap = 0.1
    m.Params.NonConvex = 2

    m.optimize()

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Set parameter TimeLimit to value 20
    Set parameter MIPGap to value 0.1
    Set parameter NonConvex to value 2
    Gurobi Optimizer version 12.0.1 build v12.0.1rc0 (linux64 - "Ubuntu 24.04 LTS")

    CPU model: AMD EPYC 9R14, instruction set [SSE2|AVX|AVX2|AVX512]
    Thread count: 2 physical cores, 2 logical processors, using up to 2 threads

    Non-default parameters:
    TimeLimit  20
    MIPGap  0.1
    NonConvex  2

    Optimize a model with 61 rows, 129 columns and 1099 nonzeros
    Model fingerprint: 0x17ecfd0f
    Model has 6 quadratic constraints
    Model has 60 simple general constraints
      60 MAX
    Variable types: 129 continuous, 0 integer (0 binary)
    Coefficient statistics:
      Matrix range     [6e-12, 1e+00]
      QMatrix range    [1e+00, 1e+00]
      QLMatrix range   [1e+00, 1e+00]
      Objective range  [1e+00, 1e+00]
      Bounds range     [2e+00, 2e+00]
      RHS range        [7e-04, 7e-01]
      QRHS range       [1e+00, 1e+00]
    Presolve added 81 rows and 18 columns
    Presolve time: 0.01s
    Presolved: 152 rows, 148 columns, 1275 nonzeros
    Presolved model has 3 bilinear constraint(s)
    Solving non-convex MIQCP

    Variable types: 106 continuous, 42 integer (42 binary)

    Root relaxation: objective -4.548671e+01, 185 iterations, 0.00 seconds (0.00 work units)

        Nodes    |    Current Node    |     Objective Bounds      |     Work
     Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

         0     0  -45.48671    0   35          -  -45.48671      -     -    0s
         0     0  -45.47859    0   35          -  -45.47859      -     -    0s
         0     0  -40.96383    0   34          -  -40.96383      -     -    0s
         0     0  -40.92120    0   31          -  -40.92120      -     -    0s
         0     0  -40.68851    0   31          -  -40.68851      -     -    0s
         0     0  -37.08948    0   39          -  -37.08948      -     -    0s
         0     0  -37.06369    0   38          -  -37.06369      -     -    0s
         0     0  -36.85489    0   38          -  -36.85489      -     -    0s
         0     0  -35.88050    0   37          -  -35.88050      -     -    0s
         0     0  -35.72421    0   38          -  -35.72421      -     -    0s
         0     0  -34.51554    0   39          -  -34.51554      -     -    0s
         0     0  -34.42047    0   38          -  -34.42047      -     -    0s
         0     0  -33.92112    0   40          -  -33.92112      -     -    0s
         0     0  -33.77023    0   40          -  -33.77023      -     -    0s
         0     0  -33.27463    0   40          -  -33.27463      -     -    0s
         0     0  -33.10233    0   39          -  -33.10233      -     -    0s
         0     0  -32.79373    0   37          -  -32.79373      -     -    0s
         0     0  -32.53702    0   39          -  -32.53702      -     -    0s
         0     0  -32.31568    0   39          -  -32.31568      -     -    0s
         0     0  -32.05790    0   40          -  -32.05790      -     -    0s
         0     0  -31.72212    0   41          -  -31.72212      -     -    0s
         0     0  -31.48236    0   41          -  -31.48236      -     -    0s
         0     0  -31.19512    0   40          -  -31.19512      -     -    0s
         0     0  -31.15436    0   40          -  -31.15436      -     -    0s
         0     0  -30.82069    0   41          -  -30.82069      -     -    0s
         0     0  -30.75165    0   41          -  -30.75165      -     -    0s
         0     0  -30.12098    0   42          -  -30.12098      -     -    0s
         0     0  -30.06734    0   41          -  -30.06734      -     -    0s
         0     0  -29.88560    0   40          -  -29.88560      -     -    0s
         0     0  -29.87054    0   39          -  -29.87054      -     -    0s
         0     0  -29.74081    0   39          -  -29.74081      -     -    0s
         0     0  -29.69860    0   41          -  -29.69860      -     -    0s
         0     0  -29.53893    0   40          -  -29.53893      -     -    0s
         0     0  -29.51088    0   40          -  -29.51088      -     -    0s
         0     0  -29.41302    0   40          -  -29.41302      -     -    0s
         0     0  -29.41292    0   40          -  -29.41292      -     -    0s
    H    0     0                       0.5807386  -29.41292  5165%     -    0s
         0     2  -29.41292    0   40    0.58074  -29.41292  5165%     -    0s
    H  158   133                      -4.7135397  -22.49306   377%  24.8    0s
    *  175   106              39      -6.3531056  -22.34422   252%  22.8    0s
    *  177   106              40      -6.4903985  -22.34422   244%  22.6    0s
    *  178   106              41      -6.5295060  -22.34422   242%  22.5    0s
    *  180   106              42      -6.5322696  -22.34422   242%  22.2    0s
    *  182   106              43      -6.5390385  -22.34422   242%  22.0    0s
    *  186   106              45      -6.5398994  -22.34422   242%  21.5    0s
    *  200   106              41      -6.5412492  -22.22684   240%  20.7    0s
    *  201   106              42      -6.5473878  -22.22684   239%  20.6    0s
    H  338   162                      -6.5625461  -19.84152   202%  20.8    0s
    H 1373   216                      -6.5668899  -10.08165  53.5%  19.3    0s
    * 1860     0              41      -6.5669714   -6.56707  0.00%  18.1    0s

    Cutting planes:
      Gomory: 6
      Implied bound: 36
      MIR: 144
      Flow cover: 36
      Relax-and-lift: 1

    Explored 1869 nodes (35299 simplex iterations) in 0.84 seconds (1.28 work units)
    Thread count was 2 (of 2 available processors)

    Solution count 10: -6.56697 -6.56689 -6.56255 ... -6.4904
    No other solutions better than -6.56697

    Optimal solution found (tolerance 1.00e-01)
    Best objective -6.566971434052e+00, best bound -6.566971434052e+00, gap 0.0000%
.. GENERATED FROM PYTHON SOURCE LINES 213-216

After solving the model, we check the error in the estimate of the Gurobi
solution.

.. GENERATED FROM PYTHON SOURCE LINES 216-224

.. code-block:: Python


    print(
        "Maximum error in approximating the regression {:.6}".format(
            np.max(pred_constr.get_error())
        )
    )

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Maximum error in approximating the regression 1.06986e-06

.. GENERATED FROM PYTHON SOURCE LINES 225-227

Finally, we look at the solution and the objective value found.

.. GENERATED FROM PYTHON SOURCE LINES 227-237

.. code-block:: Python


    print(
        f"solution point of the approximated problem ({x[0].X:.4}, {x[1].X:.4}), "
        + f"objective value {m.ObjVal}."
    )
    print(
        f"Function value at the solution point {peak2d(x[0].X, x[1].X)} error {abs(peak2d(x[0].X, x[1].X) - m.ObjVal)}."
    )

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    solution point of the approximated problem (0.2633, -1.592), objective value -6.566971434051509.
    Function value at the solution point -6.527357442069412 error 0.039613991982097474.

.. GENERATED FROM PYTHON SOURCE LINES 238-244

The difference between the function and the approximation at the computed
solution point is noticeable, but the point we found is reasonably close to
the actual global minimum. Depending on the use case this might be deemed
acceptable. Of course, training a larger network should result in a better
approximation.

.. GENERATED FROM PYTHON SOURCE LINES 247-249

Copyright © 2023 Gurobi Optimization, LLC

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 5.848 seconds)


.. _sphx_glr_download_auto_examples_example1_2DPeakFunction.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: example1_2DPeakFunction.ipynb <example1_2DPeakFunction.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: example1_2DPeakFunction.py <example1_2DPeakFunction.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: example1_2DPeakFunction.zip <example1_2DPeakFunction.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_