.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/example2_student_admission.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_example2_student_admission.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_example2_student_admission.py:


Student Enrollment
==================

In this example, we show how to reproduce the model of student
enrollment from Bergman et al. (2020) with Gurobi Machine Learning.

This model was developed as part of
`Janos <https://github.com/INFORMSJoC/2020.1023>`__, a toolkit similar
to Gurobi Machine Learning that integrates ML models and mathematical
optimization.

In particular, this example illustrates how to use logistic
regression.

We also show how to deal with fixed features in the optimization model
using pandas data frames.

In this model, data on student admissions at a college is used to
predict the probability that a student enrolls at the college.

The data has three features: the SAT and GPA scores of each student, and
the scholarship (or merit) that was offered to each student. Finally, it
is known whether each student decided to join the college.

Based on this data, a logistic regression is trained to predict the
probability that a student joins the college.

Using this regression model, Bergman et al. (2020) propose the
following student enrollment problem. The Admission Office has the SAT
and GPA scores of the students admitted to the incoming class, and it
wants to offer scholarships so as to maximize the expected number of
students who enroll in the college. A total of :math:`n` students are
admitted. The maximal budget for the sum of all scholarships offered is
:math:`0.2 n \, \text{K\$}` and each student can be offered a
scholarship of at most :math:`2.5 \, \text{K\$}`.

This problem can be expressed as a mathematical optimization problem as
follows. Two vectors of decision variables :math:`x` and :math:`y` of
dimension :math:`n` are used to model respectively the scholarship
offered to each student in :math:`\text{K\$}` and the probability that
they join. Denoting by :math:`g` the prediction function for the
probability of the logistic regression we then have for each student
:math:`i`:

.. math::  y_i = g(x_i, SAT_i, GPA_i),

with :math:`SAT_i` and :math:`GPA_i` the (known) SAT and GPA score of
each student.
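
As a rough illustration of what :math:`g` computes, the sketch below
evaluates a logistic function of a linear score in plain Python. The
coefficients are hypothetical and chosen only for illustration; they are
not the values fitted in this example, and feature scaling is left out
for brevity.

.. code-block:: Python

    import math

    # Hypothetical coefficients, for illustration only (NOT the fitted values).
    INTERCEPT = 6.0
    COEF_MERIT = 1.0
    COEF_SAT = -0.004
    COEF_GPA = -0.5


    def g(merit, sat, gpa):
        """Probability of enrollment: logistic function of a linear score."""
        score = INTERCEPT + COEF_MERIT * merit + COEF_SAT * sat + COEF_GPA * gpa
        return 1.0 / (1.0 + math.exp(-score))


    # Same student, without a scholarship and with the maximal one (2.5 K$).
    p_without = g(0.0, 1300.0, 3.3)
    p_with = g(2.5, 1300.0, 3.3)

With a positive coefficient on the scholarship, as assumed here, the
predicted probability increases with the amount offered, which is what
makes the allocation problem non-trivial.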

The objective is to maximize the sum of the :math:`y` variables, and the
budget constraint imposes that the sum of the variables :math:`x` is
less than or equal to :math:`0.2n`. Also, each variable :math:`x_i` is
between 0 and 2.5.

The full model then reads:

.. math::

    \begin{aligned} &\max \sum_{i=1}^n y_i \\
   &\text{subject to:}\\
   &\sum_{i=1}^n x_i \le 0.2 n,\\
   &y_i = g(x_i, SAT_i, GPA_i) & & i = 1, \ldots, n,\\
   & 0 \le x \le 2.5. \end{aligned}
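
The constraints on :math:`x` in the model above can be checked for a
candidate allocation with a few lines of plain Python. This is only a
sketch: the function name ``is_feasible`` and the sample allocations are
hypothetical and not part of the example.

.. code-block:: Python

    def is_feasible(x, budget_per_student=0.2, max_offer=2.5, tol=1e-9):
        """Check the bound and budget constraints for a list of offers (in K$)."""
        n = len(x)
        within_bounds = all(-tol <= xi <= max_offer + tol for xi in x)
        within_budget = sum(x) <= budget_per_student * n + tol
        return within_bounds and within_budget


    # Hypothetical allocations for n = 10 students (total budget 0.2 * 10 = 2 K$).
    two_offers = [1.0, 1.0] + [0.0] * 8      # uses the whole budget
    one_max_offer = [2.5] + [0.0] * 9        # a single maximal offer
    too_large_offer = [3.0] + [0.0] * 9      # violates the upper bound

    feasibility = [is_feasible(a) for a in (two_offers, one_max_offer, too_large_offer)]

Note that with only 10 students a single maximal scholarship already
exceeds the total budget, so the budget constraint is the restrictive one
in small instances.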

Note that, unlike Bergman et al. (2020), we scale the features for the
regression in this example. Also, to fit within Gurobi’s size-limited
license, we only consider the problem where :math:`n=25`.

We also note that the model may differ from the objectives of Admission
Offices, and we do not encourage its use in real life. The example is
for illustration purposes only.

Importing packages and retrieving the data
------------------------------------------

We import the necessary packages. Besides the usual ones (``numpy``,
``gurobipy``, ``pandas``), we use ``gurobipy_pandas`` and Scikit-learn’s
``make_pipeline``, ``StandardScaler`` and ``LogisticRegression``.

.. GENERATED FROM PYTHON SOURCE LINES 83-98

.. code-block:: Python


    import gurobipy as gp
    import gurobipy_pandas as gppd
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    from gurobi_ml import add_predictor_constr


.. GENERATED FROM PYTHON SOURCE LINES 99-106

We now retrieve the historical data used to build the regression from
the Janos repository.

The features we use for the regression are ``"merit"`` (scholarship),
``"SAT"`` and ``"GPA"`` and the target is ``"enroll"``. We store those
values.


.. GENERATED FROM PYTHON SOURCE LINES 106-119

.. code-block:: Python


    # Base URL for retrieving data
    janos_data_url = "https://raw.githubusercontent.com/INFORMSJoC/2020.1023/master/data/"
    historical_data = pd.read_csv(
        janos_data_url + "college_student_enroll-s1-1.csv", index_col=0
    )

    # Features used by the regression: "merit" will be a decision variable of the
    # optimization problem, while "SAT" and "GPA" are fixed for each student
    features = ["merit", "SAT", "GPA"]
    target = "enroll"


.. GENERATED FROM PYTHON SOURCE LINES 120-127

Fit the logistic regression
---------------------------

For the regression, we use a pipeline with a standard scaler and a
logistic regression. We build it using ``make_pipeline`` from
``scikit-learn``.


.. GENERATED FROM PYTHON SOURCE LINES 127-135

.. code-block:: Python


    # Run our regression
    scaler = StandardScaler()
    regression = LogisticRegression(random_state=1)
    pipe = make_pipeline(scaler, regression)
    pipe.fit(X=historical_data.loc[:, features], y=historical_data.loc[:, target])


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Pipeline(steps=[('standardscaler', StandardScaler()),
                    ('logisticregression', LogisticRegression(random_state=1))])

.. GENERATED FROM PYTHON SOURCE LINES 136-144
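
To make the two steps of the pipeline above concrete, here is a minimal
sketch in plain Python of what a standard scaler followed by a logistic
function does to a single feature. The SAT scores and the coefficient
``-0.8`` are hypothetical; the scaling uses the population standard
deviation, which is what ``StandardScaler`` uses by default.

.. code-block:: Python

    import math
    import statistics


    def standardize(column):
        """Scale values to zero mean and unit (population) standard deviation."""
        mean = statistics.fmean(column)
        std = statistics.pstdev(column)
        return [(v - mean) / std for v in column]


    def logistic(z):
        """Logistic (sigmoid) function."""
        return 1.0 / (1.0 + math.exp(-z))


    # Hypothetical SAT scores and a hypothetical coefficient, for illustration only.
    sat = [1030.0, 1148.0, 1303.0, 1512.0]
    scaled_sat = standardize(sat)
    probabilities = [logistic(-0.8 * z) for z in scaled_sat]

Scaling puts features with very different ranges, such as SAT scores in
the thousands and scholarships of a few K$, on a comparable scale before
the regression is fitted.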

Optimization Model
------------------

We now turn to building the mathematical optimization model for Gurobi.

First, we retrieve the data for the new students. We won’t use all of
it: we randomly pick 25 students.


.. GENERATED FROM PYTHON SOURCE LINES 144-154

.. code-block:: Python


    # Retrieve new data used to build the optimization problem
    studentsdata = pd.read_csv(janos_data_url + "college_applications6000.csv", index_col=0)

    nstudents = 25

    # Select randomly nstudents in the data
    studentsdata = studentsdata.sample(nstudents, random_state=1)


.. GENERATED FROM PYTHON SOURCE LINES 155-161

We can now create our model.

Since our data is in pandas data frames, we use the package
gurobipy-pandas to help create the variables directly using the index of
the data frame.


.. GENERATED FROM PYTHON SOURCE LINES 161-187

.. code-block:: Python


    # Start with classical part of the model
    m = gp.Model()

    # The y variables model the probability of enrollment of each student.
    # They are indexed by the index of studentsdata.
    y = gppd.add_vars(m, studentsdata, name="enroll_probability")


    # We want to complete studentsdata with a column of decision variables to model
    # the "merit" feature. Those variables are between 0 and 2.5.
    # They are added using the gppd extension and the resulting dataframe is stored in
    # students_opt_data.
    students_opt_data = studentsdata.gppd.add_vars(m, lb=0.0, ub=2.5, name="merit")

    # We denote by x the (variable) "merit" feature
    x = students_opt_data.loc[:, "merit"]

    # Make sure that students_opt_data contains only the feature columns, in the right order
    students_opt_data = students_opt_data.loc[:, features]

    m.update()

    # Let's look at our features dataframe for the optimization
    students_opt_data[:10]


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>merit</th>
          <th>SAT</th>
          <th>GPA</th>
        </tr>
        <tr>
          <th>StudentID</th>
          <th></th>
          <th></th>
          <th></th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>1484</th>
          <td>&lt;gurobi.Var merit[1484]&gt;</td>
          <td>1512</td>
          <td>3.61</td>
        </tr>
        <tr>
          <th>2186</th>
          <td>&lt;gurobi.Var merit[2186]&gt;</td>
          <td>1148</td>
          <td>3.06</td>
        </tr>
        <tr>
          <th>2521</th>
          <td>&lt;gurobi.Var merit[2521]&gt;</td>
          <td>1090</td>
          <td>2.76</td>
        </tr>
        <tr>
          <th>3722</th>
          <td>&lt;gurobi.Var merit[3722]&gt;</td>
          <td>1044</td>
          <td>2.55</td>
        </tr>
        <tr>
          <th>3728</th>
          <td>&lt;gurobi.Var merit[3728]&gt;</td>
          <td>1424</td>
          <td>3.64</td>
        </tr>
        <tr>
          <th>4525</th>
          <td>&lt;gurobi.Var merit[4525]&gt;</td>
          <td>1040</td>
          <td>2.44</td>
        </tr>
        <tr>
          <th>235</th>
          <td>&lt;gurobi.Var merit[235]&gt;</td>
          <td>1030</td>
          <td>2.61</td>
        </tr>
        <tr>
          <th>4736</th>
          <td>&lt;gurobi.Var merit[4736]&gt;</td>
          <td>1399</td>
          <td>3.42</td>
        </tr>
        <tr>
          <th>5840</th>
          <td>&lt;gurobi.Var merit[5840]&gt;</td>
          <td>1090</td>
          <td>2.54</td>
        </tr>
        <tr>
          <th>2940</th>
          <td>&lt;gurobi.Var merit[2940]&gt;</td>
          <td>1417</td>
          <td>3.69</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 188-190

We add the objective and the budget constraint:


.. GENERATED FROM PYTHON SOURCE LINES 190-197

.. code-block:: Python


    m.setObjective(y.sum(), gp.GRB.MAXIMIZE)

    m.addConstr(x.sum() <= 0.2 * nstudents)
    m.update()


.. GENERATED FROM PYTHON SOURCE LINES 198-208

Finally, we insert the constraints from the regression. In this model we
want to use the probability estimate of a student joining the college,
so we set the parameter ``output_type`` to ``"probability_1"``. Note
that due to the shapes of the ``students_opt_data`` data frame and
``y``, this will insert one regression constraint for each student.

With the ``print_stats`` function we display what was added to the
model.


.. GENERATED FROM PYTHON SOURCE LINES 208-216

.. code-block:: Python


    pred_constr = add_predictor_constr(
        m, pipe, students_opt_data, y, output_type="probability_1"
    )

    pred_constr.print_stats()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Model for pipe:
    150 variables
    100 constraints
    25 general constraints
    Input has shape (25, 3)
    Output has shape (25, 1)

    Pipeline has 2 steps:

    --------------------------------------------------------------------------------
    Step            Output Shape    Variables              Constraints              
                                                    Linear    Quadratic      General
    ================================================================================
    std_scaler           (25, 3)          125           75            0            0

    log_reg              (25, 1)           25           25            0           25

    --------------------------------------------------------------------------------


.. GENERATED FROM PYTHON SOURCE LINES 217-227

We can now optimize the problem. With Gurobi ≥ 11.0, Gurobi Machine
Learning automatically sets the attribute ``FuncNonLinear`` to 1 on the
nonlinear constraints it adds, in order to handle the logistic function
algorithmically.

Older versions of Gurobi would make a piecewise-linear approximation of
the logistic function. You can refer to `older versions of this
documentation <https://gurobi-machinelearning.readthedocs.io/en/v1.3.0/mlm-examples/student_admission.html>`__
for dealing with those approximations.


.. GENERATED FROM PYTHON SOURCE LINES 227-231

.. code-block:: Python


    m.optimize()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Gurobi Optimizer version 11.0.3 build v11.0.3rc0 (linux64 - "Ubuntu 20.04.6 LTS")

    CPU model: Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz, instruction set [SSE2|AVX|AVX2|AVX512]
    Thread count: 1 physical cores, 2 logical processors, using up to 2 threads

    Optimize a model with 101 rows, 200 columns and 275 nonzeros
    Model fingerprint: 0x95d199bd
    Model has 25 general constraints
    Variable types: 200 continuous, 0 integer (0 binary)
    Coefficient statistics:
      Matrix range     [4e-01, 2e+02]
      Objective range  [1e+00, 1e+00]
      Bounds range     [2e+00, 2e+03]
      RHS range        [7e-01, 1e+03]
    Presolve removed 100 rows and 162 columns
    Presolve time: 0.00s
    Presolved: 132 rows, 39 columns, 291 nonzeros
    Presolved model has 19 nonlinear constraint(s)

    Solving non-convex MINLP

    Variable types: 39 continuous, 0 integer (0 binary)
    Found heuristic solution: objective 13.7903088

    Root relaxation: objective 1.385440e+01, 12 iterations, 0.00 seconds (0.00 work units)

        Nodes    |    Current Node    |     Objective Bounds      |     Work
     Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

         0     0   13.85440    0    7   13.79031   13.85440  0.46%     -    0s
         0     0   13.81174    0    7   13.79031   13.81174  0.16%     -    0s
         0     0   13.80064    0    7   13.79031   13.80064  0.07%     -    0s
         0     0   13.79856    0    7   13.79031   13.79856  0.06%     -    0s
         0     0   13.79785    0    7   13.79031   13.79785  0.05%     -    0s
         0     2   13.79785    0    7   13.79031   13.79785  0.05%     -    0s

    Explored 89 nodes (508 simplex iterations) in 0.04 seconds (0.01 work units)
    Thread count was 2 (of 2 available processors)

    Solution count 1: 13.7903 

    Optimal solution found (tolerance 1.00e-04)
    Best objective 1.379030883056e+01, best bound 1.379166027171e+01, gap 0.0098%


.. GENERATED FROM PYTHON SOURCE LINES 232-236
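
For intuition about the piecewise-linear approximation mentioned above
for older Gurobi versions, the sketch below interpolates the logistic
function linearly between equally spaced breakpoints and measures the
worst-case error on a grid. It only illustrates the general idea: the
interval ``[-8, 8]``, the number of pieces and the helper names are
assumptions, and this is not the algorithm Gurobi uses.

.. code-block:: Python

    import math


    def logistic(z):
        """Logistic (sigmoid) function."""
        return 1.0 / (1.0 + math.exp(-z))


    def make_pwl(lo=-8.0, hi=8.0, pieces=16):
        """Breakpoints of a piecewise-linear interpolant of the logistic function."""
        xs = [lo + (hi - lo) * k / pieces for k in range(pieces + 1)]
        ys = [logistic(x) for x in xs]
        return xs, ys


    def eval_pwl(xs, ys, z):
        """Evaluate the interpolant at z, clamping outside [xs[0], xs[-1]]."""
        if z <= xs[0]:
            return ys[0]
        if z >= xs[-1]:
            return ys[-1]
        for k in range(len(xs) - 1):
            if xs[k] <= z <= xs[k + 1]:
                t = (z - xs[k]) / (xs[k + 1] - xs[k])
                return ys[k] + t * (ys[k + 1] - ys[k])


    def max_pwl_error(pieces, samples=2001):
        """Largest gap between interpolant and logistic function on a grid."""
        xs, ys = make_pwl(pieces=pieces)
        grid = [-8.0 + 16.0 * k / (samples - 1) for k in range(samples)]
        return max(abs(eval_pwl(xs, ys, z) - logistic(z)) for z in grid)


    coarse_error = max_pwl_error(8)
    fine_error = max_pwl_error(64)

More pieces reduce the approximation error but make the optimization
model larger, which is the trade-off such approximations have to manage.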

We print the error using
:func:`get_error<gurobi_ml.modeling.base_predictor_constr.AbstractPredictorConstr.get_error>`
(note that we take the maximal error over all input vectors).


.. GENERATED FROM PYTHON SOURCE LINES 236-244

.. code-block:: Python


    print(
        "Maximum error in approximating the regression {:.6}".format(
            np.max(pred_constr.get_error())
        )
    )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Maximum error in approximating the regression 3.74643e-09


.. GENERATED FROM PYTHON SOURCE LINES 245-248
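
The maximum error printed above can be read as the largest absolute gap,
over all students, between the output values in the solution and the
regression re-evaluated on the solution's inputs. The sketch below shows
that computation in plain Python, assuming this is what ``get_error``
compares; all numbers are hypothetical.

.. code-block:: Python

    # Hypothetical numbers, for illustration only.
    # y values taken from the solver's solution:
    solution_outputs = [0.412000002, 0.872999999, 0.150000001]
    # regression re-evaluated on the solution's input values:
    recomputed_predictions = [0.412, 0.873, 0.150]

    errors = [abs(s - p) for s, p in zip(solution_outputs, recomputed_predictions)]
    max_error = max(errors)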

Finally, note that we can directly get the input values for the
regression in a solution as a pandas data frame using ``input_values``.


.. GENERATED FROM PYTHON SOURCE LINES 248-252

.. code-block:: Python


    pred_constr.input_values


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>merit</th>
          <th>SAT</th>
          <th>GPA</th>
        </tr>
        <tr>
          <th>StudentID</th>
          <th></th>
          <th></th>
          <th></th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>1484</th>
          <td>3.597305e-07</td>
          <td>1512.0</td>
          <td>3.61</td>
        </tr>
        <tr>
          <th>2186</th>
          <td>7.387056e-07</td>
          <td>1148.0</td>
          <td>3.06</td>
        </tr>
        <tr>
          <th>2521</th>
          <td>0.000000e+00</td>
          <td>1090.0</td>
          <td>2.76</td>
        </tr>
        <tr>
          <th>3722</th>
          <td>0.000000e+00</td>
          <td>1044.0</td>
          <td>2.55</td>
        </tr>
        <tr>
          <th>3728</th>
          <td>3.597305e-07</td>
          <td>1424.0</td>
          <td>3.64</td>
        </tr>
        <tr>
          <th>4525</th>
          <td>0.000000e+00</td>
          <td>1040.0</td>
          <td>2.44</td>
        </tr>
        <tr>
          <th>235</th>
          <td>0.000000e+00</td>
          <td>1030.0</td>
          <td>2.61</td>
        </tr>
        <tr>
          <th>4736</th>
          <td>3.597469e-07</td>
          <td>1399.0</td>
          <td>3.42</td>
        </tr>
        <tr>
          <th>5840</th>
          <td>0.000000e+00</td>
          <td>1090.0</td>
          <td>2.54</td>
        </tr>
        <tr>
          <th>2940</th>
          <td>3.597305e-07</td>
          <td>1417.0</td>
          <td>3.69</td>
        </tr>
        <tr>
          <th>3054</th>
          <td>1.100257e+00</td>
          <td>1303.0</td>
          <td>3.26</td>
        </tr>
        <tr>
          <th>868</th>
          <td>0.000000e+00</td>
          <td>1062.0</td>
          <td>2.72</td>
        </tr>
        <tr>
          <th>277</th>
          <td>8.959954e-01</td>
          <td>1305.0</td>
          <td>3.14</td>
        </tr>
        <tr>
          <th>5799</th>
          <td>7.388164e-07</td>
          <td>1187.0</td>
          <td>2.92</td>
        </tr>
        <tr>
          <th>3513</th>
          <td>3.605119e-07</td>
          <td>1383.0</td>
          <td>3.28</td>
        </tr>
        <tr>
          <th>5790</th>
          <td>3.597305e-07</td>
          <td>1434.0</td>
          <td>3.64</td>
        </tr>
        <tr>
          <th>3199</th>
          <td>3.597306e-07</td>
          <td>1429.0</td>
          <td>3.54</td>
        </tr>
        <tr>
          <th>5909</th>
          <td>1.237991e+00</td>
          <td>1288.0</td>
          <td>3.39</td>
        </tr>
        <tr>
          <th>5719</th>
          <td>3.597305e-07</td>
          <td>1488.0</td>
          <td>3.87</td>
        </tr>
        <tr>
          <th>2688</th>
          <td>3.597305e-07</td>
          <td>1480.0</td>
          <td>3.80</td>
        </tr>
        <tr>
          <th>251</th>
          <td>3.597305e-07</td>
          <td>1512.0</td>
          <td>3.59</td>
        </tr>
        <tr>
          <th>5462</th>
          <td>1.123099e-01</td>
          <td>1218.0</td>
          <td>3.02</td>
        </tr>
        <tr>
          <th>3053</th>
          <td>3.597317e-07</td>
          <td>1428.0</td>
          <td>3.42</td>
        </tr>
        <tr>
          <th>2712</th>
          <td>6.878658e-01</td>
          <td>1248.0</td>
          <td>3.23</td>
        </tr>
        <tr>
          <th>3772</th>
          <td>9.655745e-01</td>
          <td>1299.0</td>
          <td>3.20</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 253-255
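
As a quick sanity check on the table above, the sketch below sums the
scholarship offers that are not numerically zero and compares the total
with the budget :math:`0.2 n` for :math:`n = 25`. The values are copied
by hand from the table; treating the entries of order ``1e-07`` as zero
is an assumption based on typical solver tolerances.

.. code-block:: Python

    # Offers (in K$) copied from the table above, keyed by StudentID.
    # Entries of order 1e-07 are omitted (assumed to be zero up to tolerance).
    offers = {
        3054: 1.100257,
        277: 0.8959954,
        5909: 1.237991,
        5462: 0.1123099,
        2712: 0.6878658,
        3772: 0.9655745,
    }

    nstudents = 25
    budget = 0.2 * nstudents

    total_offered = sum(offers.values())
    largest_offer = max(offers.values())

Under this reading, six students receive a scholarship, the offers add
up to the full budget of about 5 K$, and no offer reaches the 2.5 K$ cap.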

Copyright © 2023 Gurobi Optimization, LLC


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.178 seconds)


.. _sphx_glr_download_auto_examples_example2_student_admission.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: example2_student_admission.ipynb <example2_student_admission.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: example2_student_admission.py <example2_student_admission.py>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_