Visualizations with Display Objects

In this example, we construct the display objects ConfusionMatrixDisplay, RocCurveDisplay, and PrecisionRecallDisplay directly from their respective metrics. This is an alternative to the corresponding plot functions, useful when a model’s predictions are already computed or are expensive to compute. Note that this is advanced usage; in general we recommend using the plot functions.

print(__doc__)

Load data and train model

For this example, we load a blood transfusion service center dataset from OpenML (https://www.openml.org/d/1464). This is a binary classification problem where the target is whether an individual donated blood. The data is then split into train and test datasets, and a logistic regression is fitted with the train dataset.

from sklearn.datasets import fetch_openml
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = fetch_openml(data_id=1464, return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y)

clf = make_pipeline(StandardScaler(), LogisticRegression(random_state=0))
clf.fit(X_train, y_train)

Create ConfusionMatrixDisplay

With the fitted model, we compute the model's predictions on the test dataset. These predictions are used to compute the confusion matrix, which is plotted with ConfusionMatrixDisplay.

from sklearn.metrics import confusion_matrix
from sklearn.metrics import ConfusionMatrixDisplay

y_pred = clf.predict(X_test)
cm = confusion_matrix(y_test, y_pred)

cm_display = ConfusionMatrixDisplay(cm).plot()
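
The display can also carry class labels for its axes. The following is a minimal sketch, not the example's own code: it assumes a synthetic dataset from make_classification as a stand-in for the OpenML data, a fixed random_state, and a non-interactive matplotlib backend. The labels argument fixes the row and column order so that it matches display_labels.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line to view plots interactively

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the OpenML dataset (an assumption, not the real data).
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

clf = LogisticRegression(random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are true classes, columns are predicted classes, both in the
# order given by `labels`.
cm = confusion_matrix(y_test, y_pred, labels=clf.classes_)

# Pass the same class order to the display so the tick labels line up.
cm_display = ConfusionMatrixDisplay(
    confusion_matrix=cm, display_labels=clf.classes_).plot()
```

Each row of cm sums to the number of test samples in that true class, which is a quick sanity check before plotting.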

Create RocCurveDisplay

The ROC curve requires either the probabilities or the non-thresholded decision values from the estimator. Since the logistic regression provides a decision function, we will use it to plot the ROC curve:

from sklearn.metrics import roc_curve
from sklearn.metrics import RocCurveDisplay

y_score = clf.decision_function(X_test)

fpr, tpr, _ = roc_curve(y_test, y_score, pos_label=clf.classes_[1])
roc_display = RocCurveDisplay(fpr=fpr, tpr=tpr).plot()
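
To illustrate that either kind of score can be used, the sketch below computes the ROC curve from both decision_function and predict_proba and compares the resulting areas. It assumes a synthetic make_classification dataset in place of the OpenML data and a non-interactive backend. The area is computed with sklearn.metrics.auc and passed to the display through its roc_auc argument.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line to view plots interactively

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import RocCurveDisplay, auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the OpenML dataset (an assumption, not the real data).
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

clf = make_pipeline(StandardScaler(), LogisticRegression(random_state=0))
clf.fit(X_train, y_train)

pos_label = clf.classes_[1]
y_score = clf.decision_function(X_test)    # non-thresholded decision values
y_proba = clf.predict_proba(X_test)[:, 1]  # probability of the positive class

fpr, tpr, _ = roc_curve(y_test, y_score, pos_label=pos_label)
fpr_p, tpr_p, _ = roc_curve(y_test, y_proba, pos_label=pos_label)

# The ROC curve depends only on how the scores rank the samples, so both
# score types are expected to give the same area.
roc_auc = auc(fpr, tpr)
roc_auc_proba = auc(fpr_p, tpr_p)

# Storing the area on the display lets plot() include it in the curve label.
roc_display = RocCurveDisplay(fpr=fpr, tpr=tpr, roc_auc=roc_auc).plot()
```

For logistic regression the probabilities are a monotonic transform of the decision values, so the two areas should agree, barring ties introduced by floating-point rounding.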

Create PrecisionRecallDisplay

Similarly, the precision-recall curve can be plotted using y_score from the previous section.

from sklearn.metrics import precision_recall_curve
from sklearn.metrics import PrecisionRecallDisplay

prec, recall, _ = precision_recall_curve(y_test, y_score,
                                         pos_label=clf.classes_[1])
pr_display = PrecisionRecallDisplay(precision=prec, recall=recall).plot()
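
A summary score can be attached to the precision-recall display in the same way. The sketch below assumes a synthetic make_classification dataset and a non-interactive backend. It computes the average precision with average_precision_score, cross-checks it against a step-wise sum over the curve, and passes it to the display through the average_precision argument.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line to view plots interactively

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (PrecisionRecallDisplay, average_precision_score,
                             precision_recall_curve)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the OpenML dataset (an assumption, not the real data).
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

clf = make_pipeline(StandardScaler(), LogisticRegression(random_state=0))
clf.fit(X_train, y_train)

pos_label = clf.classes_[1]
y_score = clf.decision_function(X_test)

prec, recall, _ = precision_recall_curve(y_test, y_score, pos_label=pos_label)
ap = average_precision_score(y_test, y_score, pos_label=pos_label)

# Average precision as a step-wise sum over the curve: each precision value
# is weighted by the change in recall at that threshold.
ap_from_curve = -np.sum(np.diff(recall) * prec[:-1])

pr_display = PrecisionRecallDisplay(
    precision=prec, recall=recall, average_precision=ap).plot()
```

The returned recall values are in decreasing order, which is why the step-wise sum carries a leading minus sign.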

Combining the display objects into a single plot

The display objects store the computed values that were passed as arguments. This allows the visualizations to be easily combined using matplotlib’s API. In the following example, we place the displays next to each other in a row.

import matplotlib.pyplot as plt
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 8))

roc_display.plot(ax=ax1)
pr_display.plot(ax=ax2)
plt.show()
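
The same idea extends to all three displays. The sketch below assumes a synthetic make_classification dataset and a non-interactive backend. It builds the display objects without plotting them first, draws each one on its own axis of a shared figure, and then reads back the stored values and matplotlib handles.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line to view plots interactively
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (ConfusionMatrixDisplay, PrecisionRecallDisplay,
                             RocCurveDisplay, confusion_matrix,
                             precision_recall_curve, roc_curve)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the OpenML dataset (an assumption, not the real data).
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)
clf = LogisticRegression(random_state=0).fit(X_train, y_train)

y_score = clf.decision_function(X_test)
pos_label = clf.classes_[1]

cm = confusion_matrix(y_test, clf.predict(X_test))
fpr, tpr, _ = roc_curve(y_test, y_score, pos_label=pos_label)
prec, recall, _ = precision_recall_curve(y_test, y_score, pos_label=pos_label)

# Constructing a display only stores the values; nothing is drawn yet.
cm_display = ConfusionMatrixDisplay(confusion_matrix=cm)
roc_display = RocCurveDisplay(fpr=fpr, tpr=tpr)
pr_display = PrecisionRecallDisplay(precision=prec, recall=recall)

# Draw each display on its own axis of a single figure.
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(18, 5))
cm_display.plot(ax=ax1)
roc_display.plot(ax=ax2)
pr_display.plot(ax=ax3)

# After plot(), each display exposes the axis and figure it was drawn on.
print(roc_display.ax_ is ax2, pr_display.figure_ is fig)
```

Because the arrays stay available as attributes such as roc_display.fpr and pr_display.recall, a display can be re-plotted on a different axis later without recomputing the metrics.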
