Cross validation
================

.. currentmodule:: imblearn.model_selection


.. _instance_hardness_threshold_cv:

The term instance hardness is used in literature to express the difficulty to correctly
classify an instance. An instance for which the predicted probability of the true class
is low has large instance hardness. The way these hard-to-classify instances are
distributed over train and test sets in cross validation has a significant effect on the
test set performance metrics. The :class:`~imblearn.model_selection.InstanceHardnessCV`
splitter distributes samples with large instance hardness equally over the folds,
resulting in more robust cross validation.

We will discuss instance hardness in this document and explain how to use the
:class:`~imblearn.model_selection.InstanceHardnessCV` splitter.

Instance hardness and average precision
=======================================

Instance hardness is defined as 1 minus the probability of the most probable class:

.. math::

   H(x) = 1 - P(\hat{y}|x)

In this equation :math:`H(x)` is the instance hardness for a sample with features
:math:`x` and :math:`P(\hat{y}|x)` the probability of predicted label :math:`\hat{y}`
given the features. If the model predicts label 0 and gives a `predict_proba` output
of [0.9, 0.1], the probability of the most probable class (0) is 0.9 and the
instance hardness is `1 - 0.9 = 0.1`.
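
The hardness computation above can be sketched in a few lines of NumPy. This is an
illustrative helper over a `predict_proba`-style array, not part of the
imbalanced-learn API:

```python
import numpy as np

def instance_hardness(proba):
    """Instance hardness: 1 minus the probability of the most probable class.

    `proba` is an (n_samples, n_classes) array, as returned by a classifier's
    `predict_proba` method.
    """
    proba = np.asarray(proba)
    return 1.0 - proba.max(axis=1)

# The sample from the text ([0.9, 0.1]) has hardness 0.1; a sample close to
# the decision boundary ([0.55, 0.45]) has a larger hardness of 0.45.
hardness = instance_hardness([[0.9, 0.1], [0.55, 0.45]])
```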

Samples with large instance hardness have a significant effect on the area under the
precision-recall curve, or average precision. Especially samples with label 0 with
large instance hardness affect the part of the curve where the area is largest; the
precision is lowered in the range of low recall and high thresholds. When doing cross
validation, e.g. in case of hyperparameter tuning or recursive feature elimination,
random gathering of these points in some folds introduces variance in CV results that
deteriorates the robustness of the cross validation task. The
:class:`~imblearn.model_selection.InstanceHardnessCV` splitter aims to distribute the
samples with large instance hardness over the folds in order to reduce undesired
variance. Note that one should use this splitter to make model *selection* tasks
robust, like hyperparameter tuning and recursive feature elimination, and not for model
*evaluation* tasks where you want to know the variance of performance to be expected in
production.
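
To picture how such a splitter can spread hard samples over the folds, here is a
simplified, hypothetical sketch of the idea: sort samples from hardest to easiest and
deal them out round-robin. The real `InstanceHardnessCV` splitter is more involved (it
estimates hardness from cross-validated predictions and keeps folds balanced), so treat
this only as an illustration:

```python
import numpy as np

def stripe_folds_by_hardness(hardness, n_splits=5):
    """Assign a fold index to each sample so that high-hardness samples are
    spread evenly: sort hardest-first, then deal out round-robin."""
    hardness = np.asarray(hardness)
    order = np.argsort(-hardness)               # sample indices, hardest first
    fold_of = np.empty(len(hardness), dtype=int)
    fold_of[order] = np.arange(len(hardness)) % n_splits
    return fold_of

# The two hardest samples (hardness 0.9 and 0.8) land in different folds
# instead of possibly clustering in the same test fold.
h = np.array([0.1, 0.9, 0.05, 0.2, 0.8, 0.1, 0.15, 0.05, 0.3, 0.1])
folds = stripe_folds_by_hardness(h, n_splits=5)
```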

Create imbalanced dataset with samples with large instance hardness
===================================================================

Let's start by creating a dataset to work with. We create a dataset with 5% class
imbalance using scikit-learn's :func:`~sklearn.datasets.make_blobs` function.

>>> import numpy as np
>>> from matplotlib import pyplot as plt
>>> from sklearn.datasets import make_blobs
>>> random_state = 10
>>> X, y = make_blobs(n_samples=[950, 50], centers=((-3, 0), (3, 0)),
...                   random_state=random_state)
>>> plt.scatter(X[:, 0], X[:, 1], c=y)
>>> plt.show()

.. image:: ./auto_examples/model_selection/images/sphx_glr_plot_instance_hardness_cv_001.png
   :target: ./auto_examples/model_selection/plot_instance_hardness_cv.html
   :align: center

Now we add some samples with large instance hardness

>>> X_hard, y_hard = make_blobs(n_samples=10, centers=((3, 0), (-3, 0)),
...                             random_state=random_state)
>>> X = np.vstack((X, X_hard))
>>> y = np.hstack((y, y_hard))
>>> plt.scatter(X[:, 0], X[:, 1], c=y)
>>> plt.show()

.. image:: ./auto_examples/model_selection/images/sphx_glr_plot_instance_hardness_cv_002.png
   :target: ./auto_examples/model_selection/plot_instance_hardness_cv.html
   :align: center

Assess cross validation performance variance using `InstanceHardnessCV` splitter
================================================================================

Then we take a :class:`~sklearn.linear_model.LogisticRegression` classifier and assess
the cross validation performance using a :class:`~sklearn.model_selection.StratifiedKFold`
cv splitter and the :func:`~sklearn.model_selection.cross_validate` function.

>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import StratifiedKFold, cross_validate
>>> clf = LogisticRegression(random_state=random_state)
>>> skf_cv = StratifiedKFold(n_splits=5, shuffle=True,
...                          random_state=random_state)
>>> skf_result = cross_validate(clf, X, y, cv=skf_cv, scoring="average_precision")

Now, we do the same using an :class:`~imblearn.model_selection.InstanceHardnessCV`
splitter. We provide our classifier to the splitter to calculate instance hardness and
distribute samples with large instance hardness equally over the folds.

>>> from imblearn.model_selection import InstanceHardnessCV
>>> ih_cv = InstanceHardnessCV(estimator=clf, n_splits=5,
...                            random_state=random_state)
>>> ih_result = cross_validate(clf, X, y, cv=ih_cv, scoring="average_precision")

When we plot the test scores for both cv splitters, we see that the variance using the
:class:`~imblearn.model_selection.InstanceHardnessCV` splitter is lower than for the
:class:`~sklearn.model_selection.StratifiedKFold` splitter.

>>> plt.boxplot([skf_result['test_score'], ih_result['test_score']],
...             tick_labels=["StratifiedKFold", "InstanceHardnessCV"],
...             vert=False)
>>> plt.xlabel('Average precision')
>>> plt.tight_layout()

.. image:: ./auto_examples/model_selection/images/sphx_glr_plot_instance_hardness_cv_003.png
   :target: ./auto_examples/model_selection/plot_instance_hardness_cv.html
   :align: center

Be aware that the most important job of a cross-validation splitter is to simulate the
conditions that one will encounter in production. Therefore, if difficult samples are
likely to occur in production, one should use a cross-validation splitter that emulates
this situation. In our case, the :class:`~sklearn.model_selection.StratifiedKFold`
splitter did not allow distributing the difficult samples over the folds and was thus
likely a problem for our use case.