
A dataset in which the classes contain unequal numbers of cases is said to be unbalanced. Many models perform poorly on unbalanced datasets. A dataset can be balanced either by increasing the number of minority cases, using SMOTE 2011, BorderlineSMOTE 2005 or ADASYN 2008, or by decreasing the number of majority cases, using NearMiss 2003 or Tomek link removal 1976.
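
To make the oversampling idea concrete, the following is a minimal sketch of SMOTE-style interpolation, written with only the Python standard library. It is a simplified illustration and not the reference implementation of any of the methods cited above: the function name `smote_like_oversample`, the parameters `k` and `seed`, and the toy data are all assumptions introduced here. Each synthetic case is placed at a random position on the line segment between a minority case and one of its `k` nearest minority neighbours.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority cases.

    Hypothetical helper for illustration; assumes at least two minority cases.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority cases
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random position along the segment from base to neighbour
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data (assumed): 4 minority cases against 10 majority cases
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_count = 10

new_points = smote_like_oversample(minority, majority_count - len(minority))
balanced_minority = minority + new_points
```

After this call the minority class has as many cases as the majority class. Undersampling methods such as NearMiss or Tomek link removal work in the opposite direction, discarding selected majority cases instead of creating minority ones.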