Many machine learning applications require that the decisions taken by a model be easy to explain. For example, when a person applies for a bank loan, the bank should be able to explain why a particular decision (acceptance or rejection) was made. The goal of interpretability is to bring trust, privacy, fairness, and robustness to machine learning models, in contrast to black-box models whose decisions are hard to comprehend.
Given a training data set, prototypes are chosen to satisfy the following properties:
- Prototypes should cover as many training samples of the same class $l$ as possible,
- Prototypes should cover as few training samples from other classes as possible, and
- The number of prototypes needed to cover the data samples of a particular class should be as small as possible (also called sparsity).
The prototypes selected by prototype selection (PS) are actual data points, which makes the resulting model easier to interpret. The PS scheme is initially formulated as a set cover integer program. For a given radius of an epsilon ball (centered at a chosen prototype), PS outputs the minimum number of balls required to form a cover while preserving the properties of prototypes. The problem is then transformed into a prize-collecting set cover problem and solved using one of two approaches, namely:
- Greedy Approach (recommended for large datasets)
- Randomized Rounding Algorithm
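To make the greedy approach more concrete, here is a minimal sketch in Python with NumPy. It is a simplified illustration, not the implementation used in this project: the function name `greedy_prototypes`, the penalty `lam` (defaulting to `1/n`), and the exact form of the gain are assumptions chosen to mirror the three properties above (cover same-class samples, avoid other-class samples, keep the prototype set small).

```python
import numpy as np

def greedy_prototypes(X, y, eps, lam=None):
    """Greedily pick prototype indices per class (illustrative sketch).

    X: (n, d) array of samples, y: (n,) array of class labels,
    eps: radius of the epsilon ball, lam: hypothetical cost per prototype.
    """
    n = len(X)
    lam = 1.0 / n if lam is None else lam
    # in_ball[i, j] is True when sample i lies inside the eps-ball centered at sample j.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    in_ball = dists <= eps
    classes = np.unique(y)
    covered = {c: np.zeros(n, dtype=bool) for c in classes}
    prototypes = {c: [] for c in classes}
    while True:
        best_gain, best_j, best_c = 0.0, None, None
        for c in classes:
            same = (y == c)
            # Same-class samples that candidate j would newly cover.
            new_cover = (in_ball & (same & ~covered[c])[:, None]).sum(axis=0)
            # Other-class samples that fall inside candidate j's ball.
            wrong = (in_ball & (~same)[:, None]).sum(axis=0)
            gain = new_cover - wrong - lam
            j = int(np.argmax(gain))
            if gain[j] > best_gain:
                best_gain, best_j, best_c = gain[j], j, c
        if best_j is None:  # no candidate improves the objective any more
            break
        prototypes[best_c].append(best_j)
        covered[best_c] |= in_ball[:, best_j] & (y == best_c)
    return prototypes
```

Each accepted prototype must newly cover at least one sample of its class, so the loop terminates. The pairwise distance matrix costs O(n²) memory, so a real implementation for large datasets would likely use a neighbor search structure instead.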
The figure above visualises the PS scheme on synthetic data (sklearn moons data) for the chosen value of epsilon. Filled circles represent data points, and covers are represented by unfilled circles, each centered at a data point (denoted by X) chosen as a prototype.
For further reading and the mathematical details, please refer to J. Bien and R. Tibshirani.
In order to understand the GLVQ, the prototype set has been considered as
where,
For further reading and understanding, please refer to A. Sato and K. Yamada.
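As a rough illustration of the quantity GLVQ optimises, the sketch below computes the relative distance `mu = (d_plus - d_minus) / (d_plus + d_minus)` for each sample, where `d_plus` is the squared distance to the closest prototype of the same class and `d_minus` is the squared distance to the closest prototype of any other class. The function name `glvq_cost`, the argument names, and the choice of a sigmoid as the monotonically increasing function are assumptions made for this example; it assumes every class has at least one prototype.

```python
import numpy as np

def glvq_cost(X, y, prototypes, proto_labels):
    """Return per-sample relative distances and the summed GLVQ-style cost (sketch)."""
    # Squared Euclidean distance from every sample to every prototype: shape (n, k).
    d = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    same = y[:, None] == proto_labels[None, :]
    # Closest prototype with the correct label and closest with a wrong label.
    d_plus = np.where(same, d, np.inf).min(axis=1)
    d_minus = np.where(~same, d, np.inf).min(axis=1)
    # mu lies in [-1, 1]; negative values mean the sample is classified correctly.
    mu = (d_plus - d_minus) / (d_plus + d_minus)
    cost = (1.0 / (1.0 + np.exp(-mu))).sum()
    return mu, cost
```

Training then amounts to moving the prototypes so that this cost decreases, which pulls the nearest correct prototype toward each sample and pushes the nearest incorrect one away.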
The image below shows the selected prototypes (X) that represent the data samples (filled o).
- Unlike k-Nearest Neighbor, which stores the whole data set for prediction, the prototype selection scheme only needs to store a condensed form of the training samples (the prototypes), which saves a large amount of memory.
- For prediction, it only uses the distances to the selected prototypes, which saves the time required to compare against every training sample.
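The two points above can be sketched as a nearest-prototype classifier: only the prototypes and their labels are kept, and each new sample receives the label of its closest prototype. The function name `predict_nearest_prototype` and its arguments are hypothetical and serve only as an illustration.

```python
import numpy as np

def predict_nearest_prototype(X_new, prototypes, proto_labels):
    """Label each row of X_new with the class of its nearest prototype (sketch)."""
    # Distances from every new sample to every stored prototype: shape (m, k).
    d = np.linalg.norm(X_new[:, None, :] - prototypes[None, :, :], axis=-1)
    return proto_labels[np.argmin(d, axis=1)]
```

With `k` prototypes, prediction costs `k` distance computations per sample instead of one per training sample, which is where the time and memory savings come from.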
Comparison of GLVQ and GLVQ-PS