Six techniques are used to train and evaluate models on data with unbalanced classes, with the goal of predicting credit risk. The performance of the resulting models is compared, and recommendations are made based on the results.
Credit risk is an unbalanced classification problem, because good loans far outnumber risky loans. The imbalanced-learn and scikit-learn libraries are used to build and evaluate machine learning models with resampling.
Data analysts were asked to examine credit card data from LendingClub, a peer-to-peer lending services company, in order to determine credit risk. Supervised machine learning was used to find out which model performs best on an unbalanced dataset. To that end, the analysts trained and evaluated several models to predict credit risk.
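When comparing models on an unbalanced dataset, plain accuracy can be misleading, so a class-aware metric is usually needed. The pure-Python sketch below illustrates this with balanced accuracy (the mean of the recall on each class). The labels and both sets of predictions are hypothetical and invented for the example, and balanced accuracy is an assumed choice of metric.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of recall on the risky class (1) and recall on the good class (0)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # share of risky loans that were caught
    specificity = tn / (tn + fp)  # share of good loans that were cleared
    return (sensitivity + specificity) / 2


# Hypothetical test set: 95 good loans (0) followed by 5 risky loans (1).
y_true = [0] * 95 + [1] * 5

# Hypothetical model A: labels every loan as good.
pred_a = [0] * 100

# Hypothetical model B: wrongly flags 10 good loans, catches 4 of 5 risky ones.
pred_b = [0] * 85 + [1] * 10 + [1] * 4 + [0] * 1

for name, pred in [("A", pred_a), ("B", pred_b)]:
    print(name, accuracy(y_true, pred), round(balanced_accuracy(y_true, pred), 3))
```

Model A reaches 0.95 accuracy without identifying a single risky loan, and its balanced accuracy is only 0.5. Model B has lower accuracy (0.89) but a much higher balanced accuracy, which is the more useful signal for credit risk.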
Reference [1] outlines an oversampling technique based on optimal transport, which seems interesting enough to try out.
References
[1] https://www.aaai.org/ojs/index.php/AAAI/article/view/4503/4381