Credit Risk Model and Scorecard
Credit scoring has become an effective method of assessing loan applicants thanks to advances in technology and the application of statistical models. Credit providers such as banks increasingly use machine learning techniques to build customer credit scorecards. A credit scorecard evaluates data provided by the customer together with historical credit data and data from third-party platforms. Machine learning allows the analysis of large amounts of data and uses it to produce accurate, reliable results (Christian, 2003).
This assignment combines data mining, data processing, credit scorecard construction, WoE discretization of variables and a logistic regression model to develop a risk-level control system for a banking institution. The data was collected from the freddimac.com website. The dataset consisted of 10,000 samples and 11 variables, and Python was used to process it. This report walks through all the steps used to develop a data-driven credit risk model in Python. The model calculates the probability of an individual defaulting (the probability of default, PD) and assigns credit scores to potential or existing borrowers, using a simple, interpretable scorecard that makes credit score calculations easy.
Weight of Evidence (WoE) is a powerful method for variable transformation and selection, and it connects naturally with regression modelling concepts. WoE is an important way of measuring the separation between good and bad borrowers during credit scoring. This model created new categorical variables from numerical features based on their WoE. Before developing the credit risk model, specific Python packages were imported.
WoE and information value (IV) were used for feature selection and engineering, both of which are used extensively in the credit scoring domain. They measure the predictive power of an independent variable in relation to the target variable. WoE measures the extent to which a specific feature differentiates between the good-borrower and bad-borrower target classes. IV was implemented to rank features by relative importance, making it possible to judge each feature's contribution to the output.
WoE is expected to detect linear and non-linear relationships, visualize relationships, rank variables according to their predictive strength, and handle missing values in the dataset without imputation: missing values were treated as a separate category in the WoE engineering, so their predictive power could also be measured.
The WoE was calculated using the formula;
WoE = ln(% of good customers / % of bad customers)
A positive WoE means that the proportion of good customers is higher than that of bad customers in the corresponding bin. The WoE features were engineered as follows.
First, WoE was calculated for each unique value (bin) of the categorical variables, such as grade: A, grade: B, grade: C, etc. Secondly, continuous variables were split into discrete bins based on their distribution and number of unique values using pd.cut (fine classing), and WoE was calculated for each bin derived from a continuous variable. Lastly, bins of both numerical and categorical features were combined according to the coarse classing rules, and WoE was recalculated for the combined bins.
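The fine classing step can be sketched as follows. This is a minimal illustration, not the report's actual code: the column names (`income`, `good_loan`) and the synthetic data are assumptions, and the target is taken to be 1 for a good loan and 0 for a bad one.

```python
import numpy as np
import pandas as pd

def woe_table(df, feature, target, n_bins=10):
    """Fine classing: split a continuous feature into equal-width bins
    with pd.cut, then compute WoE = ln(%good / %bad) per bin."""
    binned = pd.cut(df[feature], bins=n_bins)
    grouped = df.groupby(binned, observed=False)[target].agg(["count", "sum"])
    grouped["good"] = grouped["sum"]
    grouped["bad"] = grouped["count"] - grouped["sum"]
    grouped["pct_good"] = grouped["good"] / grouped["good"].sum()
    grouped["pct_bad"] = grouped["bad"] / grouped["bad"].sum()
    grouped["woe"] = np.log(grouped["pct_good"] / grouped["pct_bad"])
    return grouped[["good", "bad", "pct_good", "pct_bad", "woe"]]

# Toy example with a synthetic income feature and random labels
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "income": rng.normal(50_000, 15_000, 1_000),
    "good_loan": rng.integers(0, 2, 1_000),
})
tbl = woe_table(df, "income", "good_loan", n_bins=5)
print(tbl)
```

In a real pipeline the bin edges found on the training set would be reused to transform the test set, so both sets share the same WoE categories.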
The WoE binning rules used in the analysis were: i) each bin must contain at least 5% of the whole dataset's observations; ii) every bin must contain non-zero counts of both good and bad loans; iii) the WoE must differ between bins, so all groups with similar characteristics are binned or aggregated together, since bins with similar WoE have similar proportions of good and bad loans and therefore similar predictive power; iv) all missing values were binned together; v) the WoE must be monotonic, that is, decreasing or increasing across the bins.
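Rule (v) can be checked mechanically. The helper below is a small sketch of such a check (the function name and example values are illustrative, not from the report):

```python
import numpy as np

def is_monotonic(woe_values):
    """Coarse-classing rule (v): WoE across ordered bins should be
    monotonically increasing or monotonically decreasing."""
    diffs = np.diff(woe_values)
    return bool(np.all(diffs >= 0) or np.all(diffs <= 0))

print(is_monotonic([-0.8, -0.3, 0.1, 0.6]))   # increasing -> True
print(is_monotonic([-0.8, 0.4, -0.2, 0.6]))   # zig-zag -> False
```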
Discretization (binning) of numeric features is generally not recommended in machine learning, since it leads to loss of information. However, the objective of this model was to create a scorecard that classifies new, unseen observations, and binning makes continuous variables categorical so that each category carries a corresponding coefficient or weight for the expected level of borrowing risk. Thus, borrowers falling into the same category are given the same score regardless of their exact income; where the WoE differs, new borrowers fall into different categories and receive different income scores. The information value, on the other hand, was calculated using the formula:
IV = ∑ (% of good customers − % of bad customers) × WoE
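The IV formula translates directly into a few lines of Python. The per-bin distributions below are made-up numbers for illustration only:

```python
import numpy as np

def information_value(pct_good, pct_bad):
    """IV = sum((%good - %bad) * WoE) over all bins, where each of
    pct_good / pct_bad is a per-bin distribution summing to 1."""
    pct_good = np.asarray(pct_good, dtype=float)
    pct_bad = np.asarray(pct_bad, dtype=float)
    woe = np.log(pct_good / pct_bad)
    return float(np.sum((pct_good - pct_bad) * woe))

# Toy distributions over four bins (illustrative numbers)
iv = information_value([0.10, 0.20, 0.30, 0.40],
                       [0.25, 0.25, 0.25, 0.25])
print(round(iv, 4))  # → 0.2282
```

A common rule of thumb treats IV below roughly 0.02 as unpredictive and values around 0.1–0.5 as medium to strong, which is how features were ranked by relative importance.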
The final piece of the puzzle was creating a usable credit scorecard. The developed scorecard should allow any layperson to calculate the credit score of a borrower given the required information about them and their credit history. The reference categories left out of the model were appended with a coefficient value of 0, and another column was added for the original feature name, such as grade representing the grade level (grade: A, grade: B, etc.). The scorecard then determines the minimum and maximum scores the model will output; a range similar to that used by FICO, from 300 to 850, was chosen.
The model scales the logistic regression coefficients of all categories into credit score points through a simple calculation, with the model intercept, further scaled, providing the starting point of the scale. The scorecard thus contains a Score-Preliminary column alongside a column of simply rounded computed scores, and the results can also be manually adjusted to suit given circumstances. The scores assigned to the reference categories ensure that the minimum and maximum possible scores fall within the 300 to 850 range; however, arriving at this required some trial and error.
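One common way to perform this scaling is sketched below. It is an assumption-laden illustration, not the report's exact procedure: each inner list stands for one original feature's per-category coefficients, and the linear rescaling is chosen so the lowest and highest possible total scores land exactly on 300 and 850.

```python
import numpy as np

def scale_scores(feature_coefs, intercept, min_score=300, max_score=850):
    """Rescale logistic-regression coefficients into scorecard points.
    feature_coefs: one array of category coefficients per original
    feature. Picking the lowest category of every feature yields
    min_score; picking the highest yields max_score."""
    min_sum = intercept + sum(min(c) for c in feature_coefs)
    max_sum = intercept + sum(max(c) for c in feature_coefs)
    factor = (max_score - min_score) / (max_sum - min_sum)
    points = [np.round((np.asarray(c) - min(c)) * factor)
              for c in feature_coefs]
    return min_score, points  # base score plus per-category points

# Illustrative coefficients, e.g. grade and home ownership categories
feature_coefs = [[-0.6, 0.1, 0.8], [-0.3, 0.5]]
base, points = scale_scores(feature_coefs, intercept=1.2)
total_min = base + sum(p.min() for p in points)
total_max = base + sum(p.max() for p in points)
print(base, total_min, total_max)  # → 300 300.0 850.0
```

Because the points are rounded to whole numbers, the realized extremes can drift a point or two from the targets in practice, which is where the manual trial-and-error adjustment mentioned above comes in.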
A credit score for each observation in the test set is then calculated from the final scorecard. The test set is encoded as a simple collection of dummy variables of 1s and 0s indicating whether an observation belongs to a specific category or not; each grade, home ownership status and verification status gets its own dummy column. The scores are then obtained by multiplying this matrix of indicators by the scorecard points for each category, and this dot multiplication can be fully automated.
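The dot multiplication for a single applicant can be sketched as follows. The category names mirror the report's examples, but every point value here is made up:

```python
import pandas as pd

# Illustrative scorecard: points per dummy category (numbers invented)
scorecard = pd.Series({
    "Intercept": 550,
    "grade:A": 90, "grade:B": 60, "grade:C": 20,
    "home_ownership:OWN": 30, "home_ownership:RENT": 10,
    "verification_status:Verified": 15,
    "verification_status:Not Verified": 0,
})

# One applicant encoded as 1s and 0s over the same dummy columns
applicant = pd.Series({
    "Intercept": 1,
    "grade:A": 0, "grade:B": 1, "grade:C": 0,
    "home_ownership:OWN": 1, "home_ownership:RENT": 0,
    "verification_status:Verified": 1,
    "verification_status:Not Verified": 0,
})

# Credit score = dot product of the indicators and the points
score = int((applicant * scorecard).sum())
print(score)  # 550 + 60 + 30 + 15 = 655
```

For a whole test set the same idea becomes a single matrix product between the dummy-encoded DataFrame and the scorecard vector.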
The AUROC used in the evaluation is based on two variables, the true positive rate and the false positive rate, which the ROC curve plots against each other. A precision-recall (PR) curve was also used to compare the precision and recall of the no-skill baseline and the logistic model (Khokhlova, 2018).
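Both curves are available directly from scikit-learn. The labels and predicted PDs below are a tiny invented example, not the report's data:

```python
import numpy as np
from sklearn.metrics import (roc_auc_score, roc_curve,
                             precision_recall_curve, auc)

# Illustrative true labels and predicted probabilities of default
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_prob = np.array([0.2, 0.4, 0.8, 0.6, 0.1, 0.9, 0.3, 0.7])

# ROC: true positive rate vs false positive rate at every threshold
fpr, tpr, roc_thresholds = roc_curve(y_true, y_prob)
auroc = roc_auc_score(y_true, y_prob)
print("AUROC:", auroc)

# PR curve: precision vs recall; the "no skill" baseline equals the
# positive-class prevalence
precision, recall, pr_thresholds = precision_recall_curve(y_true, y_prob)
print("PR AUC:", auc(recall, precision))
print("No-skill baseline:", y_true.mean())
```

Plotting `fpr` against `tpr` (and `recall` against `precision`) reproduces the two evaluation curves described above.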
The relevance of the scorecard is that it allows the lender to determine which loans to approve or reject. The idea behind the calculation is to determine the cut-off point for issuing a loan to a potential borrower. To find the loan cut-off, the lender examines probability thresholds between 0 and 1 along the ROC curve; the intention of setting the approval cut-off is to find the point that maximizes the TPR while minimizing the FPR.
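One standard way to pick that point is Youden's J statistic, the threshold maximizing TPR − FPR; the report does not name its criterion, so this is offered as one plausible sketch with invented data:

```python
import numpy as np
from sklearn.metrics import roc_curve

# Illustrative labels and predicted probabilities of default
y_true = np.array([0, 1, 0, 1, 0, 1, 1, 0, 1, 0])
y_prob = np.array([0.1, 0.9, 0.35, 0.6, 0.2, 0.8, 0.55, 0.4, 0.7, 0.3])

fpr, tpr, thresholds = roc_curve(y_true, y_prob)
# Youden's J: the threshold maximizing TPR - FPR is a common choice
# of approval cut-off
j = tpr - fpr
best = np.argmax(j)
cutoff = thresholds[best]
print("Approval cut-off:", cutoff)
```

Applicants whose predicted PD falls on the favourable side of the cut-off would be approved; shifting the cut-off trades approval volume against default risk.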
Comparison between the XGBoost and Random Forest Models in WoE Transformation
According to Bücker, Szepannek, & Biecek (2019), machine learning can be implemented using different methods. XGBoost, launched in 2014, is one such method, and many computer scientists recommend it for its accuracy. The algorithm features optimized implementations in machine learning languages such as R and Python. The XGBoost model is powerful thanks to characteristics such as scalability and its ability to run parallel and distributed computations with efficient memory usage, which makes the method fast.
While decision trees are among the most interpretable models, they often behave erratically. In boosting, the trees are built sequentially, with each tree aiming to reduce the errors generated by the previous tree; each tree thus learns from an updated version of the residuals (Brownlee, 2020). Features of XGBoost include gradient boosting, stochastic gradient boosting and regularized gradient boosting.
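The sequential-residual idea can be demonstrated with scikit-learn's gradient boosting classifier, used here as a stand-in for XGBoost (whose `xgboost.XGBClassifier` exposes a near-identical fit/predict API). The synthetic data is an assumption; the report's loan features are not reproduced:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the loan data: two informative features
rng = np.random.default_rng(42)
X = rng.normal(size=(1_000, 5))
y = (X[:, 0] + 0.5 * X[:, 1]
     + rng.normal(scale=0.5, size=1_000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Trees are fitted one after another, each to the residual errors of
# the ensemble built so far (the essence of gradient boosting)
model = GradientBoostingClassifier(n_estimators=100,
                                   learning_rate=0.1,
                                   max_depth=3)
model.fit(X_train, y_train)
acc = model.score(X_test, y_test)
print("Test accuracy:", acc)
```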
The Random Forest model, on the other hand, was developed as a competitor to boosting. The model can predict both continuous and categorical outcomes. From a computational point of view, the method is recommended for its ability to naturally handle multiclass classification, its processing speed when training or predicting, its suitability for parallel implementation and its ability to handle high-dimensional problems.
The model works using ensembles of trees that depend on collections of random variables. It uses a p-dimensional random vector X = (X1, …, Xp)^T of real-valued predictor (input) variables, while Y represents the real-valued response; the pair (X, Y) is assumed to follow an unknown joint distribution P_XY (Cutler & Cutler, 2011).
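A minimal random forest fit, with variable importances extracted, looks as follows; the data is synthetic and the feature names X1–X4 are placeholders for the p predictors described above:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic p-dimensional input X = (X1, ..., Xp)^T and response Y;
# only X1 and X3 carry signal here
rng = np.random.default_rng(7)
X = rng.normal(size=(1_000, 4))
y = (2 * X[:, 0] - X[:, 2]
     + rng.normal(scale=0.5, size=1_000) > 0).astype(int)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, y)

# Impurity-based importances: X1 and X3 should dominate X2 and X4
imp = forest.feature_importances_
print(dict(zip(["X1", "X2", "X3", "X4"], imp.round(3))))
```

The `feature_importances_` attribute is what makes random forests a convenient baseline for the variable-importance comparison in the next section.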
Variable Importance in all Models
The development of an effective credit metrics model depends on the availability of historical data about credit ratings and average default rates. Translating historical data into probabilities of risk transitions relies on the risk model using the right variables, and the risk portfolio of a borrower is generated by looking at the individual's credit qualities (Fang & Huang, 2011).