I am working on a loan application acceptance/rejection problem. I have historical data on about 500K applications, of which about 70K were funded across various loan products, along with the performance histories of the funded loans. I want to build a predictive model that decides whether to accept or reject future loan applications and, for accepted applications, which loan products to offer the borrower.
Various loan products can be offered depending on the borrower's credit rating and other borrower metrics. Each loan product comes with a fixed loan amount, interest rate, loan term and origination fee. For example, the loan products could look like this:
term_in_months,amount,interest_rate,origination_fee
48,45000,13,750
48,45000,19,750
60,45000,18,900
36,25000,23,275
48,25000,28,500
24,10000,35,100
When a loan application comes in, we need to see whether it qualifies for any of our loan products. In the example above, the first and second products have the same term, amount and origination fee, but the first has a lower interest rate, so a higher borrower rating would be required to qualify for it.
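To make that tiering concrete, here is a minimal Python sketch that groups products sharing the same term, amount and fee, and sorts each group by interest rate. The names `PRODUCTS` and `group_by_structure` are hypothetical, the rows are just the example table above, and the reading "lowest rate in a group needs the best rating" is the assumption stated in the text, not something derived from data.

```python
from collections import defaultdict

# Example products from the table above:
# (term_in_months, amount, interest_rate, origination_fee)
PRODUCTS = [
    (48, 45000, 13, 750),
    (48, 45000, 19, 750),
    (60, 45000, 18, 900),
    (36, 25000, 23, 275),
    (48, 25000, 28, 500),
    (24, 10000, 35, 100),
]

def group_by_structure(products):
    """Group products that differ only in interest rate.

    Returns a dict mapping (term, amount, fee) to the sorted list of rates.
    Within a group, the lowest rate is assumed to be the most demanding tier.
    """
    groups = defaultdict(list)
    for term, amount, rate, fee in products:
        groups[(term, amount, fee)].append(rate)
    return {key: sorted(rates) for key, rates in groups.items()}

tiers = group_by_structure(PRODUCTS)
for (term, amount, fee), rates in tiers.items():
    print(f"term={term} amount={amount} fee={fee}: rates (best tier first) {rates}")
```

This only orders products within a group; it says nothing about how to compare products whose term, amount or fee differ.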
My first question is how to combine these four (outcome) variables into a single variable so that the different loan products can be ordered. That variable would serve as a measure of the quality of a loan product.
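As an illustration of what such a single variable might look like, one candidate is an APR-style figure that folds the origination fee into the interest rate. The Python sketch below rests on assumptions that are not stated in the question: the fee is deducted from the amount disbursed, repayments are equal monthly instalments computed on the full amount, and `interest_rate` is a nominal annual percentage. The function names are hypothetical.

```python
# (term_in_months, amount, interest_rate, origination_fee), from the table above
PRODUCTS = [
    (48, 45000, 13, 750),
    (48, 45000, 19, 750),
    (60, 45000, 18, 900),
    (36, 25000, 23, 275),
    (48, 25000, 28, 500),
    (24, 10000, 35, 100),
]

def monthly_payment(amount, annual_rate_pct, term_months):
    """Level monthly payment on a fully amortising loan."""
    r = annual_rate_pct / 100 / 12
    if r == 0:
        return amount / term_months
    return amount * r / (1 - (1 + r) ** -term_months)

def apr_pct(term_months, amount, annual_rate_pct, fee):
    """Nominal annual rate at which the payments' present value equals
    the net amount received (amount minus fee). Assumes the fee is
    withheld from the disbursement."""
    payment = monthly_payment(amount, annual_rate_pct, term_months)
    net = amount - fee

    def present_value(r):
        return payment * (1 - (1 + r) ** -term_months) / r

    # Bisection on the monthly rate; present value falls as the rate rises.
    lo, hi = 1e-9, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if present_value(mid) > net:
            lo = mid  # rate too low, payments are worth more than net
        else:
            hi = mid
    return 12 * 100 * (lo + hi) / 2

# Order products from lowest to highest APR.
ranked = sorted(PRODUCTS, key=lambda p: apr_pct(*p))
for p in ranked:
    print(p, round(apr_pct(*p), 2))
```

An APR of this kind captures the rate and fee but ignores the loan amount and term as dimensions in their own right, so it is at best a partial ordering of cost to the borrower, not of profitability or risk to the lender.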
Also, I am reading Siddiqi's "Credit Risk Scorecards", which seems to be written for managers rather than developers/modelers. Can someone suggest better references, or explain how to approach this problem from a practitioner's perspective? How does a company like, say, Lending Club solve this problem?
PS: I have a decent background in statistics, machine learning and R, and several years of programming experience, but I have difficulty following theoretical math. For example, I could easily follow James et al.'s "An Introduction to Statistical Learning" and Kuhn's "Applied Predictive Modeling", but not Hastie et al.'s "The Elements of Statistical Learning" or Bishop's "Pattern Recognition and Machine Learning".