With less than 3 days to go, this script is meant to give beginners some feisty ideas, a machine learning workflow, and motivation for the ongoing machine learning challenge.
Here's a quick workflow of what I've done below:
- Load data and explore
- Data Pre-processing
- Dropping Features
- One Hot Encoding
- Feature Engineering
- Model Training
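
As a rough illustration of the One Hot Encoding step above, here's a minimal base R sketch. It is not the exact code from the script: the toy data frame and the column names grade and loan_amnt are made up for the example (only loan_status comes from the challenge data), and model.matrix is just one of several ways to do this.

```r
# Toy data for illustration only; grade and loan_amnt are hypothetical columns.
train <- data.frame(
  loan_amnt   = c(5000, 12000, 8000, 20000),
  grade       = factor(c("A", "B", "A", "C")),
  loan_status = c(0, 1, 0, 1)
)

# model.matrix() expands a factor into 0/1 indicator columns.
# The "- 1" drops the intercept so every level of grade gets its own column.
encoded <- model.matrix(~ grade - 1, data = train)

# Bind the indicator columns back onto the non-categorical columns.
train_ohe <- cbind(train[, c("loan_amnt", "loan_status")], encoded)
print(train_ohe)
```

Each row ends up with exactly one of the grade indicator columns set to 1. For tree-based models you can usually keep all levels; for linear models you may prefer to drop one level to avoid perfectly collinear columns.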
Note: For more feature engineering ideas, spend time exploring the data by the loan_status variable. For categorical vs categorical data, create dodged bar plots. For categorical vs continuous data, create density plots and use fill=as.factor(loan_status).
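
The two plot types from the note can be sketched with ggplot2 roughly like this. Again, the toy data and the columns grade and loan_amnt are assumptions for the example; swap in whichever categorical and continuous variables you want to inspect.

```r
library(ggplot2)

# Toy data for illustration only; grade and loan_amnt are hypothetical columns.
train <- data.frame(
  loan_amnt   = c(5000, 12000, 8000, 20000, 15000, 7000),
  grade       = c("A", "B", "A", "C", "B", "C"),
  loan_status = c(0, 1, 0, 1, 1, 0)
)

# Categorical vs categorical: dodged bar plot, one bar per loan_status within each grade.
p_bar <- ggplot(train, aes(x = grade, fill = as.factor(loan_status))) +
  geom_bar(position = "dodge") +
  labs(fill = "loan_status")

# Categorical vs continuous: overlaid density curves, one per loan_status.
# alpha makes the fills semi-transparent so overlapping regions stay visible.
p_density <- ggplot(train, aes(x = loan_amnt, fill = as.factor(loan_status))) +
  geom_density(alpha = 0.5) +
  labs(fill = "loan_status")

print(p_bar)
print(p_density)
```

If the bars or density curves look clearly different across loan_status values, that variable (or a transformation of it) is probably worth keeping or engineering further.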
To help the community, feel free to contribute the equivalent Python / C++ script in the comments below.
Update: You can get the Python script for this solution from Jin Cong Ho's comment below.