ML for Targeted Mortgage Marketing

Cornell M.Eng. consulting-style machine learning project completed with Deluxe. The goal was to predict the likelihood that a household would take out a home equity loan, in order to sharpen targeted marketing and customer acquisition strategy.

Mortgage project screenshot

Project Snapshot

Overview

This project was designed to simulate a real-world data science consulting engagement. Working with a team of Cornell graduate students, I helped build an end-to-end modeling pipeline to predict whether a customer would respond to a mortgage-related offer. The broader business goal was to improve marketing precision by identifying the households most likely to respond.

The dataset was high-dimensional and messy, with extensive missing values, transformed financial variables, class imbalance, and outliers. Because of that, the project was not just about training a model. It was about building a thoughtful preprocessing and feature engineering workflow that could improve both performance and explainability.

What I Worked On

Technical Approach

In Phase I, we established baseline models with relatively simple preprocessing. This included dropping columns with more than 50% missingness, imputing remaining missing values, scaling variables, and using SMOTE to address severe class imbalance.
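
The Phase I steps can be sketched roughly as follows. This is a minimal illustration using only NumPy, not the project's actual code. The function names, the median imputation, the z-score scaling, and the toy data are all assumptions made for the example, since the write-up does not say which imputer or scaler was used. The oversampler is a simplified hand-written version of the SMOTE idea (interpolating between a minority row and one of its nearest minority neighbors); a real project would more likely rely on an established library implementation.

```python
import numpy as np


def drop_sparse_columns(X, max_missing=0.5):
    """Keep only columns whose share of NaN values is at most max_missing."""
    missing_share = np.isnan(X).mean(axis=0)
    keep = missing_share <= max_missing
    return X[:, keep], keep


def impute_median(X):
    """Replace each NaN with the median of the observed values in its column."""
    medians = np.nanmedian(X, axis=0)
    return np.where(np.isnan(X), medians, X)


def standardize(X):
    """Scale every column to zero mean and unit variance."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    return (X - X.mean(axis=0)) / std


def smote_oversample(X, y, k=5, seed=0):
    """Balance a binary problem by interpolating new minority-class rows."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    X_min = X[y == minority]
    n_new = counts.max() - counts.min()
    if n_new == 0 or len(X_min) < 2:
        return X, y
    k = min(k, len(X_min) - 1)

    # Pairwise distances among minority rows; a row is never its own neighbor.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    neighbors = np.argsort(dists, axis=1)[:, :k]

    # Pick a base row, one of its k nearest neighbors, and a point between them.
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbors[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    synthetic = X_min[base] + gap * (X_min[picked] - X_min[base])

    X_out = np.vstack([X, synthetic])
    y_out = np.concatenate([y, np.full(n_new, minority, dtype=y.dtype)])
    return X_out, y_out


# Toy data (synthetic, for illustration only): 200 rows, 4 features.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 4))
X[:140, 3] = np.nan   # column 3 is 70% missing, so it gets dropped
X[::10, 0] = np.nan   # column 0 is 10% missing, so it gets imputed
y = np.zeros(200, dtype=int)
y[:20] = 1            # 10% positive class

X_kept, kept_mask = drop_sparse_columns(X)
X_clean = standardize(impute_median(X_kept))
X_bal, y_bal = smote_oversample(X_clean, y)

print(X_kept.shape, X_bal.shape, np.bincount(y_bal))
```

One design point worth noting: oversampling is normally applied to the training split only, so that synthetic rows never leak into the data used for evaluation.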

In Phase II, we built a more advanced feature engineering pipeline. That work included:

Results

A central takeaway was that the enhanced feature engineering pipeline meaningfully improved the sparse logistic regression model: test accuracy rose from 91.28% in Phase I to 94.95% in Phase II (a gain of 3.67 percentage points), and test AUROC rose from 0.9672 to 0.9826.
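
For readers less familiar with the two metrics reported here, the sketch below shows how each one is computed from predicted probabilities. It is an illustrative NumPy-only version with made-up labels and scores, not the project's evaluation code, and the function names and the 0.5 threshold are assumptions for the example.

```python
import numpy as np


def accuracy(y_true, y_prob, threshold=0.5):
    """Share of rows whose thresholded prediction matches the true label."""
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    return float((y_pred == np.asarray(y_true)).mean())


def auroc(y_true, y_score):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form is quadratic in the number
    of rows, which is fine for a small illustration.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((wins + 0.5 * ties) / (len(pos) * len(neg)))


# Made-up labels and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]

acc = accuracy(y_true, y_prob)
auc = auroc(y_true, y_prob)
print(f"accuracy={acc:.4f}  auroc={auc:.4f}")
```

Accuracy depends on the chosen decision threshold, while AUROC only depends on how the model ranks positives relative to negatives, which is one reason the two are often reported side by side.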

XGBoost performed strongly as a baseline model, reaching 98.77% test accuracy in Phase I. The project also surfaced which features were most influential, including home equity activity, mortgage history, and engineered indicators derived from transaction behavior.

Why It Matters

Beyond model accuracy, this project was a strong example of how data science can support business decision-making. The work connected data cleaning, feature engineering, modeling, and interpretation in a way that mapped directly to a real customer acquisition use case. It also reinforced an important lesson: in many applied ML problems, better preprocessing and feature design can matter just as much as model choice.

Key Takeaways