a collection of work I'm proud of

Projects & Stories

Real data, real problems, real people. Every project here started with a question I actually wanted answered.

three projects · end-to-end · built from scratch
01
Featured Project
How Moms Use AI
Behavioral Segmentation · NLP · Persona Classification

AI tools are everywhere, but not everyone uses them the same way, or at all. I wanted to understand how moms specifically engage with AI in their real lives. So I went and asked them. Fifty-two real survey responses later, I built an end-to-end machine learning pipeline that turns messy, unstructured text into a behavior-based persona framework and predicts which persona a respondent fits.

This one is personal. As the founder of Melanated Mamas Golden Crescent, I know firsthand that the tools we build should reflect the people using them. This project is my attempt to do that with data.

Python scikit-learn TF-IDF KMeans Random Forest NLP Pandas Seaborn
View on GitHub
Model Accuracy
55%
About 4.4x the 12.5% random-chance baseline for 8 classes
Strongest Persona · F1 Score
0.86
The Routine Queen — precision 75%, recall 100%
Real Responses Collected
52
Original data gathered directly from mothers
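
For readers who want to check the arithmetic behind the figures above, here is a small sketch. It uses only the numbers reported on this page; the helper name `f1_score_from` is made up for illustration, and the 1-in-8 baseline assumes uniform guessing across the 8 personas.

```python
def f1_score_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Random-chance baseline, assuming a uniform guess over 8 personas.
n_classes = 8
baseline = 1 / n_classes  # 0.125, i.e. 12.5%

# Reported accuracy relative to that baseline.
lift = 0.55 / baseline

# Routine Queen: reported precision 75%, recall 100%.
routine_queen_f1 = f1_score_from(0.75, 1.00)

print(f"baseline={baseline:.3f} lift={lift:.1f}x f1={routine_queen_f1:.2f}")
```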
8 Personas Identified
CEO Mom: Strategic power user
Routine Queen: Structure & systems
Study & Survive: Homework support
Conscious Creator: Ethics-aware
Organized Chaos: Mental load
AI Curious: Growing confidence
Curious but Stuck: Needs onboarding
Not for Me: Skeptical
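
A minimal sketch of how a TF-IDF, KMeans, and Random Forest pipeline like the one described for this project could be wired together in scikit-learn. This is not the project's actual code: the responses below are invented stand-ins for the survey text, `build_persona_model` is a hypothetical helper, 3 clusters are used instead of 8 to suit the toy data, and using cluster assignments as training labels is an assumption about how the personas were produced.

```python
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented example responses (assumption): the real survey data is not shown here.
responses = [
    "I use AI to plan meals and weekly schedules",
    "AI helps me build routines and chore charts",
    "I plan our weekly schedule and meals with AI",
    "I ask AI to explain homework problems to my kids",
    "We use AI for math homework and study guides",
    "AI explains science homework for my kids",
    "I do not trust AI and do not use it",
    "AI is not for me, I am skeptical about privacy",
    "I am skeptical and do not trust AI tools",
]


def build_persona_model(texts, n_personas=3, seed=0):
    """Vectorize text, cluster it into personas, then fit a classifier on the clusters."""
    # Turn free text into TF-IDF weighted term vectors.
    vectorizer = TfidfVectorizer(stop_words="english")
    X = vectorizer.fit_transform(texts)

    # Group similar responses; each cluster is treated as one persona (assumption).
    kmeans = KMeans(n_clusters=n_personas, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(X)

    # Learn to predict the persona label from the TF-IDF features.
    clf = RandomForestClassifier(n_estimators=200, random_state=seed)
    clf.fit(X, labels)
    return vectorizer, kmeans, clf, labels


vectorizer, kmeans, clf, labels = build_persona_model(responses)

# Classify a new, unseen response with the fitted vectorizer and classifier.
new_features = vectorizer.transform(["AI helps with my kids' homework"])
predicted = clf.predict(new_features)[0]
print("cluster labels:", list(labels), "predicted persona id:", predicted)
```

In a setup like this the cluster ids are arbitrary integers, so mapping them to names such as "Routine Queen" would be a manual step after inspecting the top terms in each cluster.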
02
Classification Model
Heart Disease Prediction
Logistic Regression · KNN · Model Comparison · ROC-AUC

Healthcare data holds patterns that can save lives. Using 303 anonymized patient records and 13 clinical features, from cholesterol to chest pain type, I built and compared two classification models to predict heart disease risk. Logistic Regression came out on top, hitting 90% accuracy and a 0.94 ROC-AUC score. The real story is in why: in this dataset, risk appears to track the clinical features in a largely linear way, which is the kind of relationship logistic regression is built to capture.

One limitation I noted: no lifestyle data was available. Factors like diet, smoking, and activity level would paint a fuller picture — a real direction for future work.

Python Logistic Regression KNN scikit-learn Pandas NumPy Seaborn Matplotlib
View on GitHub
Logistic Regression Accuracy
90%
Outperformed KNN (85%) — best fit for linear clinical patterns
ROC-AUC Score
0.94
Strong separation between positive and negative cases
Patient Records
303
13 clinical features · 80/20 train-test split · 5-fold cross-validation
  • Scaled numeric features and one-hot encoded categorical variables
  • Split data 80/20 and validated with 5-fold cross-validation
  • Built and compared Logistic Regression and KNN models
  • Evaluated with accuracy, precision, recall, and ROC-AUC — LR won across all four
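
The steps listed above can be sketched in scikit-learn roughly as follows. This is an illustration under stated assumptions, not the project's code: the data is randomly generated, the column names (`age`, `chol`, `chest_pain_type`) and the target are hypothetical stand-ins, and only three features are used instead of thirteen. The scores it prints say nothing about the real results.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (assumption): 303 rows to mirror the record count.
rng = np.random.default_rng(0)
n = 303
df = pd.DataFrame({
    "age": rng.integers(29, 78, n),
    "chol": rng.normal(245, 50, n),
    "chest_pain_type": rng.integers(0, 4, n),
})
# Synthetic binary target built from a noisy linear score.
score = (0.05 * (df["age"] - 54) + 0.01 * (df["chol"] - 245)
         + 0.5 * (df["chest_pain_type"] - 1.5))
y = (score + rng.normal(0, 1, n) > 0).astype(int)

# Scale numeric columns and one-hot encode the categorical one.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age", "chol"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["chest_pain_type"]),
])

# 80/20 split, stratified so both classes appear in each part.
X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "logreg": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

results = {}
for name, model in models.items():
    pipe = Pipeline([("prep", preprocess), ("model", model)])
    # 5-fold cross-validation on the training portion only.
    cv_acc = cross_val_score(pipe, X_train, y_train, cv=5, scoring="accuracy").mean()
    pipe.fit(X_train, y_train)
    proba = pipe.predict_proba(X_test)[:, 1]
    results[name] = {
        "cv_accuracy": cv_acc,
        "test_accuracy": accuracy_score(y_test, pipe.predict(X_test)),
        "test_roc_auc": roc_auc_score(y_test, proba),
    }

for name, metrics in results.items():
    print(name, {k: round(float(v), 3) for k, v in metrics.items()})
```

Keeping the preprocessing inside the `Pipeline` means the scaler and encoder are refit on each cross-validation fold, so no information from the held-out fold leaks into training.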
and there's more on the way

Let's Connect

I'm always building something new. If you want to talk data, career, or community — I'd love to hear from you.