100 MINS intermediate
9. Machine Learning Core
Supervised vs Unsupervised, Training, and Evaluation
Machine learning is the practice of building systems that learn patterns from data to make predictions or find structure. The gap between a model that works in a notebook and one that works in production is almost entirely about evaluation: choosing the right metrics, avoiding data leakage, understanding generalization, and communicating uncertainty. This module covers the foundational ML framework used across every algorithm and application.
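Data leakage is easiest to appreciate on data that contains no signal at all. The following is a minimal sketch, not part of the module's churn example: the noise data, the feature counts, and names such as `X_noise` and `honest_pipeline` are illustrative assumptions. It compares selecting features on all rows before cross-validation (leaky) with selecting them inside a `Pipeline` (refit on each training fold).

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)

# Pure noise: 100 samples, 5000 features, labels unrelated to the features.
X_noise = rng.normal(size=(100, 5000))
y_noise = rng.integers(0, 2, size=100)

# Leaky: pick the 20 "best" features using ALL rows, then cross-validate.
# The held-out folds have already influenced which features were kept.
X_selected = SelectKBest(f_classif, k=20).fit_transform(X_noise, y_noise)
leaky_scores = cross_val_score(
    LogisticRegression(max_iter=1000), X_selected, y_noise, cv=5
)

# Honest: selection lives inside the pipeline, so it is refit on each
# training fold and never sees the held-out rows.
honest_pipeline = Pipeline([
    ('select', SelectKBest(f_classif, k=20)),
    ('model', LogisticRegression(max_iter=1000)),
])
honest_scores = cross_val_score(honest_pipeline, X_noise, y_noise, cv=5)

print(f'Leaky CV accuracy:  {leaky_scores.mean():.3f}')
print(f'Honest CV accuracy: {honest_scores.mean():.3f}')
```

Because the labels are random, no model can genuinely beat chance here. Under these assumptions the leaky estimate should come out well above the honest one, and that gap is the kind of optimism leakage introduces.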
The ML Algorithm Map
The right algorithm choice depends on: the type of target variable, the size and dimensionality of your data, interpretability requirements, and the prediction task.
- Supervised Learning: You have labeled data (X → y). Regression (predict a number), Classification (predict a category).
- Unsupervised Learning: No labels. Clustering (find natural groups), Dimensionality Reduction (compress features), Anomaly Detection (find outliers).
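- Semi-Supervised Learning: A small amount of labeled data plus a large amount of unlabeled data. LLM pretraining is a related case: it is self-supervised, meaning the training targets (such as the next token) are derived from the unlabeled data itself.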
- Reinforcement Learning: Agent learns optimal actions through rewards. Not covered in this track but used in recommendation systems and game AI.
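The difference between the first two categories can be sketched on a toy dataset. This is an illustration under stated assumptions: the synthetic blobs, their hand-picked centers, and the choice of `LogisticRegression` and `KMeans` are not from the module. The classifier is given the labels, while the clustering step sees only the features.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import train_test_split

# Synthetic data: 300 points in 3 well-separated groups (assumed centers).
X_demo, y_demo = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=1.0,
    random_state=0,
)

# Supervised: the labels y_demo are used during training.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0, stratify=y_demo
)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
supervised_acc = clf.score(X_te, y_te)

# Unsupervised: KMeans sees only X_demo and has to discover the groups.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X_demo)

# Cluster ids are arbitrary, so compare with a permutation-invariant score.
cluster_agreement = adjusted_rand_score(y_demo, cluster_ids)

print(f'Classifier test accuracy: {supervised_acc:.3f}')
print(f'Cluster vs. label agreement (ARI): {cluster_agreement:.3f}')
```

The adjusted Rand index is used for the clustering result because cluster numbers carry no meaning; cluster 0 need not correspond to label 0.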
The Training Pipeline
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score, StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, roc_auc_score
import warnings
warnings.filterwarnings('ignore')
# Load and split data
df = pd.read_csv('churn_dataset.csv')
X = df.drop('churned', axis=1)
y = df['churned']
# CRITICAL: Split FIRST, then fit any transformers on training data only
# This prevents data leakage from test set statistics into the model
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42,
    stratify=y  # maintain class distribution in both sets
)
print(f'Train: {X_train.shape}, Test: {X_test.shape}')
print(f'Train churn rate: {y_train.mean():.3f}, Test: {y_test.mean():.3f}')
# Use Pipeline to prevent leakage: the scaler is fit on training data only
model_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('model', RandomForestClassifier(n_estimators=200, random_state=42, n_jobs=-1))
])
# Cross-validation for reliable performance estimate
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
cv_scores = cross_val_score(model_pipeline, X_train, y_train,
                            cv=cv, scoring='roc_auc', n_jobs=-1)
print(f'\n5-Fold CV ROC-AUC: {cv_scores.mean():.4f} ± {cv_scores.std():.4f}')
# Final fit and evaluation
model_pipeline.fit(X_train, y_train)
y_pred = model_pipeline.predict(X_test)
y_prob = model_pipeline.predict_proba(X_test)[:, 1]
print('\n=== TEST SET EVALUATION ===')
print(classification_report(y_test, y_pred, target_names=['Retained', 'Churned']))
print(f'ROC-AUC: {roc_auc_score(y_test, y_prob):.4f}')
The Evaluation Metrics Playbook
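Accuracy alone can mislead on an imbalanced target such as churn, which is why the playbook reports several metrics side by side. The sketch below is illustrative only: the 5% positive rate, the synthetic labels, and names such as `baseline` and `y_imbalanced` are assumptions, not values from the churn dataset. It scores a baseline that always predicts the majority class.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, average_precision_score, recall_score

rng = np.random.default_rng(42)

# Hypothetical labels with roughly a 5% positive (churn) rate.
y_imbalanced = (rng.random(2000) < 0.05).astype(int)
X_placeholder = np.zeros((len(y_imbalanced), 1))  # features are ignored by this baseline

# Baseline that always predicts the majority class ("nobody churns").
baseline = DummyClassifier(strategy='most_frequent').fit(X_placeholder, y_imbalanced)
baseline_pred = baseline.predict(X_placeholder)
baseline_prob = baseline.predict_proba(X_placeholder)[:, 1]

baseline_accuracy = accuracy_score(y_imbalanced, baseline_pred)
baseline_recall = recall_score(y_imbalanced, baseline_pred, zero_division=0)
baseline_pr_auc = average_precision_score(y_imbalanced, baseline_prob)

print(f'Accuracy: {baseline_accuracy:.3f}')  # high, despite finding no churners
print(f'Recall:   {baseline_recall:.3f}')    # no positives are ever predicted
print(f'PR-AUC:   {baseline_pr_auc:.3f}')    # equals the positive rate for a constant score
```

A model that never flags a churner still scores high on accuracy, while recall and PR-AUC expose that it has learned nothing useful. Comparing any real model against a baseline like this is a cheap sanity check.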
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    roc_auc_score, average_precision_score, confusion_matrix,
    mean_squared_error, mean_absolute_error, r2_score
)
import numpy as np
# Classification Metrics
def evaluate_classifier(y_true, y_pred, y_prob, positive_label='Churn'):
    # For binary labels, ravel() yields the counts in the order tn, fp, fn, tp
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print('=== CLASSIFICATION EVALUATION ===')
    print(f'Accuracy:           {accuracy_score(y_true, y_pred):.4f}')
    print(f'Precision:          {precision_score(y_true, y_pred):.4f} (of predicted positives, how many are correct?)')
    print(f'Recall/Sensitivity: {recall_score(y_true, y_pred):.4f} (of actual positives, how many were found?)')
    print(f'Specificity:        {tn/(tn+fp):.4f} (of actual negatives, how many were correctly identified?)')
    print(f'F1 Score:           {f1_score(y_true, y_pred):.4f} (harmonic mean of precision and recall)')
    print(f'ROC-AUC:            {roc_auc_score(y_true, y_prob):.4f} (ranking quality, threshold-independent)')
    print(f'PR-AUC:             {average_precision_score(y_true, y_prob):.4f} (better for imbalanced data)')
    print(f'\nConfusion Matrix:')
    print(f'  True Positives:  {tp} | False Positives: {fp}')
    print(f'  False Negatives: {fn} | True Negatives:  {tn}')
# Regression Metrics
def evaluate_regressor(y_true, y_pred):
    # Convert to arrays so the MAPE arithmetic also works on plain lists
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = mean_squared_error(y_true, y_pred)
    mae = mean_absolute_error(y_true, y_pred)
    r2 = r2_score(y_true, y_pred)
    # MAPE is undefined where y_true == 0, so average over nonzero targets only
    nonzero = y_true != 0
    mape = np.mean(np.abs((y_true[nonzero] - y_pred[nonzero]) / y_true[nonzero])) * 100
    print('=== REGRESSION EVALUATION ===')
    print(f'RMSE: {np.sqrt(mse):.4f} (in same units as target, penalizes large errors)')
    print(f'MAE:  {mae:.4f} (in same units, more robust to outliers than RMSE)')
    print(f'MAPE: {mape:.2f}% (percentage error, computed on nonzero targets)')
    print(f'R²:   {r2:.4f} (proportion of variance explained by model)')
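The classification metrics in the playbook can be reproduced by hand from the four confusion-matrix counts, which is a useful check on what each number means. The following is a small worked example; the ten labels and predictions are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score

# Hypothetical labels and predictions (1 = churned, 0 = retained).
y_true_demo = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred_demo = np.array([1, 1, 0, 0, 1, 0, 0, 0, 0, 0])

# For binary 0/1 labels, ravel() returns the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo).ravel()

precision = tp / (tp + fp)     # 2 / 3: of 3 predicted churners, 2 were real
recall = tp / (tp + fn)        # 2 / 4: of 4 real churners, 2 were found
specificity = tn / (tn + fp)   # 5 / 6: of 6 retained customers, 5 were left alone
f1 = 2 * precision * recall / (precision + recall)  # 4 / 7

# The hand-computed values agree with scikit-learn's implementations.
assert np.isclose(precision, precision_score(y_true_demo, y_pred_demo))
assert np.isclose(recall, recall_score(y_true_demo, y_pred_demo))
assert np.isclose(f1, f1_score(y_true_demo, y_pred_demo))

print(f'TP={tp} FP={fp} FN={fn} TN={tn}')
print(f'Precision={precision:.3f} Recall={recall:.3f} '
      f'Specificity={specificity:.3f} F1={f1:.3f}')
```

Precision and recall differ here because the model makes one false alarm but misses two churners. Which error matters more depends on the cost of each in the application.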