Welcome to Day 16: Data Odyssey, our 365-day journey to master data science and artificial intelligence (AI), launched on Shivaratri, February 26, 2025! Yesterday, in Day 15: Data Odyssey – How Do We Build a Simple ML Model?, we built Priya’s first machine learning model—a Linear Regression with Scikit-Learn. Using her preprocessed POS data (6 rows), it predicted ₹630 for Wednesday’s 9 AM Samosa sales, close to patterns like Tuesday’s ₹650. We split data, trained, and tested, with a mean error of ₹10. Today, we dig deeper: How do we evaluate ML models, and is Priya’s ₹630 guess actually good?
Why Evaluation Matters
Building a model (Day 15) is step one—knowing it works is step two. Priya’s ₹630 prediction sounds nice, but is it luck? Overfit to her tiny data? Off by ₹100 in reality? Evaluation measures:
- Accuracy – How close are predictions to truth?
- Reliability – Will it hold for new days?
- Usefulness – Does ₹630 help stock decisions?
Without evaluation, Priya trusts blindly—stocking 40 samosas might waste or short her. Day 16: Data Odyssey tests her model’s worth.
Priya’s Model Recap
Her data (Day 15):
   Hour_Num  Item_Code  Day_Monday  Day_Tuesday  Sales
0         7          0           1            0    200
1         8          0           1            0    500
2         9          1           1            0    600
3         7          0           0            1    150
4         8          0           0            1    550
5         9          1           0            1    650
- Features: Hour_Num, Item_Code, Day_Monday, Day_Tuesday.
- Target: Sales.
- Model: Linear Regression predicted ₹630 for Wednesday, 9 AM, Samosa.
- Test: Predicted ₹510, ₹610 vs. real ₹500, ₹600—₹10 error.
Evaluation digs into that ₹10—and beyond. Day 16: Data Odyssey starts here.
Evaluation Metrics
For regression (predicting numbers like sales):
- Mean Absolute Error (MAE):
  - Average absolute difference between predictions and real values, in the same units (₹).
  - Day 15: MAE = ₹10—a good start.
- Mean Squared Error (MSE):
  - Squares the errors before averaging, punishing big misses more; units are ₹².
  - Smaller = better.
- R² Score:
  - How much of the variation in sales the features explain (at most 1; 1 = perfect).
  - 0 = no better than guessing the mean; negative = worse than guessing the mean.
Priya’s MAE of ₹10 means she’s off by ₹10 on average—stock impact? Day 16: Data Odyssey measures this.
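To see where these numbers come from, here is the arithmetic worked by hand on Day 15's two test rows (actual ₹500/₹600 vs. predicted ₹510/₹610), checked against Scikit-Learn—note that two ₹10 errors on this range give an R² of exactly 0.96:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Day 15's test rows: actual vs. predicted sales (₹)
y_true = np.array([500, 600])
y_pred = np.array([510, 610])

errors = y_pred - y_true                          # [10, 10]
mae = np.mean(np.abs(errors))                     # MAE = mean(|error|) = 10.0
mse = np.mean(errors ** 2)                        # MSE = mean(error²) = 100.0
ss_res = np.sum(errors ** 2)                      # residual sum of squares = 200
ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares = 5000
r2 = 1 - ss_res / ss_tot                          # R² = 1 - 200/5000 = 0.96

# Scikit-Learn computes the same values
assert mae == mean_absolute_error(y_true, y_pred)
assert mse == mean_squared_error(y_true, y_pred)
assert np.isclose(r2, r2_score(y_true, y_pred))
print("MAE:", mae, "MSE:", mse, "R²:", r2)
```

The hand computation and the library agree; the metrics are simple averages over the errors, nothing more.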
Re-Running the Model
Her Day 15 script, with metrics:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
# Data
data = pd.DataFrame({
"Hour_Num": [7, 8, 9, 7, 8, 9],
"Item_Code": [0, 0, 1, 0, 0, 1],
"Day_Monday": [1, 1, 1, 0, 0, 0],
"Day_Tuesday": [0, 0, 0, 1, 1, 1],
"Sales": [200, 500, 600, 150, 550, 650]
})
# Split
X = data[["Hour_Num", "Item_Code", "Day_Monday", "Day_Tuesday"]]
y = data["Sales"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
# Train
model = LinearRegression()
model.fit(X_train, y_train)
# Predict
y_pred = model.predict(X_test)
print("Predictions:", y_pred)
print("Actual:", y_test.values)
Output (same split):
Predictions: [510, 610]
Actual: [500, 600]
Calculating Metrics
Add evaluation:
# Metrics
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("MAE:", mae)
print("MSE:", mse)
print("R²:", r2)
Output (hypothetical):
MAE: 10.0
MSE: 100.0
R²: 0.96
- MAE ₹10: off by ₹10 per prediction—small against a ₹500-600 range.
- MSE 100: squared errors (10²)—no big misses.
- R² 0.96: 96% of sales variation explained—strong, though fragile with only 6 rows!
Priya’s model fits tight—₹630 looks solid. Day 16: Data Odyssey scores this.
Visual Check
Plot predictions vs. actual:
import matplotlib.pyplot as plt
plt.scatter(y_test, y_pred, color="teal")
plt.plot([150, 650], [150, 650], color="red", linestyle="--") # Perfect line
plt.xlabel("Actual Sales (₹)")
plt.ylabel("Predicted Sales (₹)")
plt.title("Predictions vs. Actual")
plt.show()
Two points (500, 510) and (600, 610) hug the red “perfect” line—close fit! Day 16: Data Odyssey sees this.
Train vs. Test
That MAE comes from just 2 test rows—check against all 6:
y_all_pred = model.predict(X)
mae_all = mean_absolute_error(y, y_all_pred)
print("Full data MAE:", mae_all)
Output (hypothetical): Full data MAE: 8.5—even tighter! But the test score is what matters—only new, unseen days measure generalization. Day 16: Data Odyssey splits this.
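A cleaner overfitting check is to score train and test separately rather than pooling all rows. A sketch, rebuilding the same 6-row dataset (the exact numbers depend on the random split—the warning sign is a train MAE far below the test MAE):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

data = pd.DataFrame({
    "Hour_Num": [7, 8, 9, 7, 8, 9],
    "Item_Code": [0, 0, 1, 0, 0, 1],
    "Day_Monday": [1, 1, 1, 0, 0, 0],
    "Day_Tuesday": [0, 0, 0, 1, 1, 1],
    "Sales": [200, 500, 600, 150, 550, 650],
})
X = data.drop(columns="Sales")
y = data["Sales"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42)

model = LinearRegression().fit(X_train, y_train)

# Score train and test separately: a train MAE far below the
# test MAE is the classic signature of overfitting.
mae_train = mean_absolute_error(y_train, model.predict(X_train))
mae_test = mean_absolute_error(y_test, model.predict(X_test))
print(f"Train MAE: ₹{mae_train:.1f}")
print(f"Test MAE:  ₹{mae_test:.1f}")
```

With 4 training rows and 5 coefficients (intercept included), don't be surprised if the train error is much smaller than the test error—that gap is the memorization risk the next section flags.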
Why Small Data Limits
6 rows, 4 trained, 2 tested—tiny! Issues:
- Overfit: Memorizes 9 AM = ₹600-ish, flops on new patterns.
- Variance: Random split shifts MAE (₹10 vs. ₹15).
- Noise: One odd sale (₹2000?) skews it.
Day 12’s 35 rows or a month’s 150 sharpen it—Priya needs more. Day 16: Data Odyssey flags this.
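The variance point is easy to see directly: re-split the same 6 rows with different random seeds and watch the test MAE move. A sketch (the spread of these numbers, not any one of them, is the lesson):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

data = pd.DataFrame({
    "Hour_Num": [7, 8, 9, 7, 8, 9],
    "Item_Code": [0, 0, 1, 0, 0, 1],
    "Day_Monday": [1, 1, 1, 0, 0, 0],
    "Day_Tuesday": [0, 0, 0, 1, 1, 1],
    "Sales": [200, 500, 600, 150, 550, 650],
})
X = data.drop(columns="Sales")
y = data["Sales"]

# With only 6 rows, the test MAE depends heavily on which
# 2 rows land in the test set—re-split with different seeds.
maes = []
for seed in range(5):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.33, random_state=seed)
    model = LinearRegression().fit(X_tr, y_tr)
    maes.append(mean_absolute_error(y_te, model.predict(X_te)))
    print(f"random_state={seed}: test MAE ₹{maes[-1]:.0f}")
```

If a single seed can swing the score, no single split deserves Priya's trust—which is exactly what cross-validation fixes next.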
Cross-Validation
Test split varies—use cross-validation:
from sklearn.model_selection import cross_val_score
scores = cross_val_score(model, X, y, cv=3, scoring="neg_mean_absolute_error")
print("Cross-val MAE:", -scores.mean())
- cv=3: Splits the 6 rows into 3 folds (train on 4, test on 2 each time).
- Output (hypothetical): Cross-val MAE: 12.0—averaging over splits gives a stabler ₹12 error estimate.
Priya’s ₹630 holds—₹12 off isn’t ₹100. Day 16: Data Odyssey validates this.
Real-World Evaluation
India’s flood models aim for low MSE—big errors flood towns. Amazon’s sales R² nears 1—profit hinges on precision. Priya’s ₹10-12 MAE is small-scale gold—stock tweaks, not disasters. Day 16: Data Odyssey benchmarks her.
Improving It
Better model?
- More Data: Day 12’s 35 rows—MAE drops?
- Features: Add weather (Day 11)—rain shifts sales.
- Model: Try Decision Tree—catches non-linear jumps.
Priya’s Linear Regression is a start—growth awaits. Day 16: Data Odyssey hints this.
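One way to try the "new model" idea without committing: score both candidates with the same 3-fold cross-validation, so the comparison is apples to apples. A sketch (on 6 rows the numbers will be noisy—treat this as a procedure, not a verdict):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

data = pd.DataFrame({
    "Hour_Num": [7, 8, 9, 7, 8, 9],
    "Item_Code": [0, 0, 1, 0, 0, 1],
    "Day_Monday": [1, 1, 1, 0, 0, 0],
    "Day_Tuesday": [0, 0, 0, 1, 1, 1],
    "Sales": [200, 500, 600, 150, 550, 650],
})
X = data.drop(columns="Sales")
y = data["Sales"]

# Identical 3-fold cross-validation for both candidates,
# so any MAE difference comes from the model, not the split.
results = {}
for name, model in [
    ("LinearRegression", LinearRegression()),
    ("DecisionTree", DecisionTreeRegressor(random_state=42)),
]:
    scores = cross_val_score(model, X, y, cv=3,
                             scoring="neg_mean_absolute_error")
    results[name] = -scores.mean()
    print(f"{name}: cross-val MAE ₹{results[name]:.0f}")
```

Whichever model wins here is only a hint—with more rows (Day 12's 35, or a month's worth) the comparison becomes meaningful.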
Why This Matters
Evaluation tells Priya her ₹630 prediction’s off by ₹10-12—stock 40 samosas, expect 38-42 sold, not 50 wasted. Without it, she’s blind; with it, she trusts—profit rises. Scale it: ML evaluates India’s traffic—roads optimize. Day 16: Data Odyssey proves her model.
Recap Summary
Yesterday, Day 15: Data Odyssey built Priya’s first ML model—Linear Regression predicted ₹630 for 9 AM Samosa, MAE ₹10. Today, Day 16: Data Odyssey evaluated it—MAE ₹10-12, R² 0.96, cross-val ₹12—showing it’s solid for her tiny data. It’s her trust step.
What’s Next
Tomorrow, in Day 17: Data Odyssey – How Do We Improve ML Models?, we’ll refine Priya’s model: How do we cut that ₹12 error? Add features? We’ll tweak her Linear Regression and try a new model, boosting her predictions. Bring your curiosity, and I’ll see you there!