Welcome to Day 24: Data Odyssey, our 365-day journey to master data science and artificial intelligence (AI), launched on Shivaratri, February 26, 2025! Yesterday, in Day 23: Data Odyssey – How Do We Deploy ML Models?, we brought Priya’s Random Forest models to life. Saved as .pkl files, her regression (MAE ₹4) predicted ₹642 for Thursday’s 9 AM Samosa sales, and her classifier (95% cross-val) flagged “Busy”—all via a live script using her 7-row dataset. Today, we shift to time: What is time series analysis, and how can Priya uncover trends in her growing sales data?
The Rhythm of Time
Time series analysis studies data ordered by time—Priya’s sales at 7 AM, 8 AM, 9 AM across days. Unlike her Random Forest (Day 22), which treated rows independently, time series respects sequence: Monday’s ₹600 at 9 AM leads to Tuesday’s ₹650. It’s an “analyze” tool in our workflow (Day 1), spotting patterns—daily peaks, weekly cycles—for better forecasts.
Think of it as Priya tracking her café’s pulse. Sales aren’t random; they flow—8-9 AM surges, rainy days dip. Day 24: Data Odyssey taps this rhythm.
Why Time Series Matters
Priya’s deployed model predicts ₹642 for 9 AM—but why ₹642? Time series answers:
- Trends: Are sales rising weekly?
- Seasonality: Does 9 AM peak daily?
- Forecasts: What’s next week’s 9 AM?
Her 7 rows hint at 8-9 AM rushes (Day 6’s EDA), but time series scales this to weeks and months, so she can stock smarter for more than just today. Day 24: Data Odyssey reveals this.
Priya’s Data as Time Series
Her 7 rows (Day 23), ordered:
   Date       Hour_Num  Item_Code  Sales
0  Monday     7         0          200
1  Monday     8         0          500
2  Monday     9         1          600
3  Tuesday    7         0          150
4  Tuesday    8         0          550
5  Tuesday    9         1          650
6  Wednesday  9         1          640
- Time: Date + Hour_Num.
- Value: Sales.
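- Gaps: Wednesday has no 7 AM or 8 AM rows, and nothing is recorded for 10-11 AM on any day. Assume 0 only where the café made no sales; otherwise treat the slot as missing and fill it later.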
Goal: Analyze trends—9 AM’s rise? Day 24: Data Odyssey starts here.
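To see exactly which slots are missing, one option is to lay the seven rows onto a full day-by-hour grid. This is a minimal sketch that assumes the tracked window is 7-9 AM on all three days; the names `sales`, `expected`, and `grid` are illustrative and not from earlier posts.

```python
import pandas as pd

# Priya's seven observations, keyed by (day, hour).
sales = pd.Series(
    [200, 500, 600, 150, 550, 650, 640],
    index=pd.MultiIndex.from_tuples(
        [("Monday", 7), ("Monday", 8), ("Monday", 9),
         ("Tuesday", 7), ("Tuesday", 8), ("Tuesday", 9),
         ("Wednesday", 9)],
        names=["Date", "Hour_Num"],
    ),
    name="Sales",
)

# Assumption: the window of interest is 7, 8, and 9 AM on each day.
expected = pd.MultiIndex.from_product(
    [["Monday", "Tuesday", "Wednesday"], [7, 8, 9]],
    names=["Date", "Hour_Num"],
)

# Reindexing onto the full grid leaves NaN wherever no row was recorded.
grid = sales.reindex(expected)
missing = grid[grid.isna()].index.tolist()
print(missing)
```

The two Wednesday slots show up as NaN rather than 0, which keeps "not recorded" separate from "no sales".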
Time Series Basics
Key parts:
- Trend: Long-term rise/fall—sales growing?
- Seasonality: Daily cycles—8-9 AM peaks?
- Noise: Random dips—Tuesday’s 7 AM ₹150.
Pandas handles this—time as index. Day 24: Data Odyssey structures it.
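One common way to write these three parts down is the additive model, where each observation is the sum of its components. The symbols are notation introduced here for illustration:

```latex
y_t = T_t + S_t + R_t
```

Here y_t is the sales value at time t, T_t is the trend, S_t is the seasonal (daily) component, and R_t is the leftover noise. For Tuesday 7 AM, y_t = 150, and the question is how much of that low value is the usual 7 AM lull (S_t) and how much is a one-off dip (R_t).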
Setting Up
Convert to time series:
import pandas as pd
# Data with datetime
data = pd.DataFrame({
"Date": ["2025-03-03", "2025-03-03", "2025-03-03", "2025-03-04", "2025-03-04", "2025-03-04", "2025-03-05"],
"Hour_Num": [7, 8, 9, 7, 8, 9, 9],
"Item_Code": [0, 0, 1, 0, 0, 1, 1],
"Sales": [200, 500, 600, 150, 550, 650, 640]
})
data["Datetime"] = pd.to_datetime(data["Date"]) + pd.to_timedelta(data["Hour_Num"], unit="h")
data.set_index("Datetime", inplace=True)
print(data[["Sales"]])
Output:
Sales
2025-03-03 07:00:00 200
2025-03-03 08:00:00 500
2025-03-03 09:00:00 600
2025-03-04 07:00:00 150
2025-03-04 08:00:00 550
2025-03-04 09:00:00 650
2025-03-05 09:00:00 640
Time-indexed—ready! Day 24: Data Odyssey times this.
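Before analyzing, it can help to check how evenly spaced the timestamps are. The sketch below rebuilds the same seven timestamps as a standalone Series (the names `sales` and `gaps` are illustrative) and measures the time between consecutive rows.

```python
import pandas as pd

# The same seven timestamps and sales values as in the setup code.
index = pd.to_datetime([
    "2025-03-03 07:00", "2025-03-03 08:00", "2025-03-03 09:00",
    "2025-03-04 07:00", "2025-03-04 08:00", "2025-03-04 09:00",
    "2025-03-05 09:00",
])
sales = pd.Series([200, 500, 600, 150, 550, 650, 640], index=index, name="Sales")

# A time series should be in chronological order before any analysis.
print(sales.index.is_monotonic_increasing)

# Time between consecutive observations: 1 hour within a morning,
# 22 hours overnight, 24 hours where Wednesday's 7 and 8 AM rows are absent.
gaps = sales.index.to_series().diff()
print(gaps.value_counts())

# With irregular spacing, pandas cannot infer one regular frequency.
print(pd.infer_freq(sales.index))
```

Irregular spacing matters later: tools that assume a fixed number of observations per cycle will misread a series with missing slots.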
Visualizing Trends
Plot it:
import matplotlib.pyplot as plt
data["Sales"].plot(figsize=(10, 6), marker="o", color="teal")
plt.title("Priya’s Sales Over Time")
plt.xlabel("Date and Hour")
plt.ylabel("Sales (₹)")
plt.grid(True)
plt.show()
- Pattern: 7 AM low (₹150-200), 8-9 AM high (₹500-650).
- Trend: 9 AM rises from ₹600 to ₹650, then eases to ₹640 on Wednesday. Three points hint at a slight rise but cannot confirm one.
- Gaps: Missing hours—sparse.
8-9 AM jumps daily—seasonality? Day 24: Data Odyssey sees this.
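To read the 9 AM trend in numbers rather than by eye, one option is to pull out only the 9 AM rows and take day-over-day differences. This is a sketch on a standalone copy of the series; `nine_am` and `change` are illustrative names.

```python
import pandas as pd

index = pd.to_datetime([
    "2025-03-03 07:00", "2025-03-03 08:00", "2025-03-03 09:00",
    "2025-03-04 07:00", "2025-03-04 08:00", "2025-03-04 09:00",
    "2025-03-05 09:00",
])
sales = pd.Series([200, 500, 600, 150, 550, 650, 640], index=index, name="Sales")

# Keep only the rows stamped exactly 09:00.
nine_am = sales.at_time("09:00")
print(nine_am)

# Day-over-day change at 9 AM; the first value has no predecessor, so it is NaN.
change = nine_am.diff()
print(change)
```

Comparing like with like (9 AM against 9 AM) avoids mistaking the daily cycle for a trend.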
Daily Aggregation
Sum by day:
daily = data["Sales"].resample("D").sum()
print(daily)
daily.plot(marker="o", color="teal")
plt.title("Daily Total Sales")
plt.xlabel("Date")
plt.ylabel("Sales (₹)")
plt.grid(True)
plt.show()
Output:
2025-03-03 1300
2025-03-04 1350
2025-03-05 640
- Monday: ₹1300.
- Tuesday: ₹1350—up!
- Wednesday: ₹640—9 AM only.
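Monday to Tuesday is up. Wednesday's drop is not a real dip: only its 9 AM row exists, so the daily total is incomplete. The data is too short to call a trend. Day 24: Data Odyssey sums this.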
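Daily totals are only comparable when each day has the same number of rows. A minimal sketch, assuming a full day means three recorded hours (7, 8, 9 AM): count the rows alongside the sum and flag incomplete days. The `complete` column is a hypothetical helper.

```python
import pandas as pd

index = pd.to_datetime([
    "2025-03-03 07:00", "2025-03-03 08:00", "2025-03-03 09:00",
    "2025-03-04 07:00", "2025-03-04 08:00", "2025-03-04 09:00",
    "2025-03-05 09:00",
])
sales = pd.Series([200, 500, 600, 150, 550, 650, 640], index=index, name="Sales")

# Sum and row count per calendar day.
daily = sales.resample("D").agg(["sum", "count"])

# Assumption: a complete day has all three hourly rows.
daily["complete"] = daily["count"] == 3
print(daily)

# Compare only the days that are fully recorded.
comparable = daily.loc[daily["complete"], "sum"]
print(comparable)
```

With the flag in place, Wednesday's ₹640 is excluded from the comparison instead of reading as a slump.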
Hourly Patterns
Average by hour:
hourly_avg = data.groupby("Hour_Num")["Sales"].mean()
print(hourly_avg)
hourly_avg.plot(kind="bar", color="teal")
plt.title("Average Sales by Hour")
plt.xlabel("Hour")
plt.ylabel("Avg Sales (₹)")
plt.show()
Output:
Hour_Num
7 175.0
8 525.0
9 630.0
- 9 AM: ₹630 avg—rush king.
- 8 AM: ₹525—strong.
- 7 AM: ₹175—quiet.
Daily cycle—9 AM reigns. Day 24: Data Odyssey bars this.
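The hourly averages can also be viewed as a day-by-hour grid, which shows which cells each average is built from. The sketch below pivots the same seven rows and expresses each hour as a deviation from the overall mean; `grid` and `deviation` are illustrative names.

```python
import pandas as pd

data = pd.DataFrame({
    "Date": ["2025-03-03", "2025-03-03", "2025-03-03",
             "2025-03-04", "2025-03-04", "2025-03-04", "2025-03-05"],
    "Hour_Num": [7, 8, 9, 7, 8, 9, 9],
    "Sales": [200, 500, 600, 150, 550, 650, 640],
})

# One row per day, one column per hour; unrecorded slots become NaN.
grid = data.pivot(index="Date", columns="Hour_Num", values="Sales")
print(grid)

# Column means skip NaN, so 9 AM averages three days while 7 and 8 AM average two.
hourly_avg = grid.mean()

# How far each hour sits above or below the overall average sale.
deviation = hourly_avg - data["Sales"].mean()
print(deviation)
```

The deviations are a rough, hand-built version of a daily seasonal pattern: 7 AM sits well below the overall mean, 9 AM well above it.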
Decomposition
Split trend, seasonality (needs more data, but try):
from statsmodels.tsa.seasonal import seasonal_decompose
result = seasonal_decompose(data["Sales"], model="additive", period=3)  # 3 observations per daily cycle (7, 8, 9 AM)
result.plot()
plt.show()
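- Trend: With period=3 this is a 3-point moving average, so it blends 7, 8, and 9 AM values and is undefined at the first and last points. Read it as a rough level, not a confirmed rise.
- Seasonal: 7-8-9 AM pattern.
- Residual: Noise, such as Tuesday’s ₹150 dip.
7 rows limit clarity, and Wednesday’s lone 9 AM row lands in the 7 AM slot of the 3-step cycle, which skews the seasonal estimate. 35 rows (Day 12) sharpen it. Day 24: Data Odyssey breaks this.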
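To make the trend step less of a black box, here is a hand-rolled sketch of a centered 3-point moving average on the same seven values, plus a check of how a fixed 3-step cycle lines up with the real hours. It illustrates the moving-average idea only and is not a reproduction of the statsmodels internals; `trend`, `assumed_hour`, and `mismatch` are hypothetical names.

```python
import pandas as pd

index = pd.to_datetime([
    "2025-03-03 07:00", "2025-03-03 08:00", "2025-03-03 09:00",
    "2025-03-04 07:00", "2025-03-04 08:00", "2025-03-04 09:00",
    "2025-03-05 09:00",
])
sales = pd.Series([200, 500, 600, 150, 550, 650, 640], index=index, name="Sales")

# Centered 3-point moving average: each value is the mean of itself and
# its two neighbours, so the first and last points have no estimate (NaN).
trend = sales.rolling(window=3, center=True).mean()
print(trend.round(2))

# A fixed 3-step cycle assumes the rows go 7, 8, 9, 7, 8, 9, 7, ...
assumed_hour = [7 + (i % 3) for i in range(len(sales))]
actual_hour = sales.index.hour.tolist()

# Pairs of (assumed, actual) hours that disagree.
mismatch = [(a, b) for a, b in zip(assumed_hour, actual_hour) if a != b]
print(mismatch)
```

The last window averages Tuesday 8 AM, Tuesday 9 AM, and Wednesday 9 AM, so the jump at the end of the trend reflects the missing Wednesday rows rather than real growth.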
Why Time Series?
- Context: ₹642 isn’t random—9 AM trend.
- Forecast: Next 9 AM—₹650+?
- Plan: Stock for rushes, not guesses.
Priya’s Random Forest (Day 23) lacks time—series adds it. Day 24: Data Odyssey times her.
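One simple way to give a row-based model a sense of time is a lag feature: the previous sales value for the same hour. This is a sketch only; the column name `Prev_Same_Hour` is hypothetical, and whether it helps Priya's Random Forest is untested here.

```python
import pandas as pd

data = pd.DataFrame({
    "Date": ["2025-03-03", "2025-03-03", "2025-03-03",
             "2025-03-04", "2025-03-04", "2025-03-04", "2025-03-05"],
    "Hour_Num": [7, 8, 9, 7, 8, 9, 9],
    "Sales": [200, 500, 600, 150, 550, 650, 640],
})

# Make sure rows are in chronological order before shifting.
data = data.sort_values(["Date", "Hour_Num"]).reset_index(drop=True)

# Within each hour, shift sales down by one row: every row now carries
# the most recent earlier observation for that same hour.
data["Prev_Same_Hour"] = data.groupby("Hour_Num")["Sales"].shift(1)
print(data)
```

Monday's rows have no earlier value, so their lag is NaN. The shift picks the previous recorded row for that hour, which equals the previous day only when no days are skipped.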
Real-World Time Series
India’s power grid analyzes usage—peaks managed. Amazon tracks daily sales—stock aligns. Priya’s 7-row series is her café’s beat—small, growing. Day 24: Data Odyssey connects this.
Challenges
- Sparse: 7 points, with gaps (Wednesday’s 7-8 AM, and 10-11 AM on every day) that blur the pattern.
- Short: 3 days—no weekly cycle.
- Noise: ₹150 outlier—smooth it?
Day 12’s 35 rows fill gaps—Priya scales. Day 24: Data Odyssey flags this.
Why This Matters
Time series shows Priya’s 9 AM ₹630 avg—stock 40 samosas daily, not random ₹642. Without it, she misses cycles; with it, she plans—profit up. Scale it: time series predicts India’s monsoons—crops thrive. Day 24: Data Odyssey rhythms her.
Recap Summary
Yesterday, Day 23: Data Odyssey deployed Priya’s models—Random Forest predicted ₹642, “Busy” live. Today, Day 24: Data Odyssey introduced time series—9 AM ₹630 avg, daily cycles in 7 rows. It’s her time step.
What’s Next
Tomorrow, in Day 25: Data Odyssey – How Do We Forecast with Time Series?, we’ll predict: What’s Priya’s next 9 AM? We’ll use her series with simple forecasting, aiming past ₹642. Bring your curiosity, and I’ll see you there!