Updated March 2026 – Now compatible with latest Python 3.12+ features and libraries. Track your expenses smarter with basic machine learning!
In 2026, with rising costs everywhere (especially in places like Kathmandu!), tracking personal expenses manually is tedious. Why not build a simple yet powerful Python script that:
- Loads your expenses from a CSV file
- Visualizes spending patterns with charts
- Uses basic ML (K-Means clustering) to automatically group similar expenses (e.g., "daily essentials" vs "impulse buys")
This is perfect for beginners/intermediates in Python, Pandas, Matplotlib, and scikit-learn. No web apps or complex GUIs — just a clean script you can run locally.
Prerequisites
Install these libraries (run in a terminal or command prompt):
pip install pandas matplotlib scikit-learn
Step 1: Prepare Your Expenses Data
Create a file named expenses.csv in the same folder as your script. Example content (copy-paste into a text editor and save as .csv):
Date,Category,Amount,Description
2026-01-05,Food,450,Lunch at local dhaba
2026-01-10,Transportation,200,Bus fare to office
2026-01-15,Food,1200,Groceries for week
2026-01-20,Entertainment,800,Movie tickets
2026-02-01,Household,1500,Electricity bill
2026-02-10,Food,300,Snacks
2026-02-15,Transportation,150,Petrol
2026-03-01,Food,600,Family dinner
2026-03-05,Subscriptions,499,Netflix monthly
Step 2: Load and Analyze Data with Pandas
Basic stats and category breakdown.
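Before analyzing anything, it can help to confirm that the file loads the way you expect. The sketch below is one possible check, not part of the original script: the helper name load_expenses and the REQUIRED_COLUMNS set are made up for illustration, and the sample rows are read from an in-memory string so the snippet runs without a file on disk.

```python
import io

import pandas as pd

# Same rows as the sample expenses.csv, kept in memory for this demo
SAMPLE_CSV = """Date,Category,Amount,Description
2026-01-05,Food,450,Lunch at local dhaba
2026-01-10,Transportation,200,Bus fare to office
2026-01-15,Food,1200,Groceries for week
2026-01-20,Entertainment,800,Movie tickets
2026-02-01,Household,1500,Electricity bill
2026-02-10,Food,300,Snacks
2026-02-15,Transportation,150,Petrol
2026-03-01,Food,600,Family dinner
2026-03-05,Subscriptions,499,Netflix monthly
"""

# Hypothetical name: the columns the rest of the tutorial relies on
REQUIRED_COLUMNS = {"Date", "Category", "Amount", "Description"}


def load_expenses(source):
    """Read expenses, check the columns, and drop rows that fail to parse."""
    df = pd.read_csv(source)
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
    # Unparseable values become NaT/NaN instead of raising an error
    df["Date"] = pd.to_datetime(df["Date"], errors="coerce")
    df["Amount"] = pd.to_numeric(df["Amount"], errors="coerce")
    bad_rows = df["Date"].isna() | df["Amount"].isna()
    return df[~bad_rows].reset_index(drop=True)


expenses = load_expenses(io.StringIO(SAMPLE_CSV))
print(len(expenses), "rows loaded, total:", expenses["Amount"].sum())
```

With a real file you would pass the path instead, for example load_expenses('expenses.csv').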
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.preprocessing import LabelEncoder
# Load data
df = pd.read_csv('expenses.csv')
# Convert Date to datetime
df['Date'] = pd.to_datetime(df['Date'])
# Basic overview
print("Total Expenses:", df['Amount'].sum())
print("\nExpenses by Category:")
print(df.groupby('Category')['Amount'].sum().sort_values(ascending=False))
# Pie chart for category distribution
category_sum = df.groupby('Category')['Amount'].sum()
plt.figure(figsize=(10, 7))
plt.pie(category_sum, labels=category_sum.index, autopct='%1.1f%%', startangle=90)
plt.title('Expenses Breakdown by Category (2026)')
plt.axis('equal')
plt.show()
Step 3: Visualize Monthly Trends (Bar Chart)
Add this for a monthly overview:
# Monthly spending
df['Month'] = df['Date'].dt.to_period('M')
monthly = df.groupby('Month')['Amount'].sum().reset_index()
plt.figure(figsize=(10, 6))
plt.bar(monthly['Month'].astype(str), monthly['Amount'], color='skyblue')
plt.title('Monthly Expenses Trend')
plt.xlabel('Month')
plt.ylabel('Total Amount (NPR)')
plt.xticks(rotation=45)
plt.tight_layout()
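plt.show()
Step 4: Add Basic ML – Cluster Expenses with K-Means
This is the "AI-powered" part! We cluster expenses based on Amount and Category to find hidden patterns. Keep in mind that the category is label-encoded to an arbitrary integer and neither feature is scaled, so with amounts in the hundreds or thousands the clusters are driven almost entirely by Amount.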
# Prepare data for clustering
le = LabelEncoder()
df['Category_encoded'] = le.fit_transform(df['Category'])
X = df[['Amount', 'Category_encoded']]
# Apply K-Means (choose 3-5 clusters; here 4)
kmeans = KMeans(n_clusters=4, random_state=42, n_init=10)
df['Cluster'] = kmeans.fit_predict(X)
# Show clusters
print("\nExpenses grouped by Cluster:")
for i in range(4):
    print(f"\nCluster {i}:")
    print(df[df['Cluster'] == i][['Category', 'Amount', 'Description']].head())
# Optional: Visualize clusters (scatter plot)
plt.figure(figsize=(10, 7))
scatter = plt.scatter(df['Amount'], df['Category_encoded'], c=df['Cluster'], cmap='viridis')
plt.colorbar(scatter, label='Cluster')
plt.xlabel('Amount')
plt.ylabel('Encoded Category')
plt.title('Expense Clusters (K-Means)')
plt.show()
What the clusters might mean:
- Cluster 0: Small daily spends (e.g., food/snacks)
- Cluster 1: Medium transportation/household
- Cluster 2: Higher entertainment/subscriptions
- Cluster 3: Outliers/large bills
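(Note: Actual clusters depend on your data, and the cluster numbers themselves are arbitrary labels, so the list above is only an illustration. In the sample, expect groupings roughly along the lines of small food/transport spends, medium bills, etc.)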
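If you want Category to carry real weight in the clustering, one common option is to standardize Amount and one-hot encode Category before fitting K-Means. This is a swapped-in preprocessing step, not what the script in this post does. The sketch below rebuilds the sample data inline; the choice of 3 clusters and the column name Amount_scaled are assumptions for illustration.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Same rows as the sample expenses.csv (Date and Description omitted for brevity)
df = pd.DataFrame({
    "Category": ["Food", "Transportation", "Food", "Entertainment", "Household",
                 "Food", "Transportation", "Food", "Subscriptions"],
    "Amount": [450, 200, 1200, 800, 1500, 300, 150, 600, 499],
})

# Standardize Amount to zero mean and unit variance so it no longer dominates
amount_scaled = StandardScaler().fit_transform(df[["Amount"]])

# One-hot encode Category so no artificial ordering is implied between categories
features = pd.get_dummies(df["Category"], dtype=float)
features.insert(0, "Amount_scaled", amount_scaled.ravel())

# n_clusters=3 is an arbitrary choice for this small sample
kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
df["Cluster"] = kmeans.fit_predict(features)

print(df.sort_values("Cluster").to_string(index=False))
```

One-hot encoding avoids implying that, say, Food (0) is "closer" to Household (1) than to Transportation (4), which is what the integer codes from LabelEncoder suggest to a distance-based algorithm.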
Complete Script (Copy-Paste Ready)
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.preprocessing import LabelEncoder
# Load and prepare
df = pd.read_csv('expenses.csv')
df['Date'] = pd.to_datetime(df['Date'])
print("=== Budget Summary ===")
print("Total Expenses:", df['Amount'].sum())
print("\nBy Category:\n", df.groupby('Category')['Amount'].sum().sort_values(ascending=False))
# Pie chart
category_sum = df.groupby('Category')['Amount'].sum()
plt.figure(figsize=(10,7))
plt.pie(category_sum, labels=category_sum.index, autopct='%1.1f%%', startangle=90)
plt.title('Expenses Breakdown by Category (2026)')
plt.show()
# Monthly bar
df['Month'] = df['Date'].dt.to_period('M')
monthly = df.groupby('Month')['Amount'].sum().reset_index()
plt.figure(figsize=(10,6))
plt.bar(monthly['Month'].astype(str), monthly['Amount'], color='skyblue')
plt.title('Monthly Expenses')
plt.xticks(rotation=45)
plt.show()
# Clustering
le = LabelEncoder()
df['Category_encoded'] = le.fit_transform(df['Category'])
X = df[['Amount', 'Category_encoded']]
kmeans = KMeans(n_clusters=4, random_state=42, n_init=10)
df['Cluster'] = kmeans.fit_predict(X)
print("\n=== Clustered Expenses ===")
for i in range(4):
    print(f"Cluster {i} (size: {len(df[df['Cluster']==i])}):")
    print(df[df['Cluster']==i][['Category','Amount','Description']].to_string(index=False))
# Scatter plot of clusters
plt.figure(figsize=(10,7))
plt.scatter(df['Amount'], df['Category_encoded'], c=df['Cluster'], cmap='viridis')
plt.colorbar(label='Cluster')
plt.xlabel('Amount')
plt.ylabel('Category (Encoded)')
plt.title('AI-Powered Expense Clusters')
plt.show()
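The script above hard-codes n_clusters=4. If you are unsure how many clusters suit your own data, one rough heuristic is to compare silhouette scores for a few values of k and pick the highest. The sketch below is only an illustration under stated assumptions: it clusters on the scaled Amount column alone, tries k from 2 to 5, and uses the sample amounts from Step 1. With so few rows the scores are noisy, so treat the result as a hint rather than an answer.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Amounts from the sample expenses.csv
amounts = pd.DataFrame({"Amount": [450, 200, 1200, 800, 1500, 300, 150, 600, 499]})
X = StandardScaler().fit_transform(amounts)

scores = {}
for k in range(2, 6):  # silhouette needs at least 2 clusters and fewer clusters than rows
    labels = KMeans(n_clusters=k, random_state=42, n_init=10).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# Higher silhouette means tighter, better-separated clusters
best_k = max(scores, key=scores.get)

for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print("Suggested number of clusters:", best_k)
```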
Conclusion & Next Steps
Congratulations! You now have a smart budget tracker that uses basic AI to reveal spending patterns. Run it monthly with updated CSV data.
Ideas to extend:
- Add income tracking
- Export reports to PDF
- Integrate with Google Sheets
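As a starting point for the first idea, income tracking can be as simple as a second table with the same Date and Amount columns, compared against expenses month by month. The sketch below is a minimal illustration: the income figures are invented, and the helper name monthly_total is hypothetical. The expense rows are the sample data from Step 1.

```python
import pandas as pd

# Expenses from the sample expenses.csv (only the columns needed here)
expenses = pd.DataFrame({
    "Date": pd.to_datetime([
        "2026-01-05", "2026-01-10", "2026-01-15", "2026-01-20", "2026-02-01",
        "2026-02-10", "2026-02-15", "2026-03-01", "2026-03-05",
    ]),
    "Amount": [450, 200, 1200, 800, 1500, 300, 150, 600, 499],
})

# Invented income figures, purely for illustration
income = pd.DataFrame({
    "Date": pd.to_datetime(["2026-01-01", "2026-02-01", "2026-03-01"]),
    "Amount": [5000, 5000, 5000],
})


def monthly_total(frame):
    """Sum the Amount column per calendar month."""
    return frame.groupby(frame["Date"].dt.to_period("M"))["Amount"].sum()


summary = pd.DataFrame({
    "Income": monthly_total(income),
    "Expenses": monthly_total(expenses),
}).fillna(0)  # months present in only one table count as 0 in the other
summary["Net"] = summary["Income"] - summary["Expenses"]

print(summary)
```

In practice you would load income from its own CSV file, the same way expenses.csv is loaded.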
Try it with your real expenses and share in the comments: What cluster surprised you most?
Happy coding! If you liked this, check my older posts on Python Matplotlib Graphs or Recommendation Systems.