A full-stack e-commerce analytics system. Upload any e-commerce CSV — the system auto-detects columns, cleans data, computes analytics, runs ML models, and displays everything on a dark-themed dashboard.
conda create -n ecom_analytics python=3.10
conda activate ecom_analytics
pip install -r backend/requirements.txt
cd backend
uvicorn main:app --reload
# API → http://localhost:8000
# Swagger → http://localhost:8000/docs
cd frontend
npm install
npm run dev
# App → http://localhost:5173
Upload data/sample_ecommerce.csv, or any e-commerce CSV you have (UCI Online Retail, Kaggle datasets, etc.).
| Feature | Description |
|---|---|
| Flexible CSV ingestion | Accepts CSV (UTF-8 or Latin-1) and Excel (XLSX/XLS) — no fixed column schema required |
| Auto column detection | Maps column names to semantic roles (revenue, date, customer, product…) via keyword matching |
| Derived columns | Computes total = unit_price × quantity automatically when no revenue column exists |
| Data cleaning | Deduplication, null imputation, date parsing, float customer ID normalisation, negative quantity removal |
| Analytics | KPI cards, top products by revenue, monthly trend, top customers by spend |
| Apriori | Market basket association rules — what products are bought together per order |
| Repeat Purchase Prediction | Rule-based customer segmentation (Loyal / Returning / New), rebuy scores, loyalty patterns from Apriori on repeat-buyer baskets |
| RFM Customer Segmentation | Formal Recency/Frequency/Monetary scoring (1–5 per dimension) with K-Means clustering (k=3) → High Value / At Risk / Low Engagement |
| Dark UI | Black OLED theme, Fira Code/Sans fonts, glow effects, animated charts |
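To make the RFM row above more concrete, here is a minimal plain-Python sketch of building a Recency/Frequency/Monetary table and scoring each dimension from 1 to 5. It is an illustration under stated assumptions, not the backend's code: the function names `rfm_table` and `quintile_scores`, the order tuple layout, and the snapshot date are all hypothetical, and the rank-based scoring is a simple stand-in for whatever quantile routine the backend applies with Pandas. The K-Means step (k=3) that follows scoring is not shown.

```python
from collections import defaultdict
from datetime import date


def rfm_table(orders, snapshot):
    """Build per-customer RFM values.

    orders: iterable of (customer_id, order_id, order_date, amount) tuples
            (hypothetical layout chosen for this sketch).
    snapshot: the date recency is measured from (an assumption).
    """
    last_seen = {}
    order_ids = defaultdict(set)
    spend = defaultdict(float)
    for customer, order_id, order_date, amount in orders:
        last_seen[customer] = max(last_seen.get(customer, order_date), order_date)
        order_ids[customer].add(order_id)
        spend[customer] += amount
    return {
        c: {
            "recency": (snapshot - last_seen[c]).days,  # days since last order
            "frequency": len(order_ids[c]),             # distinct orders
            "monetary": spend[c],                       # total spend
        }
        for c in last_seen
    }


def quintile_scores(values, higher_is_better=True):
    """Rank-based 1-5 scores; equal values always receive the same score."""
    n = len(values)
    scores = []
    for v in values:
        below = sum(1 for other in values if other < v)
        score = 1 + (5 * below) // n
        scores.append(score if higher_is_better else 6 - score)
    return scores


orders = [
    ("c1", "o1", date(2024, 1, 1), 10.0),
    ("c1", "o2", date(2024, 3, 1), 20.0),
    ("c2", "o3", date(2024, 2, 1), 5.0),
]
table = rfm_table(orders, snapshot=date(2024, 3, 11))
```

Recency is scored with `higher_is_better=False`, since a customer who bought recently should score higher than one who has been inactive.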
| Endpoint | Method | Description |
|---|---|---|
| `/upload` | POST | Upload CSV/XLSX (multipart/form-data) |
| `/summary` | GET | Column schema, null counts, describe stats |
| `/analytics` | GET | KPIs, top products, monthly revenue, top customers |
| `/rebuy` | GET | Customer segments, rebuy scores, loyalty Apriori patterns |
| `/rfm` | GET | RFM scores (1–5 per dimension), K-Means cluster profiles, top customers |
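As a rough illustration of calling the GET endpoints listed above, the sketch below uses only the Python standard library. It assumes the backend is running at `http://localhost:8000` as in the setup steps, that a dataset has already been uploaded, and that responses are JSON. The helper names `endpoint_url` and `fetch_json` are hypothetical. The multipart `POST /upload` call is left out because the standard library has no convenient multipart builder.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # from the setup steps; adjust if needed
GET_ENDPOINTS = ("/summary", "/analytics", "/rebuy", "/rfm")


def endpoint_url(path, base_url=BASE_URL):
    """Join base URL and path without doubling or dropping the slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def fetch_json(path, base_url=BASE_URL, timeout=10):
    """GET one endpoint and decode its body, assuming it is JSON."""
    with urlopen(endpoint_url(path, base_url), timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, `fetch_json("/analytics")` would return the decoded payload. The exact response shape depends on the Pydantic models in `backend/models/schemas.py` and is not assumed here.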
The system uses keyword matching on column names. Works out-of-the-box with:
| Dataset | Notes |
|---|---|
| UCI Online Retail | invoiceno, description, quantity, invoicedate, unitprice, customerid |
| Standard e-commerce | order_id, product_name, quantity, unit_price, total_price, purchase_date, purchased |
| Kaggle retail datasets | Most naming variations work, because detection matches keywords inside column names rather than exact names |
Detected roles:
| Role | Matched when column name contains… |
|---|---|
| `revenue` | total, revenue, amount, sales, sale |
| `unit_price` | price, cost, rate |
| `quantity` | qty, quantity, units |
| `date` | date, time, ordered, created, timestamp |
| `customer_id` | customer, user, buyer, client |
| `order_id` | order, transaction, invoice, basket |
| `product` | product, item, sku, title, desc |
| `category` | category, cat, dept, department, segment |
| `target` | purchased, bought, converted, label, target, churn |
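The keyword table above can be read as a simple substring matcher. The following is a minimal sketch of that idea, not the contents of `utils/detector.py`. The function name `detect_roles`, the case-insensitive comparison, and the tie-breaking rules (roles checked in table order, first matching column wins) are assumptions made for illustration.

```python
# Role -> keywords, copied from the table above. Dict order doubles as match
# priority, which is an assumption of this sketch.
ROLE_KEYWORDS = {
    "revenue": ("total", "revenue", "amount", "sales", "sale"),
    "unit_price": ("price", "cost", "rate"),
    "quantity": ("qty", "quantity", "units"),
    "date": ("date", "time", "ordered", "created", "timestamp"),
    "customer_id": ("customer", "user", "buyer", "client"),
    "order_id": ("order", "transaction", "invoice", "basket"),
    "product": ("product", "item", "sku", "title", "desc"),
    "category": ("category", "cat", "dept", "department", "segment"),
    "target": ("purchased", "bought", "converted", "label", "target", "churn"),
}


def detect_roles(columns):
    """Map each role to the first column whose name contains one of its keywords."""
    roles = {}
    for column in columns:
        name = column.lower()
        for role, keywords in ROLE_KEYWORDS.items():
            if any(keyword in name for keyword in keywords):
                # A column only counts for its highest-priority role;
                # if that role is already taken, the column is skipped.
                roles.setdefault(role, column)
                break
    return roles


uci_roles = detect_roles(
    ["InvoiceNo", "Description", "Quantity", "InvoiceDate", "UnitPrice", "CustomerID"]
)
```

Under these assumptions, `InvoiceDate` resolves to `date` rather than `order_id`, because `date` is checked before `order_id`. A real detector needs some such priority rule, since several column names contain keywords from more than one role.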
ecommerceV2/
├── backend/
│   ├── main.py  # FastAPI app + CORS + route registration
│   ├── routes/
│   │   ├── upload.py  # POST /upload: ingest, derive, detect, clean, pre-compute
│   │   ├── summary.py  # GET /summary
│   │   ├── analytics.py  # GET /analytics
│   │   ├── rebuy.py  # GET /rebuy
│   │   └── rfm.py  # GET /rfm
│   ├── services/
│   │   ├── cleaner.py  # Generic + role-based cleaning rules
│   │   ├── analyzer.py  # Conditional analytics (skips missing roles)
│   │   ├── rebuy.py  # Rule-based segmentation + loyalty Apriori
│   │   └── rfm.py  # RFM table + quantile scoring + K-Means clusters
│   ├── utils/
│   │   ├── state.py  # In-memory DataFrame + column map + cache store
│   │   ├── detector.py  # Keyword-based column role detector
│   │   └── helpers.py
│   ├── models/
│   │   └── schemas.py  # Pydantic response models
│   └── requirements.txt
│
├── frontend/
│   └── src/
│       ├── api/apiClient.js  # Axios wrapper for all endpoints
│       ├── pages/
│       │   ├── Home.jsx  # Upload page
│       │   └── Dashboard.jsx  # Main analytics dashboard
│       └── components/
│           ├── UploadForm.jsx
│           ├── SummaryCard.jsx  # KPI cards
│           ├── AnalyticsChart.jsx  # Bar + area charts (Recharts)
│           ├── PredictionTable.jsx  # Apriori association rules table
│           ├── RebuyAnalysis.jsx  # Segment donut + top customers + loyalty patterns
│           └── RFMAnalysis.jsx  # RFM stat cards + cluster profiles + top customers table
│
├── data/
│   └── sample_ecommerce.csv  # 200-row test dataset
├── claude.md  # Project conventions for Claude Code
└── README.md
- Backend: FastAPI, Pandas, NumPy, Scikit-learn, Mlxtend
- Frontend: React 19, Vite 8, Tailwind CSS v3, Recharts, React Router v7
- ML: Apriori (mlxtend), K-Means + StandardScaler (sklearn)
- Environment: Anaconda, `ecom_analytics` (Python 3.10)
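For readers unfamiliar with the association-rule metrics behind the Apriori feature, here is a small plain-Python sketch that computes support, confidence, and lift for item pairs. It is deliberately not Apriori: it counts pairs by brute force and skips the candidate pruning that mlxtend performs, so treat it only as an illustration of what the metrics mean. The function name `pair_rules` and the sample baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.3):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # one basket per order, duplicates ignored
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of orders containing both items
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]      # P(y | x)
            lift = confidence / (item_counts[y] / n)  # P(y | x) / P(y)
            rules.append((x, y, support, confidence, lift))
    return rules


baskets = [
    ["bread", "butter"],
    ["bread", "butter", "jam"],
    ["bread", "milk"],
    ["milk", "jam"],
]
rules = pair_rules(baskets, min_support=0.5)
```

A lift above 1 suggests the two items appear together more often than independent purchases would predict. That is the signal a "bought together" table is meant to surface.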