This repository features a dual-engine analytics platform that transforms raw e-commerce transaction data into actionable business strategies. The project implements Log-Log Regression for price elasticity and behavioral quintile binning for customer segmentation.
The suite provides high-fidelity decision support for retail stakeholders:
- Pricing Intelligence: Identifies price-sensitive vs. inelastic products to optimize profit margins.
- Customer Segmentation: Groups the user base into actionable cohorts to drive retention.
The system evaluates product-level demand sensitivity, retaining only products whose price-demand relationship is statistically significant:
- Log-Log Regression: Used to calculate constant elasticity coefficients for 54 statistically significant products.
- Automated Decision Logic: Translates coefficients into strategies such as "Discount for Volume" or "Optimize Margin".
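
The two steps above can be sketched as follows. This is a minimal illustration, not the code in `src/models.py`: it fits ln(quantity) on ln(price) by ordinary least squares, so the slope is the constant elasticity, and then maps that slope to a strategy label. The function names, the synthetic data, and the `-1.0` cutoff are assumptions made for the example, and the statistical-significance filter is left out.

```python
import math


def log_log_elasticity(prices, quantities):
    """Return the OLS slope of ln(quantity) regressed on ln(price)."""
    xs = [math.log(p) for p in prices]
    ys = [math.log(q) for q in quantities]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


def pricing_strategy(elasticity, threshold=-1.0):
    """Map an elasticity coefficient to a strategy label.

    Assumed rule (not confirmed by the repository): demand more elastic
    than the threshold favors discounting, anything else favors margin.
    """
    if elasticity < threshold:
        return "Discount for Volume"
    return "Optimize Margin"


# Synthetic observations generated with a true elasticity of -1.5.
prices = [10.0, 12.0, 15.0, 20.0]
quantities = [100.0 * p ** -1.5 for p in prices]

elasticity = log_log_elasticity(prices, quantities)
print(round(elasticity, 3), pricing_strategy(elasticity))
```

In the log-log form, a coefficient of -1.5 reads as "a 1% price increase is associated with roughly a 1.5% drop in quantity", which is why a single number per product is enough to drive the decision rule.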
Using Recency, Frequency, and Monetary (RFM) metrics, the platform segments customers to resolve the "One-Timer" data bottleneck common in e-commerce datasets, where most customers have placed only a single order.
- Cohort Composition: A treemap shows how the population is distributed across five primary segments.
- Prescriptive Actions: Maps specific interventions, such as personalized win-back campaigns for "At Risk" users and premium support for "Big Spenders".
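
As a rough sketch of quintile-based segmentation on Recency and Monetary values, the example below scores each customer from 1 to 5 on both dimensions and assigns a label. The scoring helper, the sample values, and the labeling rules are hypothetical; only the segment names "Big Spenders", "At Risk", and "New Customers" come from this README, and `"Other"` is a placeholder for the segments not named here.

```python
def quintile_scores(values, higher_is_better=True):
    """Assign each value a 1-5 score from its rank-based quintile."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0] * n
    for rank, i in enumerate(order):
        quintile = rank * 5 // n + 1  # 1 (lowest values) .. 5 (highest)
        scores[i] = quintile if higher_is_better else 6 - quintile
    return scores


def segment(r_score, m_score):
    """Hypothetical labeling rules, chosen only for illustration."""
    if r_score <= 2:
        return "At Risk"
    if m_score >= 4:
        return "Big Spenders"
    if r_score >= 4:
        return "New Customers"
    return "Other"  # placeholder for segments not named in this README


# Made-up customers: days since last order and total spend.
recency_days = [12, 480, 50, 300, 95, 20, 410, 150, 60, 220]
monetary = [900, 35, 40, 120, 75, 60, 500, 210, 55, 330]

# Fewer days since the last order is better, so the recency scale is inverted.
r_scores = quintile_scores(recency_days, higher_is_better=False)
m_scores = quintile_scores(monetary)
segments = [segment(r, m) for r, m in zip(r_scores, m_scores)]

for row in zip(recency_days, monetary, r_scores, m_scores, segments):
    print(row)
```

Frequency is deliberately left out of the scoring here: when nearly every customer has a single order, an F quintile carries little information, so R and M do the ranking.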
- Ingestion: Automated download and path-mapping of the Olist Brazilian E-Commerce dataset.
- Processing: Consolidation of orders, items, and payments datasets into a unified transaction log.
- Modeling: Implementation of Recency (R) and Monetary (M) quintiles to prioritize high-value users in low-frequency environments.
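
The processing step can be pictured as a join keyed on the order ID. The sketch below uses plain Python dictionaries so it runs without dependencies; the function name `build_transaction_log` and the output field names are hypothetical, and the input column names (`order_id`, `customer_id`, `order_purchase_timestamp`, `product_id`, `price`, `payment_value`) are assumed to match the Olist CSV headers.

```python
from collections import defaultdict


def build_transaction_log(orders, items, payments):
    """Attach order metadata and the order's total payment to each item row."""
    orders_by_id = {o["order_id"]: o for o in orders}

    # An order can be paid in several parts, so sum payments per order.
    paid = defaultdict(float)
    for p in payments:
        paid[p["order_id"]] += p["payment_value"]

    log = []
    for item in items:
        order = orders_by_id.get(item["order_id"])
        if order is None:
            continue  # skip items that have no matching order row
        log.append({
            "order_id": item["order_id"],
            "customer_id": order["customer_id"],
            "purchased_at": order["order_purchase_timestamp"],
            "product_id": item["product_id"],
            "price": item["price"],
            "order_payment_total": paid.get(item["order_id"], 0.0),
        })
    return log


# Tiny made-up sample in the assumed shape of the three source tables.
orders = [{"order_id": "o1", "customer_id": "c1",
           "order_purchase_timestamp": "2018-01-05 10:00:00"}]
items = [
    {"order_id": "o1", "product_id": "p1", "price": 30.0},
    {"order_id": "o1", "product_id": "p2", "price": 20.5},
    {"order_id": "o9", "product_id": "p3", "price": 5.0},  # orphan item
]
payments = [
    {"order_id": "o1", "payment_value": 30.0},
    {"order_id": "o1", "payment_value": 20.5},
]

transaction_log = build_transaction_log(orders, items, payments)
print(len(transaction_log), transaction_log[0]["order_payment_total"])
```

Payments are aggregated before the join so that an order paid in multiple installments or methods does not duplicate its item rows.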
| Component | Description |
|---|---|
| data/raw_orders_data.csv | Consolidated transaction log |
| data/final_market_audit.csv | Elasticity model outputs |
| data/rfm_segments.csv | Behavioral segmentation results |
| src/dashboard.py | Streamlit UI logic |
| src/models.py | Regression and RFM logic |
- Revenue Optimization: The dashboard identifies products where price increases will not significantly impact volume, protecting margins.
- Customer Retention: Isolates "At Risk" cohorts with an average inactivity period of ~480 days for targeted re-activation.
- Growth Engineering: Identifies "New Customers" who purchased recently (avg. 50.2 days since last order) as the primary targets for loyalty conversion.


