Context
The project develops a robust forecasting model for predicting daily air cargo tonnage across multiple hierarchical levels. Built for a major logistics client, the model addresses the multi-tiered structure of air cargo operations, spanning from cost centers and airports down to specific customers and commodities. Accurate forecasting at each level is crucial for effective logistics planning and resource allocation. Exogenous features (variables identified as strong tonnage predictors) enhance the model’s performance, align it with complex business requirements, and support high-resolution daily forecasts.
Requirements
Granular Forecasting: Provide daily tonnage predictions across hierarchical levels, from cost centers to specific commodities.
Feature Inclusion: Incorporate exogenous variables that significantly influence tonnage, as determined by business analysis.
Explainability: Ensure comprehensive explainability, from SHAP-based insights to actionable business-oriented interpretations, so stakeholders understand the model's behavior.
Trend and Seasonal Insights: Enable identification of patterns on daily, weekly, and monthly scales, supporting strategic planning.
Continuous Monitoring: Track experiments, model performance, and feature impact to maintain accuracy and relevance in a dynamic environment.
Scalable Deployment: Implement a scalable, cloud-based deployment for secure access to dashboards and continuous monitoring.
Approach
Hierarchical Forecasting Model: Construct a forecasting model using Nixtla’s framework to handle the hierarchical structure, spanning cost centers, buildings, customers/airlines, and commodities.
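To make the hierarchy concrete, the sketch below builds a daily series for every node at every level by summing bottom-level tonnage. This summing relationship is the coherence constraint that hierarchical forecasting is meant to respect. It is a plain pandas illustration, not Nixtla's API, and the column names (`cost_center`, `building`, `customer`, `commodity`, `tonnage`) and all values are hypothetical.

```python
import pandas as pd

# Hypothetical bottom-level data: one row per day and commodity.
bottom = pd.DataFrame({
    "ds": pd.to_datetime(["2024-01-01"] * 4 + ["2024-01-02"] * 4),
    "cost_center": ["CC1"] * 8,
    "building": ["B1", "B1", "B2", "B2"] * 2,
    "customer": ["AirA", "AirA", "AirB", "AirB"] * 2,
    "commodity": ["pharma", "general", "pharma", "general"] * 2,
    "tonnage": [10.0, 20.0, 5.0, 15.0, 12.0, 18.0, 6.0, 14.0],
})

# Levels from coarsest to finest; each level is identified by a prefix of keys.
LEVELS = [
    ["cost_center"],
    ["cost_center", "building"],
    ["cost_center", "building", "customer"],
    ["cost_center", "building", "customer", "commodity"],
]


def aggregate_levels(df: pd.DataFrame) -> pd.DataFrame:
    """Return one long frame with a daily series for every node at every level."""
    frames = []
    for keys in LEVELS:
        level = df.groupby(["ds"] + keys, as_index=False)["tonnage"].sum()
        # Build a single identifier such as "CC1/B1/AirA" for the node.
        level["unique_id"] = level[keys].agg("/".join, axis=1)
        frames.append(level[["unique_id", "ds", "tonnage"]])
    return pd.concat(frames, ignore_index=True)


hierarchy = aggregate_levels(bottom)
```

The long `unique_id` / `ds` / value layout is chosen because it is the shape that time-series tooling commonly expects; forecasts produced per node can then be checked against the same sums.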
Data Preprocessing: Use pandas and numpy for data cleaning, feature engineering, and structuring pipelines to prepare data at the daily level.
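A minimal sketch of what daily-level preparation might look like: raw shipment rows are cleaned, rolled up to one row per airport and day, placed on a complete calendar, and given simple calendar features. The column names, the airport code, the choice to drop rows with missing weights, and the choice to treat days without shipments as zero tonnage are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical raw shipment records with irregular timestamps.
raw = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2024-03-01 08:15", "2024-03-01 17:40",
        "2024-03-02 09:05", "2024-03-02 14:00",
        "2024-03-04 11:30",
    ]),
    "airport": ["FRA"] * 5,
    "weight_kg": [1200.0, 800.0, np.nan, 500.0, 1500.0],
})


def to_daily(df: pd.DataFrame) -> pd.DataFrame:
    """Clean shipment rows and return one row per airport and calendar day."""
    clean = df.dropna(subset=["weight_kg"]).copy()  # assumption: drop unknown weights
    clean["ds"] = clean["timestamp"].dt.normalize()
    clean["tonnage"] = clean["weight_kg"] / 1000.0  # kilograms to tonnes

    daily = clean.groupby(["airport", "ds"])["tonnage"].sum()

    # Reindex onto a complete daily calendar; days with no shipments are
    # treated as zero tonnage (an assumption for this sketch).
    all_days = daily.index.get_level_values("ds")
    full_days = pd.date_range(all_days.min(), all_days.max(), freq="D")
    full_index = pd.MultiIndex.from_product(
        [daily.index.get_level_values("airport").unique(), full_days],
        names=["airport", "ds"],
    )
    daily = daily.reindex(full_index, fill_value=0.0).reset_index()

    # Calendar features that a forecasting model can use.
    daily["day_of_week"] = daily["ds"].dt.dayofweek  # 0 = Monday
    daily["is_weekend"] = (daily["day_of_week"] >= 5).astype(int)
    daily["month"] = daily["ds"].dt.month
    return daily


daily = to_daily(raw)
```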
Incorporation of Exogenous Features: Integrate key exogenous variables identified as influential predictors to enhance model accuracy, aligning with specific business requirements.
Hyperparameter Tuning: Use optuna to tune hyperparameters, refining model performance across hierarchical levels and improving predictive accuracy.
Dashboard Creation and Visualization: Develop interactive dashboards using Dash to display:
Trends and seasonality (daily, weekly, monthly)
Hierarchical predictions across levels
Feature importance and impact on forecasts
Influence of each lower level on the next higher level
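Two of the views listed above reduce to simple aggregations that can be computed before plotting. The sketch below derives a day-of-week seasonality profile and each building's share of its cost center's tonnage. It uses pandas only and stops short of the Dash layout; the column names and toy values are hypothetical.

```python
import pandas as pd

# Hypothetical daily forecasts for two buildings under one cost center.
dates = pd.date_range("2024-01-01", periods=28, freq="D")  # four full weeks
forecasts = pd.DataFrame({
    "ds": list(dates) * 2,
    "cost_center": ["CC1"] * 56,
    "building": ["B1"] * 28 + ["B2"] * 28,
    # In this toy data, B1 carries more tonnage on weekdays than weekends.
    "tonnage": [30.0 if d.dayofweek < 5 else 10.0 for d in dates] + [10.0] * 28,
})
forecasts["day_of_week"] = forecasts["ds"].dt.dayofweek  # 0 = Monday

# Weekly seasonality: average tonnage per day of week, one column per building.
weekly_profile = (
    forecasts.groupby(["day_of_week", "building"])["tonnage"]
    .mean()
    .unstack("building")
)

# Influence of each building (lower level) on its cost center (higher level):
# the building's share of the parent's total on each day, then averaged.
parent_total = forecasts.groupby(["cost_center", "ds"])["tonnage"].transform("sum")
forecasts["share_of_parent"] = forecasts["tonnage"] / parent_total
avg_share = forecasts.groupby("building")["share_of_parent"].mean()
```

Each resulting table maps naturally onto one chart: `weekly_profile` to a line or bar chart per building, and `avg_share` to a stacked bar or pie of contributions.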
Explainability: Generate SHAP values for feature attribution to explain model predictions. Translate SHAP insights into actionable business-level interpretations for clearer stakeholder understanding.
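For intuition about what a SHAP attribution means, the sketch below uses the closed form that holds for a linear model when features are treated as independent: each feature's attribution is its coefficient times the feature's deviation from its training mean, and the attributions plus the average prediction add up to the forecast. It uses numpy only rather than the shap library, the project's model is not necessarily linear, and the feature names and data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
feature_names = ["flights_scheduled", "is_holiday", "fuel_price_index"]

# Synthetic training data (illustrative only).
X = np.column_stack([
    rng.integers(20, 60, size=300).astype(float),
    rng.integers(0, 2, size=300).astype(float),
    rng.normal(100, 5, size=300),
])
y = 3.0 * X[:, 0] - 25.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 1, size=300)

# Fit ordinary least squares with an intercept column.
A = np.column_stack([np.ones(len(X)), X])
params, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, coef = params[0], params[1:]


def linear_shap(x: np.ndarray):
    """Return (base_value, per-feature attributions) for one observation."""
    feature_means = X.mean(axis=0)
    base_value = intercept + coef @ feature_means  # average prediction
    contributions = coef * (x - feature_means)
    return base_value, contributions


x_new = np.array([55.0, 1.0, 98.0])
base_value, contributions = linear_shap(x_new)
prediction = intercept + coef @ x_new

# Turn attributions into a plain-language summary, largest effect first.
for i in np.argsort(-np.abs(contributions)):
    direction = "raises" if contributions[i] > 0 else "lowers"
    print(f"{feature_names[i]} {direction} the forecast by "
          f"{abs(contributions[i]):.1f} tonnes")
```

The final loop is one simple way to move from raw attributions toward business-facing statements: rank by magnitude and phrase each effect relative to a typical day.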
Experiment Tracking and Monitoring: Use MLflow for tracking experiments, managing model versions, and monitoring feature performance over time.
Deployment: Deploy the model on AWS EC2 for scalable access. Securely expose the Dash dashboard via AWS CloudFront for real-time, accessible forecasting insights.
Technologies Used
Modeling and Forecasting: nixtla for hierarchical forecasting, sklearn for supplementary ML techniques, and optuna for hyperparameter tuning.
Data Processing: pandas and numpy for data manipulation and pipeline construction.
Dashboarding: Dash for interactive, web-based dashboards showcasing trends, seasonality, and feature impact.
Explainability: SHAP for feature attribution; business-layer explainability for actionable insights.
Monitoring and Experiment Tracking: MLflow for experiment tracking, model versioning, and performance monitoring.
Deployment: AWS EC2 for scalable model deployment, with CloudFront for secure dashboard exposure.