The ModalForecast package implements parametric modal
ARIMA models utilizing the Skewed Distribution (SKD) family. Instead of
connecting the expected value (mean) to covariates, the model connects
the conditional mode to the systematic autoregressive
integrated moving average (ARIMA) components.
By modeling the mode directly, this framework helps mitigate the effects of localized extremes, asymmetry, and non-normal behavior, providing robust centralized predictions under asymmetric error distributions.
To construct a modal regression model, we require a flexible parametric continuous distribution where the mode is explicitly parameterized and differentiable. We adopt the generalized SKD family, which supports robust inference through heavy tails and asymmetry. The package currently implements the Skew-Normal, Skewed Student-t, and Skewed Laplace distributions.
Note on the “normal” distribution: In
ModalForecast, specifyingdist = "normal"does not invoke the standard symmetric Gaussian distribution. Instead, it refers to the Skew-Normal distribution from the SKD family. The standard normal is recovered asymptotically only when the estimated skewness parameter \(\gamma = 1\).
Let \(y_t \in \mathbb{R}\) be the response variable at time \(t\). We assume \(y_t\) follows a distribution from the SKD family with mode \(\mu_t\), scale \(\sigma\), skewness parameter \(\gamma \in (0,\infty)\) (or effectively \(p \in (0,1)\)), and tail parameter(s) \(\boldsymbol{\nu}\):
\[y_t \sim \text{SKD}(\mu_t, \sigma, \gamma, \boldsymbol{\nu})\]
The family is constructed using a scale mixture representation that ensures mathematical tractability while allowing heavy tails. A crucial property of this parameterization is that the probability density function reaches its global maximum exactly at \(y_t = \mu_t\). Consequently, \(\mu_t\) represents the true conditional mode of the distribution.
Instead of the standard mean-based ARIMA, we model the sequence of conditional modes \(\mu_t\):
\[ \mu_t = c + \sum_{i=1}^p \phi_i y_{t-i} + \sum_{j=1}^q \theta_j \epsilon_{t-j} \]
where \(\epsilon_t = y_t - \mu_t\) is the asymmetric prediction error, \(p\) is the autoregressive order, and \(q\) is the moving average order.
The parameter vector \(\boldsymbol{\Theta} = (c, \boldsymbol{\phi}, \boldsymbol{\theta}, \sigma, \gamma, \boldsymbol{\nu})^\top\) is estimated via Maximum Likelihood Estimation (MLE). Conditionally on the initial values, the log-likelihood function over the sequence of \(n\) observations is constructed explicitly as a function of \(y_t\):
\[ \ell(\boldsymbol{\Theta}) = \sum_{t=1}^n \log f(y_t | \mu_t, \sigma, \gamma, \boldsymbol{\nu}) \]
where \(f(\cdot)\) is the probability density function of the chosen SKD distribution (e.g., Skew-Normal, Skewed Student-t, Skewed Laplace), and \(\mu_t\) embeds the recursive ARIMA structure. ## Installation
You can install the development version of ModalForecast from GitHub with:
# install.packages("devtools")
devtools::install_github("chedgala/ModalForecast")The following plots demonstrate the diagnostic capabilities and
forecasting performance of the ModalForecast package using
the well-known lynx dataset.
The envelope() function constructs simulation envelopes
based on the exact theoretical distance distributions of the SKD family
(e.g., half-normal, half-t, exponential). This provides a visually
intuitive goodness-of-fit assessment to help select the best
distribution among normal, t, or
laplace.
Below is an evaluation of the Skew-Normal, Skewed Student-t, and
Skewed Laplace fits on the lynx dataset:
The diagnostics() function provides additional analysis
including ACF/PACF of Randomized Quantile Residuals (RQR), and the
summary() method handles analytical Fisher Information
matrix standard errors.
Comparison between traditional Gaussian ARIMA (Mean) and the Modal ARIMA (Mode) utilizing different members of the SKD family. The package computes both Asymptotic prediction intervals for standard series, and Parametric Bootstrap simulated prediction intervals for greater coverage in small sample settings.
Below is a brief tutorial showing how to adjust a modal ARIMA model to empirical data.
library(ModalForecast)
library(forecast)
# 1. Load Empirical Data (Lynx)
data(lynx)
y <- log10(lynx) # log-transformation is common for this dataset
# 2. Find the best SKD Error Distribution (Normal vs T vs Laplace)
# We fit a base Model and compare Information Criteria (e.g. AIC)
fit_n <- fit_modal_arima(y, order=c(2, 0, 0), dist="normal")
fit_t <- fit_modal_arima(y, order=c(2, 0, 0), dist="t")
fit_l <- fit_modal_arima(y, order=c(2, 0, 0), dist="laplace")
c(Normal = AIC(fit_n), Student = AIC(fit_t), Laplace = AIC(fit_l))
# 3. Use the rigorous Auto Modal ARIMA selector globally on the best distribution
# Since Skew-Normal returned the lowest AIC (-4.46), it wins.
fit_auto <- auto.modal.arima(y, d=0, max.p=5, max.q=5, dist="normal")
# 4. Print the automatically fitted modal model
print(fit_auto)
# Model Summary & Diagnostics
summary(fit_auto)
diagnostics(fit_auto)
# Forecast modal trajectory with multiple confidence levels (alphas)
pred <- forecast(fit_auto, h=10, level=c(80, 95, 99))
# The package natively supports the 'forecast' library ecosystem:
library(forecast)
# 1. Plot rich visualization with confidence bands identical to standard ARIMA
autoplot(pred)
# 2. Evaluate forecast metric diagnostic functions (ME, RMSE, MAE, MAPE, etc.)
accuracy(pred)The fitted object returns standard coefficients, scale
sigma, and skewness gamma. If
gamma is significantly different from 1, it indicates
positive asymmetry (right-skewness if \(>1\)) or negative asymmetry (\(<1\)), capturing patterns that typical
least-squares ARIMA would miss. If the specified distribution possesses
additional tail parameters (like nu for Skewed Student-t),
they are estimated automatically.