Forecasts reconciliation with fable.bayesRecon

Introduction

This vignette shows how to use the forecast reconciliation methods implemented in the fable.bayesRecon package. In particular, we will apply t-Rec method (Carrara et al. 2025) to the swiss_tourism dataset, which contains monthly counts of nightly stays in Switzerland, aggregated by Canton.

The method t-Rec is implemented by the function bayesRecon_t(), which can be used within the reconcile() function of fabletools to specify the reconciliation strategy.

First, we load the required packages:

library(bayesRecon)
library(dplyr)
library(tsibble)
library(fable)
library(fabletools)
library(fable.bayesRecon)
library(ggplot2)
library(scales)

Data preparation

Let’s import the data in the tsibble format. To do so, we follow these steps:

Load the swiss_tourism dataset from the bayesRecon package. This is a list containing the mts (multivariate time series) object, the aggregation matrix and the size of the hierarchy.
Take the mts component of the list, which contains the time series data.
The first column contains the aggregated data for all cantons; we will drop it for now and we will reconstruct it later.
Convert the remaining time series into a tsibble using as_tsibble().
Optionally, rename the columns for clarity with rename().
Aggregate the data by summing the number of tourist in each canton, using aggregate_key().

data <- swiss_tourism$ts[, -1, drop = FALSE] |>             # Remove 1st column
  as_tsibble() |>                                           # Convert in tsibble
  rename(Month = index, Canton = key, Tourists = value) |>  # Rename
  aggregate_key(Canton, Tourists = sum(Tourists))           # Aggregate all cantons

data |>  head(5) |> kable()

Month	Canton	Tourists
2005 Jan	<aggregated>	2680116
2005 Feb	<aggregated>	3017746
2005 Mar	<aggregated>	3238735
2005 Apr	<aggregated>	2067465
2005 May	<aggregated>	2185950

The resulting object data, contains time series, stacked vertically, with the time index (column Month), the grouping variable (column Canton), and the time series values (column Tourists). For aggregated time series, the column Canton is set to <aggregated>.

Note that fabletools allows more complex hierarchies with more grouping variables, however here we reproduce the hierarchy in Carrara et al. (2025).

Models fitting

We remove from each time series the last 12 observations, which will be used as test set. We use the function filter() and the yearmonth() format to specify the cutoff date. This allows us to evaluate the performance of the reconciliation method on the holdout set.

Then, we fit a base forecasting model to the data with an ETS model with additive trend and seasonality; this is implemented in fable with function ETS(). We will reconcile the results by using the methods fable.bayesRecon::bayesRecon_t (t-Rec) and fabletools::min_trace (MinT Wickramasuriya et al. (2019)).

fit <- data |>
  filter(Month < yearmonth("2024 Feb")) |>                  # Remove test set
  model(base = ETS(Tourists ~ trend("A") + season("A"))) |> # fit ETS model
  reconcile(t = bayesRecon_t(base),                         # Reconcile with t-Rec
            mint = min_trace(base))                         # Reconcile with MinT

fit |> head(5) |> kable()

Canton	base	t	mint
Aargau	<ETS(A,A,A)>	<ETS(A,A,A)>	<ETS(A,A,A)>
Appenzell Ausserrhoden	<ETS(A,A,A)>	<ETS(A,A,A)>	<ETS(A,A,A)>
Appenzell Innerrhoden	<ETS(M,A,A)>	<ETS(M,A,A)>	<ETS(M,A,A)>
Basel-Landschaft	<ETS(A,A,A)>	<ETS(A,A,A)>	<ETS(A,A,A)>
Basel-Stadt	<ETS(A,A,A)>	<ETS(A,A,A)>	<ETS(A,A,A)>

During this step, base forecasts are fitted, but predictions are not generated yet. Thus, we specificy the reconciliation strategy to set up the model, however, the actual reconciliation will be performed when we produce forecasts.

Reconciliation

Given the fitted models, we can now make one-year ahead predictions. This is done applying the forecast() method to the fitted model. The forecast horizon can either be specified as a number of periods (e.g., h = 12 for 12 months) or as a date (e.g., h = "1 year").

fc <- fit |>
  forecast(h = "1 year")

# Showing only the one step ahead forecast
fc |> filter(Month == yearmonth("Feb 2024")) |> head(5) |> kable()

Canton	.model	Month	Tourists	.mean
Aargau	base	2024 Feb	N(54237, 2.2e+07)	54236.715
Aargau	t	2024 Feb	t(233, 54373, 4714)	54373.130
Aargau	mint	2024 Feb	N(54247, 2.2e+07)	54247.224
Appenzell Ausserrhoden	base	2024 Feb	N(7781, 1450930)	7781.081
Appenzell Ausserrhoden	t	2024 Feb	t(233, 7801, 1199)	7800.641

The forecasts object contains both base (base) and reconciled forecasts (mint or t), specified in the .model column. As for the fitted models, Canton and Month are respectively the name of the time series and the time index. The column Tourists contains the probabilistic forecasts, encoded as objects from the distributional package. Note that both base and mint models return Normal distributions while t returns a Student-t distribution. Point forecasts are returned as well: the .mean column contains the mean of the predictive distribution.

Summary of results

The following pipe computes the average performance, according to three accuracy measures, grouped by model:

results <- fc |>
  accuracy(data, measures = list(RMSSE = RMSSE, MASE = MASE, CRPS = CRPS)) |>
  group_by(.model) |>
  summarise(RMSSE = mean(RMSSE), MASE  = mean(MASE), CRPS  = mean(CRPS))

results |> kable(digits = 3)

.model	RMSSE	MASE	CRPS
base	0.670	0.930	16606.34
mint	0.670	0.928	15829.08
t	0.668	0.924	15443.06

Visualisation

We can visualize the forecasts using the function autoplot(). Below we plot the aggregated series showing both base and reconciled forecasts against the actual data.

# Aggregated series
fc |>
  filter(is_aggregated(Canton)) |>
  autoplot(data, level=95) +
  scale_y_continuous(labels = label_number(suffix = "K", scale = 1e-3))+
  labs(title = "Aggregated Swiss Tourism", y = "Tourists", x = NULL) +
  theme_minimal()

The plot below shows a zoomed in version of the forecasts for the aggregated series. The prediction intevals (95% confidence) of both MinT and t-Rec are smaller than the base forecasts. This is because both methods produce coherent forecasts, which are more accurate than the base forecasts for the aggregated series, as shown in the accuracy table above.

# Zoom on the forecasts for the aggregated series
fc |>
  filter(is_aggregated(Canton)) |>
  autoplot(data |> filter(Month >= yearmonth("2022 Jan")), level = 95) +
  scale_y_continuous(labels = label_number(suffix = "K", scale = 1e-3))+
  labs(title = "Aggregated Swiss Tourism (Jan 2022 - Jan 2025)", y = "Tourists", x = NULL) +
  theme_minimal()

We now show the forecasts for two bottom level time series, i.e. two cantons: Ticino and Luzern. Both plots are restricted to the last 3 years.

# Canton Ticino
fc |>
  filter(Canton == "Ticino") |>
  autoplot(data |> filter(Month >= yearmonth("2022 Jan")),
           level=95) +
  scale_y_continuous(labels = label_number(suffix = "K", scale = 1e-3))+
  labs(title = paste("Canton: Ticino"), y = "Tourists", x = NULL) +
  theme_minimal()

# Canton Luzern
fc |>
  filter(Canton == "Luzern") |>
  autoplot(data |> filter(Month >= yearmonth("2022 Jan"),
                                 Canton == "Luzern"),
           level = 80) +
  scale_y_continuous(labels = label_number(suffix = "K", scale = 1e-3))+
  labs(title = paste("Canton: Luzern"), y = "Tourists", x = NULL) +
  theme_minimal()