Can Metadata Guide Variable Selection for Macroeconomic Forecasting?

By Andrew R. Garcia; Marco Vega

April 2026

Language: English

Deputy Manager of Economic Research

Abstract:

This paper asks whether structural metadata from an institutional registry carries enough signal to guide variable selection in macroeconomic forecasting. We study this question using two complementary approaches. The first, Metadata ε-Greedy, is a stochastic search policy that uses permutation-invariant embeddings of registry metadata to guide a fixed-budget search over predictor subsets, with forecasting loss as the only feedback signal. The second, Metadata Bayes, performs variable selection entirely within metadata space: it constructs group-level priors from institutional descriptors, updates them via partial correlation with the target, and selects predictors through Thompson sampling, without ever evaluating a forecasting model during selection. Both methods are evaluated on forecasting Peruvian headline CPI under two forecasters, a Vector Autoregression (VAR) and a Random Forest, and benchmarked against random search, greedy forward selection, LASSO, Bayesian Ridge, PCA, and a state-of-the-art Bayesian variable selection method. Metadata Bayes, despite never observing forecasting loss during selection, achieves out-of-sample accuracy competitive with all baselines including the Bayesian benchmark. Metadata ε-Greedy further improves on these results under the VAR during the COVID shock period. Together, the results suggest that registry metadata encodes enough economic structure to serve as a meaningful proxy for predictive relevance, complementing rather than replacing existing forecasting pipelines.
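To make the first approach concrete, the following is a minimal Python sketch of an ε-greedy, fixed-budget search over predictor subsets, written under stated assumptions rather than as the paper's implementation. The function names (`mean_embedding`, `epsilon_greedy_search`), the use of a mean over metadata vectors as the permutation-invariant embedding, the distance-weighted swap rule, and all parameter defaults are illustrative choices not taken from the abstract. The only elements drawn from the text are the fixed evaluation budget, the use of metadata embeddings to guide proposals, and forecasting loss as the sole feedback signal.

```python
import random


def mean_embedding(subset, embeddings):
    """Permutation-invariant summary of a subset: the mean of its members' vectors."""
    dim = len(embeddings[subset[0]])
    return [sum(embeddings[v][d] for v in subset) / len(subset) for d in range(dim)]


def epsilon_greedy_search(variables, embeddings, loss_fn, subset_size, budget,
                          epsilon=0.2, seed=0):
    """Hypothetical epsilon-greedy search; `loss_fn` is the only feedback signal.

    variables  : list of variable names
    embeddings : dict mapping each name to a metadata vector (assumed given)
    loss_fn    : callable taking a tuple of names and returning a forecasting loss
    budget     : total number of loss evaluations allowed
    """
    if not 0 < subset_size < len(variables):
        raise ValueError("subset_size must be between 1 and len(variables) - 1")
    rng = random.Random(seed)

    # Start from a random subset and spend one evaluation on it.
    best = tuple(sorted(rng.sample(variables, subset_size)))
    best_loss = loss_fn(best)
    evaluations = 1

    while evaluations < budget:
        if rng.random() < epsilon:
            # Explore: propose a fresh random subset.
            candidate = rng.sample(variables, subset_size)
        else:
            # Exploit: swap one member of the incumbent for an outside variable,
            # favouring variables whose metadata lies close to the incumbent's
            # mean embedding (an assumed guidance rule).
            centre = mean_embedding(best, embeddings)
            outside = [v for v in variables if v not in best]
            weights = [
                1.0 / (1e-9 + sum((a - b) ** 2 for a, b in zip(embeddings[v], centre)))
                for v in outside
            ]
            candidate = list(best)
            swap_in = rng.choices(outside, weights=weights, k=1)[0]
            candidate[rng.randrange(subset_size)] = swap_in

        candidate = tuple(sorted(candidate))
        loss = loss_fn(candidate)
        evaluations += 1
        if loss < best_loss:
            best, best_loss = candidate, loss

    return best, best_loss, evaluations
```

In practice `loss_fn` would wrap an out-of-sample evaluation of a forecaster such as a VAR or a Random Forest on the chosen predictors; here it is left as an arbitrary callable so the sketch stays independent of any particular forecasting library.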
