Macroeconomic Forecasting Using Data from Social Media


In this paper, I build a series of economic sentiment indices for Russia from news posts, and the comments on them, on the Russian social network VKontakte. The text is preprocessed, and the Gibbs Sampling algorithm for the Dirichlet Multinomial Mixture (GSDMM) model is used to identify news posts on economic topics. To check whether the resulting indices actually capture consumer and business sentiment, I compare them with existing indices: the consumer sentiment index and the Purchasing Managers’ Index (PMI) for the manufacturing and service sectors in Russia. I then use the constructed indices to forecast macroeconomic indicators for Russia with machine learning methods (Random Forest, Extremely Randomised Trees, Gradient Boosting, and XGBoost). I compare the mean squared errors (MSE) of the machine learning models with those of a first-order autoregressive model. In almost all cases, the errors of the machine learning models are smaller.
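
To make the benchmark comparison concrete, the following is a minimal sketch of how an out-of-sample MSE for a first-order autoregressive model could be computed and set against the MSE of a competing model. It is an illustration under stated assumptions, not the paper's implementation: the function names (`fit_ar1`, `ar1_test_mse`, `relative_mse`), the ordinary least squares estimator, the single train/test split, and the one-step-ahead forecasting scheme are all hypothetical choices, and the series is synthetic.

```python
from statistics import mean


def fit_ar1(series):
    """Estimate y_t = c + phi * y_{t-1} + e_t by ordinary least squares."""
    x, y = series[:-1], series[1:]  # lagged values and their successors
    x_bar, y_bar = mean(x), mean(y)
    phi = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sum(
        (a - x_bar) ** 2 for a in x
    )
    return y_bar - phi * x_bar, phi  # (intercept c, slope phi)


def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))


def ar1_test_mse(series, n_test):
    """Fit AR(1) on all but the last n_test points, then score
    one-step-ahead forecasts on those held-out points."""
    c, phi = fit_ar1(series[:-n_test])
    lagged = series[-n_test - 1:-1]  # observed value preceding each test point
    forecasts = [c + phi * v for v in lagged]
    return mse(series[-n_test:], forecasts)


def relative_mse(model_mse, benchmark_mse):
    """Ratio below 1 means the model beats the benchmark."""
    return model_mse / benchmark_mse


# Synthetic (made-up) indicator series, for illustration only.
indicator = [2.1, 2.4, 2.2, 2.8, 3.0, 2.7, 3.1, 3.3, 3.0, 3.4, 3.6, 3.5]
benchmark = ar1_test_mse(indicator, n_test=4)

# Hypothetical forecasts from some other model for the same 4 test points.
other_forecasts = [3.1, 3.3, 3.5, 3.5]
ratio = relative_mse(mse(indicator[-4:], other_forecasts), benchmark)
print(f"AR(1) test MSE: {benchmark:.4f}, relative MSE of other model: {ratio:.2f}")
```

In such a setup, the machine learning forecasts would replace `other_forecasts`, and a ratio below 1 would correspond to the case in which the machine learning model has the smaller error.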