Forecasting Aggregate Retail Sales with Google Trends

As the internet grows in popularity, many purchases are being made in online stores. Google Trends is an online tool that collects data on user queries and forms categories from them. We forecast the dynamics of both aggregate retail sales and individual categories of food and non-food products using macroeconomic variables and Google Trends categories that correspond to various product groups. For each type of retail, we consider the best forecasting models from macroeconomic variables and try to improve them by adding trends. For these purposes, we use pseudo-out-of-sample nowcasting as well as recursive forecasting several months ahead. We conclude that forecasts for food and non-food products can improve significantly once trends are added to the models.


Introduction
Expenditures on food and non-food retail goods to a significant degree determine aggregate demand, the dynamics of which are largely related to the phase of the business cycle in the economy. In particular, during deep crises (such as the crisis caused by the global pandemic), stimulating aggregate demand is the key factor for the economic recovery. In this regard, a timely task is to identify variables that potentially explain dynamics and help forecast retail sales. These can include, beyond standard macroeconomic indicators, the data on the dynamics of the number of queries for individual product groups, which help track the increase and decrease of user interest. It is interesting to see to what extent search query data can help forecast retail sales. According to Yandex.Radar, the most popular search engine for Russia is Yandex. However, the series from the Search for Words service for assessing user interest in specific topics based on queries in Yandex are too short for our study. Since Google also holds a significant share of the search engine market in Russia (38% as of 31 January 2021), we used Google Trends to track dynamics of user interest. To address the task at hand, we consider regressions with some basic set of key macroeconomic indicators that explain food and non-food retail and then use Google Trends to improve the forecasting properties of various models.
The research paper has the following structure. Section 2 provides an overview of research in the area of identifying consumption forecasting factors using Google Trends. Section 3 contains the data that are used for calculations. Section 4 examines the impact of Google search queries data on improving the quality of aggregate retail forecasts. In Section 5 we test the forecast quality. In Section 6, we apply the methodology from Sections 4 and 5 to several product categories. Section 7 concludes.

Literature overview
Using Google Trends in research is a relatively new trend. However, today, one can frequently encounter the use of Google Trends for various purposes in the literature. This is due to the relative simplicity of obtaining and processing these data and their convenient and understandable presentation. Furthermore, since the internet and Google are used by a large number of people, such queries collectively constitute representative indicators of user interest. Thus, Google Trends contains a huge database of various indicators, including those capable of describing consumer behaviour.
Amongst the earliest studies using data on the dynamics of the number of internet search queries, the paper by Choi and Varian (2009), which attempts to nowcast retail, vehicle, real estate, and travel sales, stands out. The authors use categories from Google Insights for Search, later merged with Google Trends. Vossen and Schmidt (2011) were the first to try using Google Trends data as a metric to forecast private consumption in the United States. The authors concluded that, when compared to survey-based indices such as the University of Michigan Consumer Sentiment Index and the Conference Board Consumer Confidence Index, the principal components extracted from Google Trends categories help provide more accurate forecasts. Based on the insights from the above-mentioned papers, many studies have been conducted to support the usefulness of Google Trends in improving consumption forecasts. As the number of active internet users grows, more and more shoppers are using it to collect information about products and make shopping choices. Thus, there is increasingly more information in Google Trends that is important for forecasting, including forecasting retail sales, large purchases, services, and aggregate consumption.
When it comes to forecasting in grocery retail, we should mention the paper by Boone et al. (2017), where Google Trends is used to forecast demand for snacks, improving mean absolute percentage error by 2.2-7.66%. Kuchler et al. (2020) use Google Trends to analyse consumer behaviour in grocery stores for commercial purposes. Silva et al. (2019) add a trend for the word 'Burberry' to a neural network model to forecast the sales of the luxury clothing store of the same name. Oh et al. (2021) use the trend 'Winter Jacket in NYC' and apply analysis of variance and Pearson correlation to find patterns in consumer behaviour when looking for seasonal clothing during the winter months. Ellingsen (2017) uses data on aggregated retail sales in Norway and forecasts them using 51 categories and 148 keywords in Google Trends.
Google Trends data are also used to forecast housing and car prices. For example, Wu and Brynjolfsson (2015) forecast trends in the real estate market by incorporating trends alongside endogenous variables such as the housing price index and housing sales volume. Dietzel (2016) uses combinations of categories and keywords in the area of the housing market and the adjacent construction market. Wijnhoven and Plant (2017) demonstrate that Google Trends are more effective at forecasting the auto market than social media posts. Furthermore, they point at the absence of difference in the correlation of the prices of expensive and inexpensive cars with the Google Trends data.
Search engine query data reflect tourism interest well and are useful for forecasting tourism product consumption. Li et al. (2017) forecast demand for the Beijing tourism industry using Google Trends. The authors construct a composite search index using the generalised dynamic factor model (GDFM). The choice of the model is explained by the fact that the demand for the relevant components of the tourism industry, such as services of restaurants, hotels, and travel agencies, is strongly correlated. The authors divided standard pre-travel queries, such as food, weather, and hotels, into categories and picked corresponding keywords. Autoregressive moving average models (ARMA(1,1)) with and without an index created through principal component analysis were used as benchmarks. The obtained results indicate that the proposed method improves the accuracy of forecasting monthly tourism volumes in Beijing compared to two benchmarks. Similar work was done by Padhi and Pati (2017) to forecast tourism in India. The paper by Dergiades et al. (2018) takes into account the fact that queries for tourism products in one country come in different languages and from different search platforms. Using two procedures for non-causality testing, the authors find that, once these two factors are taken into account, the search intensity indicator better forecasts the number of foreign visitors. Tourism is not the only sector for which Google Trends is used to forecast demand. The analysis provided in the paper by Tijerina et al. (2020) confirms the viability of this tool for assessing patient interest in non-surgical facial treatments.
The above studies focus on solving specific business problems, be it housing sales or visits to a cosmetologist. Forecasting aggregate consumption is a more interesting task in terms of macroeconomics; however, it is appropriate to use the dynamics of consumer interest here as well. Fasulo et al. (2018) forecast household consumption expenditures in Italy using Google Trends data related to expenditure keywords. Scott and Varian (2015) use Google Trends categories to nowcast the University of Michigan Consumer Sentiment Index. Woo and Owen (2019) examine the contribution of Google Trends data to forecasting private consumption in the United States. For this, data on the consumption of long-term and short-term products and services are used. For each category of products, a linear regression is constructed that includes macroeconomic variables such as consumer confidence indices, a volatility index, real disposable income, and the rate on a three-month government bond. The authors use categorisations from the Bureau of Economic Analysis, which include words associated with each product consumption category (Vossen and Schmidt, 2011). These keywords are matched against categories in Google Trends. The principal components are extracted from the obtained trends. In addition to categories, the authors also consider queries for the keywords 'recession' and 'layoff'. They then attempt to forecast consumption by adding either components, keyword queries, or both to the model. The authors confirm that trends help better forecast changes in consumption. The paper by Pekar (2020) tests whether social media contain useful signals about future consumer spending beyond those contained in macroeconomic variables commonly used to forecast it. The study used predictors based on Google Trends data. Both social media posts and Google Trends data separately were shown to significantly improve consumer spending projections compared to models using only macroeconomic variables.
It should be noted that many of the above studies highlight the importance of justifying keyword and category choices for specific Google Trends series.
In our research, we attempt to assess the significance of various Google Trends data in forecasting retail sales. Using aggregated retail data, we can separately examine the demand for food and non-food products in the Russian economy. We use a detailed breakdown of retail categories by Rosstat in order to match them with Google Trends categories. Furthermore, we consider important macroeconomic variables that also potentially characterise the dynamics of retail sales.

Data
We consider the period from January 2015 to February 2021 inclusive. This study uses monthly Rosstat data on total retail sales revenue. We use indices of the physical volume of turnover of food and non-food retail trade as target variables. Google Trends series are viewed as a month against the same month of the previous year (seasonal difference) as in the papers by Vossen and Schmidt (2011) and Woo and Owen (2019), which helps avoid adjusting for the complex nature of seasonality in these series. In the paper by Bessonov (2003), it was noted that a seasonal wave that varies in the multiplicative representation can remain in a series even after seasonal differencing; however, according to the paper by Ghysels and Osborn (2001), it can be argued that, for example, even possible deterministic (in an additive form) and non-stationary stochastic seasonality is eliminated by seasonal differencing. In this connection, we assume that a significant part of the seasonal component is eliminated within the framework of the transformation under consideration. Also, considering the data in this form helps address the possible problem of non-stationarity caused by the saturation of the internet with users in the considered interval. Appendix A (see online version of this paper) shows the results of the augmented Dickey-Fuller test for the presence of unit roots in the series under consideration. For food products, the null hypothesis of the presence of non-stationarity is not rejected for four out of seven categories of trends, and for non-food products, for 21 out of 25.
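The 'month against the same month of the previous year' transformation can be illustrated with a minimal sketch. It assumes the transformation is applied as a difference of logarithms at a lag of 12 months; the exact functional form, the function name `seasonal_log_diff`, and the toy series are illustrative assumptions rather than details taken from the paper.

```python
import math

def seasonal_log_diff(series, period=12):
    """Return log(x_t) - log(x_{t-period}) for every t with a value one period earlier."""
    if any(x <= 0 for x in series):
        # Zero readings would need separate handling, which is not covered here.
        raise ValueError("log transform needs strictly positive values")
    return [math.log(series[t]) - math.log(series[t - period])
            for t in range(period, len(series))]

# Toy monthly series: a fixed 12-month seasonal pattern growing 10% per year.
pattern = [50, 40, 45, 55, 60, 70, 80, 75, 65, 60, 70, 100]
toy = [pattern[m % 12] * (1.10 ** (m // 12)) for m in range(36)]
yoy = seasonal_log_diff(toy)
```

In this toy case the seasonal pattern cancels and every transformed value equals the annual log growth rate, which is the sense in which a stable multiplicative seasonal component is removed. The first 12 observations are lost to the differencing.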
As macroeconomic variables, by analogy with the research by Woo and Owen (2019), we consider the Russian volatility index to control for changes in consumption due to fluctuations in the stock market. We also include in the model an indicator of consumer confidence from the Russian Public Opinion Research Centre (VCIOM), which shows how favourable the present time is for making large purchases in the opinion of Russian citizens. The higher the index value, the more favourable Russians consider the current moment for major purchases. We take into consideration the price of Brent crude oil since for Russia the price of oil has a significant impact on the economy, even if changes in this price are caused by global demand shocks (Kilian, 2009; Polbin et al., 2020), and as an alternative indicator we consider the real effective foreign exchange rate, the dynamics of which are largely related to the price of oil. Oil price data were deflated by the seasonally adjusted US CPI.
Appendix B provides a list of variables with links to sources. Figure 1 shows the dynamics of the listed macroeconomic variables. It is noteworthy that in April 2020, due to the consequences of the pandemic and widespread lockdowns, we can observe minimum values for the physical volume indices of sales in non-food retail, the oil price, and the consumer confidence index.
In order to select the proper Google Trends series for research, we turned to a detailed breakdown of aggregated retail data by product. We have combined some of the products into categories by meaning. The left column of Table 1 shows the categories of food and non-food items, and the right column shows the corresponding Google Trends categories. We decided to use categories as this would help identify general interest in products based on a variety of queries. In the case of keywords, it is not entirely obvious which queries to use, and it is highly likely that some queries that would help map the overall state of interest in these products are not taken into account. For several Rosstat food product categories, we did not succeed in finding a matching Google Trends category, and there were no data for four non-food categories, but most of the retail categories were consistent with Google Trends. Thus, we have selected seven trends for retail sales of food items and 25 for non-food items.

Source: Google Trends
Google Trends provides relative frequencies of queries for the selected category and scales the data from zero to 100 based on the selected period. The maximum value corresponds to the largest number of queries, while all other values are scaled to this maximum. We download data for all the selected trends since 2004 to obtain the true dynamics of the series. We take the data for Russia and for web searches. We transform the data to the 'month against the same month of the previous year' terms and consider the series in logarithms. We then truncate the series from 2015 for computational comparability. All of the above listed macroeconomic variables, with the exception of the consumer confidence index, we also take in logarithms and in terms of the 'month against the same month of the previous year'. To provide an example, we depict food and non-food categories in Google Trends in Figures 2 and 3.

Pseudo-out-of-sample nowcasting of aggregate retail using Google Trends
As retail data are released with a delay of about one month, it is often necessary to forecast this indicator only for a month ahead. In this section, we assess the contribution that Google Trends series make to forecasting food and non-food products using pseudo-out-of-sample nowcasting. To align the results for the convenience of display in graphs, we normalise all variables in the range from zero to 100. Such normalisation does not detract from the quality of forecasting models, but it will enable comparison of the forecasting errors of different retail categories. In order to take into account the influence of the previous values of the variables on the current reading of retail sales, we use the lagged values of macroeconomic variables, including target variables. Up to three lags inclusive are considered for each variable. Next, we create baseline models, which we will try to improve by adding trends.
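The normalisation to the range from zero to 100 and the construction of up to three lags can be sketched as follows. Min-max rescaling is assumed here as the normalisation method, since the text does not name one; the helper names `normalise_0_100` and `add_lags` and the toy values are hypothetical.

```python
def normalise_0_100(series):
    """Min-max rescale a series so that its minimum maps to 0 and its maximum to 100."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("constant series cannot be rescaled")
    return [100.0 * (x - lo) / (hi - lo) for x in series]

def add_lags(series, max_lag=3):
    """Build rows [x_t, x_{t-1}, ..., x_{t-max_lag}] for every t >= max_lag."""
    return [[series[t - k] for k in range(max_lag + 1)]
            for t in range(max_lag, len(series))]

values = [2.0, 4.0, 3.0, 6.0, 5.0]
scaled = normalise_0_100(values)     # [0.0, 50.0, 25.0, 100.0, 75.0]
rows = add_lags(scaled, max_lag=3)   # two usable rows remain after losing three to lags
```

Because the rescaling is an affine transformation of each variable, it changes the units of the regression coefficients and of the forecast errors but not the fit of a linear model with an intercept, which is consistent with the remark that normalisation does not detract from model quality.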
To do this, we consider all possible model combinations of no more than three explanatory variables (without trends), using the following methodology for each combination.
1) We split the sample into training and test samples. The test sample includes 40% of the last values of all variables. If lags are involved in the regression, then the size of the test sample is proportionally reduced.
2) On the training sample, we estimate the model using the ordinary least squares method.
3) We build a forecast on the test sample one month ahead and obtain an estimate of the target variable. We add this observation to the training sample and again evaluate the model, taking into account the new observation. We continue this procedure until the exhaustion of the test sample.
Finally, we calculate the mean absolute forecast error (MAE) using the estimated values of the target variable. From the combinations obtained, we select the ten (for the sake of transparency of results) models with the highest R². We will call these models the baseline models. We will add various combinations of trends (no more than two) to the baseline models and see if the values of the determination coefficient and forecast errors improve. Let us clarify that we are using the adjusted R² since this makes it possible to compare models with different numbers of regressors. We then repeat the methodology above for each series of models. Finally, we compare the obtained values of R² and MAE for regressions with and without trends. If the coefficient of determination has increased, and along with this the forecast error has decreased, then we judge that trends are improving forecasts for retail.
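The expanding-window procedure in steps 1) to 3) can be sketched as below. This is an illustration under stated assumptions, not the authors' code: it uses only the standard library (a small normal-equations solver stands in for a regression package), the function names are hypothetical, and the toy data are constructed so that the second regressor plays the role of an informative trend.

```python
def ols_fit(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented system [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting (collinear regressors not handled).
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        for i in range(k):
            if i != c:
                f = A[i][c] / A[c][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(beta, x):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))

def expanding_window_mae(X, y, test_share=0.4):
    """Re-estimate on all data before t, forecast t, then move one month forward."""
    n = len(y)
    n_test = int(round(n * test_share))
    errors = []
    for t in range(n - n_test, n):
        beta = ols_fit(X[:t], y[:t])
        errors.append(abs(y[t] - predict(beta, X[t])))
    return sum(errors) / len(errors)

# Toy data: the target depends on x1 (a "macro" variable) and x2 (a "trend").
x1 = [float(t) for t in range(20)]
x2 = [float((t * t) % 7) for t in range(20)]
y = [2.0 + 3.0 * a - b for a, b in zip(x1, x2)]
mae_base = expanding_window_mae([[a] for a in x1], y)                   # without the trend
mae_ext = expanding_window_mae([[a, b] for a, b in zip(x1, x2)], y)     # with the trend
```

Comparing `mae_base` with `mae_ext` mirrors the comparison of baseline and extended models; in this constructed example the extended model forecasts almost exactly because the target is an exact linear function of both regressors.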
The calculations use abbreviated notations for variables. Target variables are designated as food goods and nonfood goods (indices of physical volume of retail sales for food and non-food items, respectively). As for the explanatory variables, ipd stands for the consumer confidence index, real rate is the real exchange rate, oil is the real price of oil, and rvi is the volatility index. Adding '-1', '-2' and '-3' to the notation indicates the lag.
We rank the results in the tables by the R² obtained from the last model estimated in the course of the forecasts. Thus, this R² shows the proportion of the explained variation of the entire sample under consideration excluding the last observation. Tables 2 and 3 show the best models for forecasting the retail sales of food and non-food goods.
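For reference, the two quality measures used to compare models can be written in their standard textbook form. The symbols are assumptions introduced here: n is the number of observations used in estimation, k the number of regressors excluding the intercept, and H the number of one-month-ahead forecasts in the test sample.

```latex
\bar{R}^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - k - 1},
\qquad
\mathrm{MAE} = \frac{1}{H}\sum_{t=1}^{H}\left| y_t - \hat{y}_t \right|
```

The adjustment penalises additional regressors, so an extended model with one or two trends has a higher adjusted value only if the trends raise the explained variation by more than the loss of degrees of freedom.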

Source: authors' calculations
It is noteworthy that in each regression there are lagged target variables, and in a significant part of the regressions there are lagged indicators of consumer confidence and the exchange rate. The demand for food products is less volatile than for non-food products, so the previous retail values explain the current ones well. Owing to the significant share of imports among food products (29% at the end of 2020) and the existence of the law of one price for tradable goods, the exchange rate is also a factor explaining the dynamics of retail sales of food items. Finally, the consumer confidence indicator correlates with the consumer goods market as it characterises consumer purchasing power.
In the best regressions for non-food goods, there are no lagged target variables. The current value of the consumer confidence indicator, the oil price, and their first lags can be seen in almost all combinations of variables under consideration. The consumer confidence indicator reflects the demand for expensive purchases, including those that are part of non-food items, e.g., buying a car, furniture, or a mobile phone. Oil prices through the exchange rate channel affect the market for non-food goods through the purchase of intermediate goods from abroad and the import of finished products. Furthermore, oil price shocks have a direct impact on the economic well-being of households and, therefore, on their purchasing power.
Adding grocery trends to regressions resulted in 80 extended combinations. The statistics on the popularity of trends in the resulting models are shown in Table 4. Figure 4 shows the averaged forecasts for the first ten models, in which adding Google Trends series reduces the forecast error the most. It can be observed that prior to the pandemic the quality of forecasts for baseline and extended models does not differ significantly, while after January 2020 significant differences in forecasts are noticeable, and it becomes obvious that extended models are better at forecasting retail sales of food items in this period.
Since this forecasting method yields several values of R² (updated when the sample is increased by one month), it seems interesting to display their dynamics over time. We took ten extended models with the largest difference in determination coefficients compared to the baseline ones. At each step of forecasting, we averaged the values of R² for all the models under consideration. The results are shown in Figure 5. It can be observed that the values of the determination coefficient for the extended models gradually decrease over time, while R² for the baseline models sharply decreases after May 2020. At the end of the forecast period, R² of the baseline and extended models differ by a third.

Source: authors' calculations
Some of the extended regressions are presented in Table 5. The Baked Goods trend continues to make the largest contribution to the increase in the share of explained variation, and the Meat & Seafood trend is also observed in a significant part of the regressions. Other trends slightly reduce the forecast error and improve the value of R².
For non-food trends, it was found that in 206 regressions trends improve the coefficient of determination and reduce the forecast error. Table 6 shows statistics on the popularity of trends among the combinations obtained.

Source: authors' calculations
For non-food retail, extended models helped obtain more accurate forecasts over the period under consideration (Figure 6). However, the timing of the collapse during the pandemic could not be forecast using either the baseline or the extended models. All the models that improve the forecast error the most feature the Magazines trend.
Let us consider the dynamics of the averaged determination coefficients obtained from models with the maximum improvement of R² after adding trends (Figure 7). We can observe that the proportion of the explained variation over the entire period in the extended models is much higher than in the baseline ones. Furthermore, after the start of the pandemic, we can observe a significant increase in R² in the extended models, while in the baseline models an increase in the coefficient of determination is also observed, but to a much lesser degree. Table 7 shows some of the regressions with Google Trends added. The trends that maximise the coefficient of determination include Magazines as well as Motorcycles and Home Storage & Shelving. We can also observe that the Computer Hardware trend significantly helps increase the share of explained variation in non-food retail sales in the period under review.

Testing the predictive power of models for aggregated retail
For a more rigorous inspection of the quality of the obtained forecasts, we carry out the Diebold-Mariano test (Diebold and Mariano, 2002). This test allows us to compare the quality of the forecasts for model pairs with and without Google Trends series added. The null hypothesis implies that the two models yield the same forecast quality. Note that in the considered extended models the MAE of the forecast is strictly reduced, so we use a one-tailed test to test the null hypothesis. In addition to nowcasting, we also test the quality of longer recursive forecasts (up to and including ten months ahead). Thus, we use models in which the addition of trends significantly improves their quality for nowcasting, but we test their robustness for longer forecast horizons. By recursion we mean the following: for each forecast horizon, we replace the lagged values of the target variable (which are used for forecasting) with their estimates obtained in the previous steps. The forecast experiment is as follows.
1) We split the sample into training and test samples. The test sample includes the last 30 values of all variables, which is 40% of the entire sample. If the regression contains lagged variables, then the size of the training sample is reduced by the maximum lag value.
2) On the training sample, we estimate the model using the ordinary least squares method. We model forecasts on the assumption that future values of retail sales are unknown to us. In models with lagged values of the target variable, we take into account the fact that retail sales data are not published instantly but with a delay of approximately one month. Therefore, on the test sample, starting from step two, the real value of the first lag of retail sales is replaced by its estimate obtained at the previous step. Similarly, the second lag is replaced by its estimate from step three onwards, and the third lag from step four onwards.
3) We make a forecast for horizons two to ten months ahead.
4) We add one actual observation to the training sample (the next after the last observation of the training sample in the previous step) and re-evaluate the model. We continue this procedure until the exhaustion of the test sample.
We obtain forecasts for combinations with trends in the same manner. It should be noted that for the rest of the variables (apart from the target variable), we use their actual values at each step of the forecast; that is, in this case, we get forecasts conditional on these variables. The purpose of the experiment is to try to answer the question of whether trend series add essential additional information to forecasting models. We use the obtained pairs of forecast vectors for different forecast horizons to carry out the Diebold-Mariano test. Due to the low power of this test on short samples, we present the results at the 5% and 10% significance levels.
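The recursion over forecast horizons can be sketched as follows: the fitted coefficients are held fixed, the other regressors take their actual values, and each forecast of the target is fed back in as a lag for the next step. The function name, argument layout, and coefficient values are hypothetical and chosen only to make the mechanics visible.

```python
def recursive_forecast(intercept, lag_coefs, exog_coefs, history, future_exog):
    """Iterate a fitted linear model forward, feeding forecasts back in as lags.

    history: last observed target values, most recent last (at least len(lag_coefs)).
    future_exog: one row of actual regressor values per forecast step.
    """
    path = list(history)
    forecasts = []
    for x in future_exog:
        y_hat = intercept
        # lag_coefs[0] multiplies the first lag, lag_coefs[1] the second, and so on.
        y_hat += sum(b * path[-k] for k, b in enumerate(lag_coefs, start=1))
        y_hat += sum(c * v for c, v in zip(exog_coefs, x))
        forecasts.append(y_hat)
        path.append(y_hat)  # the estimate, not the actual value, becomes the next lag
    return forecasts

# One lag of the target and one other regressor, three steps ahead.
steps = recursive_forecast(
    intercept=1.0, lag_coefs=[0.5], exog_coefs=[2.0],
    history=[4.0], future_exog=[[1.0], [0.0], [1.0]],
)
```

The first step uses the last actual value of the target (4.0), while the second and third steps use the forecasts 5.0 and 3.5 in its place, so errors can accumulate with the horizon.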
If the p-value is greater than 0.05 (or 0.1), the difference between the forecasts of the two models is statistically insignificant. Note that, by analogy with the papers by Ulyankin (2020) and Gareev (2020), we make allowances for the small size of the test sample (Harvey et al., 1997) when calculating the Diebold-Mariano statistics. Table 8 shows the number of models that turned out to be significant after this test for various forecast horizons for food retail sales.
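A minimal sketch of the Diebold-Mariano statistic with the small-sample correction of Harvey et al. (1997) is given below. It assumes absolute-error loss (in line with the use of MAE above) and a rectangular-kernel long-run variance with h-1 autocovariances; the function name and the toy error vectors are illustrative, and the p-value step is left out because it needs a Student-t distribution function from a statistics library.

```python
import math

def dm_statistic_hln(errors_a, errors_b, horizon=1):
    """Harvey-Leybourne-Newbold-corrected Diebold-Mariano statistic, absolute-error loss.

    Positive values mean model A has the larger average absolute error.
    """
    d = [abs(a) - abs(b) for a, b in zip(errors_a, errors_b)]  # loss differential
    n = len(d)
    mean_d = sum(d) / n

    def autocov(lag):
        return sum((d[t] - mean_d) * (d[t - lag] - mean_d) for t in range(lag, n)) / n

    # Long-run variance of the loss differential for an h-step-ahead forecast.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, horizon))
    if long_run_var <= 0:
        raise ValueError("non-positive variance estimate; statistic undefined")
    dm = mean_d / math.sqrt(long_run_var / n)
    correction = math.sqrt((n + 1 - 2 * horizon + horizon * (horizon - 1) / n) / n)
    return correction * dm

# Toy forecast errors: baseline model (A) versus extended model (B).
baseline_err = [1.0, -2.0, 1.5, -1.0, 2.5, -1.5]
extended_err = [0.5, -1.0, 1.0, -0.5, 1.0, -1.0]
stat = dm_statistic_hln(baseline_err, extended_err, horizon=1)
```

Under the correction, the statistic is compared with critical values of the Student-t distribution with n-1 degrees of freedom; for the one-tailed alternative that the extended model is more accurate, large positive values lead to rejection of the null hypothesis.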

Source: authors' calculations
We can observe that at both the 5% and 10% significance levels for the nowcasting case the null hypothesis is often rejected in favour of the one-tailed alternative. We can also note that for each forecast horizon we found models whose forecasting power significantly improves after adding trends. The relative instability of the results (the number of such models) at the 5% significance level can be explained by the low power of the Diebold-Mariano test. Table 9 shows examples of baseline and extended models for food goods whose one-month-ahead forecast quality is statistically significantly different at the 10% significance level. Baked Goods and Meat & Seafood are still among the trends in the models with the greatest difference in R².
It is also interesting to look at the models obtained from recursive forecasting for several periods. As an illustration, we give examples of models with a ten-month-ahead forecast at the 10% significance level (Table 10). Note that the results of the Diebold-Mariano test for many models are significant both one month ahead and ten months ahead, which confirms the robustness of the results to changes in the length of the forecast horizon. Table 11 shows the number of models that turned out to be significant after the Diebold-Mariano test for non-food goods for forecasts of various lengths.

Source: authors' calculations
The less frequent rejection of the null hypothesis at the 5% level in some cases can be caused, as mentioned above, by the low power of the test on a short sample. Table 12 shows examples of models for non-food goods that turned out to be significant for the forecast horizon of one month at the 10% significance level. Magazines, Motorcycles, and Computer Hardware continue to be among the trends that maximise R². Table 13 shows the models with ten-month-ahead forecasts whose quality is statistically significantly different at the 10% significance level. Similar to the models with food goods, for some of the models the results turned out to be significant both for one and for ten forecasting steps.

Pseudo-out-of-sample nowcasting and forecasting of some product categories with Google Trends added
In this section, we discuss the breakdown of aggregate retail sales into specific product categories and try to understand how the information contained in Google Trends can be useful in forecasting these series. Table 14 shows the categories for which we were able to collect data. In total, there are nine categories of food products and twelve categories of non-food products.
It is important to note that above, we used only macroeconomic variables to forecast aggregated retail sales. Quality forecasting of each specific category of goods may require its own specific variables that are important for a particular industry. Since the selection of such variables is a separate research objective that goes beyond the scope of our study, we experimentally use the same macroeconomic variables as for aggregate retail. Such an experiment aims to answer the same question: whether trends bring in information essential for predicting the series under consideration. It is important to note that when searching for the best models for a specific product group, we do not limit the set of trends to only those that coincide thematically.
We repeated the forecasting experiment described in Section 4 for each product category. For food products, it was found that in 121 regressions trends significantly improve forecasts. The highest values of the determination coefficient were obtained for the categories Tea and Fresh vegetables. Examples of models for food products are shown in Table 15.

Source: authors' calculations
For non-food products, a total of 970 regressions were obtained, with the highest R² for the variables Computers, Medicines, and Building materials. Examples of baseline and extended models for non-food products are shown in Table 16.
We also carried out the Diebold-Mariano test for nowcast models based on the methodology outlined in Section 5. Again, in addition to nowcasting, we also look at forecasts up to ten months ahead. The results obtained for the 10% significance level are shown in Table 17. Note that the table shows those categories for which the value of R² turned out to be high enough.

Source: authors' calculations
It can be noted that for the Tea category the results of the Diebold-Mariano test are quite stable for all forecast horizons. Table 18 shows examples of baseline and extended models for food products retail sales whose forecast quality is statistically significantly different both one month ahead and ten months ahead at the 10% significance level.
The results of the Diebold-Mariano test for non-food products are shown in Table 19. It is noticeable that for the Computers category, the share of combinations where the addition of trends significantly improves nowcasting is several times higher than for any other considered category of food and non-food products. Also, for this category, a fairly high number of combinations with trends that significantly improve the forecasting power of the models remains across the different forecast horizons. Table 20 shows examples of baseline and extended models for the retail sales of non-food products whose forecast quality is statistically significantly different both one month ahead and ten months ahead at the 10% significance level.

Conclusion
In this paper, we attempted to assess the contribution that Google Trends series on search queries for various product groups make to the predictive power of models for food and non-food retail sales, as well as sales of specific product groups. We used linear regression models for pseudo-out-of-sample nowcasting with the addition of trends. We used the Diebold-Mariano test to test the quality of the models' forecasts, both for the case of nowcasting and for recursive forecasting several months ahead. For the baseline models, we used such macroeconomic variables as the Russian volatility index, the price of Brent crude oil, the real effective exchange rate, and the consumer confidence indicator from the VCIOM.
We found that in the considered time interval with pseudo-out-of-sample nowcasting adding trends can help improve the quality of models and their forecasting power. This is especially noticeable in models for non-food retail sales where the value of the coefficient of determination increases in the best models by about 0.2. The forecasting power of the models improved significantly for recursive forecasts as well. As for individual product categories, we also obtained results indicating a significant improvement in the forecasting power of the models after adding trends.
It is important to note that the results described above have been obtained on queries in the Google search engine, which occupies only about a third of the search engine market in Russia, but Yandex, which is the leader in the Russian search engine market, as of the time of writing, does not provide data that satisfy our research needs. It is also noteworthy that a significant part of the test sample overlaps with the period of a strong shock to global demand caused by the pandemic, which adds to the significance of the results obtained.
As areas for further research, the use of trends can be considered both for forecasting total consumption in the Russian economy and for forecasting retail sales of specific Russian retailers. Also, a separate area for research can be the selection of specific variables that are essential for predicting certain categories of retail. The results obtained in this paper can be useful both to specialised bodies responsible for the relevant areas of economic policy and to private retailers for improving business performance.