Hard Numbers: Open Consumer Price Database

Abstract

We document a new source of consumer price microdata. The database gives researchers studying consumer price behaviour access to current, granular raw statistical observations. The observed prices fully cover the goods and services in Rosstat’s CPI sample and extend beyond it. In this paper, we pursue two objectives. First, we describe the data collection mechanism, the data structure, and the access protocols, and provide four complete illustrations of their application using an open API: i) training machine learning models for product classification based on text labels, ii) real-time tracking of product prices, iii) estimating hedonic regressions for product groups, and iv) calculating arbitrary analytical price indices. Second, we share a set of basic skills and technologies for the benefit of researchers interested in creating their own sources of alternative data.