Digit analysis for Covid-19 reported data


  • Jean-François Coeurjolly


The coronavirus that appeared in December 2019 in Wuhan has spread worldwide and caused the death of more than 330,000 people (as of May 29, 2020, the submission date of the present article). Since February 2020, doubts have been raised about the numbers of confirmed cases and deaths reported by the Chinese government. In this paper, we examine data available from China at the city and provincial levels and compare them with Canadian provincial data, US state data and French regional data. We consider cumulative and daily numbers of confirmed cases and deaths and examine these numbers through the lens of their first two digits. In particular, we measure departures of these first two digits from the Newcomb-Benford distribution, which is often used to detect fraud. We find no evidence that the first- or second-digit distributions of cumulative and daily numbers of confirmed cases and deaths differ across these countries. We also show that the Newcomb-Benford distribution cannot be rejected for these data.
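To make the digit analysis concrete, the sketch below computes the Newcomb-Benford probabilities for the first and second significant digits and a Pearson chi-square statistic measuring how far a set of observed digits departs from them. It is a minimal illustration under stated assumptions, not necessarily the test procedure used in the paper: the function names are hypothetical, the counts are made up for demonstration and are not Covid-19 data, and the choice of a chi-square statistic is an assumption.

```python
import math
from collections import Counter


def benford_first_digit():
    """Newcomb-Benford law for the first digit: P(D1 = d) = log10(1 + 1/d), d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def benford_second_digit():
    """Newcomb-Benford law for the second digit:
    P(D2 = d) = sum over d1 = 1..9 of log10(1 + 1/(10*d1 + d)), d = 0..9."""
    return {
        d: sum(math.log10(1 + 1 / (10 * d1 + d)) for d1 in range(1, 10))
        for d in range(10)
    }


def leading_digits(n):
    """Return (first, second) digits of a positive integer; second is None if n < 10."""
    s = str(n)
    return int(s[0]), (int(s[1]) if len(s) > 1 else None)


def chi_square_stat(observed_digits, expected_probs):
    """Pearson chi-square statistic of observed digits against expected probabilities."""
    n = len(observed_digits)
    counts = Counter(observed_digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in expected_probs.items()
    )


# Hypothetical reported counts, for illustration only (not real data).
counts = [1, 3, 12, 27, 45, 110, 198, 340, 571, 830, 1287, 1975, 2744, 4515]

firsts = [leading_digits(c)[0] for c in counts if c > 0]
seconds = [leading_digits(c)[1] for c in counts if c >= 10]

stat_first = chi_square_stat(firsts, benford_first_digit())
stat_second = chi_square_stat(seconds, benford_second_digit())
print(f"first-digit chi-square:  {stat_first:.3f}")
print(f"second-digit chi-square: {stat_second:.3f}")
```

For a large sample, the first-digit statistic would typically be compared with a chi-square distribution with 8 degrees of freedom (5% critical value about 15.507), and the second-digit statistic with 9 degrees of freedom (about 16.919). With a sample as small as the one above, that approximation is unreliable, so the printed values only illustrate the mechanics.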

