Data collection and privacy leaks in China during Covid-19

On: October 3, 2021


The data collection and management measures China adopted during the Covid-19 pandemic have caused considerable controversy. At the beginning of the epidemic, people from Wuhan were asked to register their personal information in Excel spreadsheets when traveling to other places. These spreadsheets, which contained private information such as addresses and phone numbers, spread online as a warning to residents to beware of people from Wuhan. As a consequence, people from Wuhan received harassing calls and offensive messages telling them to return to the ‘virus-stricken city’. (South China Morning Post, 2020) People surrendered personal information to help fight the pandemic, but the government did not protect the data well, leading to rampant leaks.

A robust surveillance system has since been built to collect and monitor big data about the pandemic. People upload information including ID numbers, home addresses, health records, and travel histories for the previous 14 days to obtain health codes and itinerary codes. Based on the information uploaded by users and data collected from public services, the system decides the color of each person’s QR code, which represents their level of virus risk.

The green code allows for unlimited mobility, while the yellow one requires a one-week quarantine and the red one requires a two-week quarantine.
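The color rules described above can be written down as a simple lookup. The following is only a minimal sketch: the names `HealthCode`, `QUARANTINE_DAYS`, and `may_travel_freely` are hypothetical, and the sketch covers only the color-to-quarantine mapping stated in this article, not the undisclosed logic by which the real system assigns a color.

```python
from enum import Enum


class HealthCode(Enum):
    """The three QR code colors described in the article."""
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


# Quarantine length in days for each color, as stated in the article.
# How a color is actually assigned is not public and is not modeled here.
QUARANTINE_DAYS = {
    HealthCode.GREEN: 0,    # unlimited mobility
    HealthCode.YELLOW: 7,   # one-week quarantine
    HealthCode.RED: 14,     # two-week quarantine
}


def may_travel_freely(code: HealthCode) -> bool:
    """Return True only when the code carries no quarantine requirement."""
    return QUARANTINE_DAYS[code] == 0
```

Under this sketch only a green code passes `may_travel_freely`, which mirrors how a single color decides access to transport and public places.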

Privacy concerns obscured by collective discourse

According to a discourse analysis of mainland Chinese news reports during the Covid-19 pandemic, personal privacy concerns ranked lower than concepts related to collectivism and communitarianism. (Liu & Zhao, 2021) With mass social mobilization under the general rhetoric of patriotism and collectivism, public health is heavily emphasized at the expense of privacy. People are required to have their health codes scanned before taking public transportation or entering public places, which is regarded as a way to ensure public safety. However, they are not informed that their current location is sent to the system each time a code is scanned, allowing the authorities to track their movements. (The New York Times, 2020) This data collection technology may be helpful for epidemiological surveys, but it violates individual privacy.

The erosion of citizens’ privacy could also hinder public health by weakening trust and promoting dissent. (Ada Lovelace Institute, 2020) Although complaints about the government are largely invisible because of strict online restrictions, privacy leaks cast a shadow over anti-epidemic work. The results of epidemiological surveys of Covid-19 patients are posted online, including age, gender, address, and travel trajectory. These results are meant to warn the public away from places that patients have visited. However, leaks of detailed personal information from the surveys foster ‘human flesh searches’, in which internet users collectively identify and expose individuals. It is reported that some patients have had to endure online gossip and personal attacks while receiving treatment, which harms their mental health. (Global Times, 2021)

Health codes become new identity signals

The health code has become deeply embedded in daily life in China. A person who has been to a ‘medium-risk’ or ‘high-risk’ place may find their health code turning yellow or red, which means mandatory quarantine. There have been cases in which people with yellow codes were not allowed to go to work even though they showed no symptoms of Covid-19. (The New York Times, 2020)

In the context of the pandemic, health codes have become new identity signals, indicating how closely a person is associated with the virus. The system divides people not only by their health condition but also by where they have been, instilling a form of regional prejudice.

Dataveillance in the context of the epidemic

It has been argued that the health code transforms health into a tool for social control and discipline. (Cong, 2021) As Foucault has stated, such technologies aim to produce docile, compliant bodies through containment measures and the power of the communal gaze and shaming. (Foucault, 1991) The technique of the health code enhances the capacity of disciplinary powers, while simultaneously making individuals and the population more susceptible to measurement and control. (Cong, 2021)

It is important to highlight that people need to update their information themselves to keep their health codes valid. Although the self-reporting system seems to rely on people’s free will, it is closer to a formality, since their data and traces are already monitored by the government. Cong indicates that self-reporting aims to transform power wielded by the authorities into individual self-discipline, resulting in a finer-grained and more covert form of control that each user internalizes. In a pandemic, where anyone could spread the infection to others, this internalized form of control gives individuals a sense of collective responsibility, solidarity, and security. (Cong, 2021)

Even though citizens support the use of these technologies for the prevention and control of the epidemic, there are also concerns about the possible abuse of surveillance systems after the pandemic. In Hangzhou, the government floated a contentious proposal to link the health code with individual health indicators in order to collect and analyze data on citizens’ medical records and lifestyle management. The proposal was withdrawn after receiving widespread criticism for its obvious disregard for individual privacy. Following the statement of Boyd and Crawford that ‘data are not generic and context is important’ (Boyd & Crawford, 2012), we must keep in mind that data and data sensitivity are highly contextual. Location data might be beneficial for epidemiological study, but they could also be misused for non-health purposes, such as re-calibrating political power relations. (Zwitter & Gstrein, 2020)

Individual privacy or public health?

Although the surveillance system was invasive, the public viewed the data collection as positive and effective in fighting the epidemic. (Liu & Graham, 2021) Some argue that public health must take precedence over civil liberties given the urgency of the situation and the severity of the disease. (Tony Blair Institute for Global Change, 2020) Others counter that the trade-off may be a false one if these technologies fail to deliver the expected public health benefits. (Kitchin, 2020)

It is hard to evaluate how effective China’s surveillance system was in preventing and controlling the pandemic. The health code does not trace actual close contacts but instead calculates a person’s risk level. Without knowing the assumptions and data used in the algorithm, it is impossible to assess its effectiveness. (Cong, 2021) The technical feasibility and validity of the system are also in doubt, as people could switch off their mobile phones while traveling through ‘medium-risk’ or ‘high-risk’ areas.

There are no definite conclusions about how to balance public health and personal privacy. However, the possible misuse of big data after the epidemic requires widespread caution. Privacy and data protection are critical values that do not disappear in a crisis. (Zwitter & Gstrein, 2020) The risk of invasion of privacy should be kept to a minimum. Under no circumstances should rampant privacy leaks occur.


Ada Lovelace Institute. (2020). Exit Through The App Store. April 20.

Bamford, R., Dace, H., Macon-Cooney, B., & Yiu, C. (2020). A Price Worth Paying: Tech, Privacy and the Fight Against Covid-19. Tony Blair Institute for Global Change.

Boyd, D., & Crawford, K. (2012). Critical Questions for Big Data. Information, Communication & Society, 15(5), 662–679.

Cong, W. (2021). From Pandemic Control to Data-Driven Governance: The Case of China’s Health Code. Frontiers in Political Science, 3, 8.

Foucault, M. (1991). Governmentality. In G. Burchell, C. Gordon, & P. Miller (Eds.), The Foucault effect: Studies in governmentality (pp. 87–104). University of Chicago Press.

Kitchin, R. (2020). Civil liberties or public health, or civil liberties and public health? Using surveillance technologies to tackle the spread of COVID-19. Space and Polity, 24(3), 362–381.

Liu, C., & Graham, R. (2021). Making sense of algorithms: Relational perception of contact tracing and risk assessment during COVID-19. Big Data & Society, 8(1).

Liu, J., & Zhao, H. (2021). Privacy lost: Appropriating surveillance technology in China’s fight against COVID-19. Business Horizons.

Mozur, P., Zhong, R., & Krolik, A. (2020, March 2). In Coronavirus Fight, China Gives Citizens a Color Code, With Red Flags. The New York Times.

Shen, X. (2020, May 12). Personal information collected to fight Covid-19 is being spread online in China. South China Morning Post.

Xu, K. (2021, January 21). Leaks of personal information from epidemiological survey cast shadow over anti-epidemic work. Global Times.

Zwitter, A., & Gstrein, O. J. (2020). Big data, privacy and COVID-19 – learning from humanitarian expertise in data protection. Journal of International Humanitarian Action, 5(1), 4.
