Personal data usage and privacy considerations in the COVID-19 global pandemic

Data has become increasingly important and valuable for both scientists and health authorities searching for answers to the COVID-19 crisis. Due to difficulties in diagnosing this infection in populations around the world, initiatives supported by digital technologies are being developed by governments and private companies to enable the tracking of the public’s symptoms, contacts and movements. In the current scenario, initiatives designed to support infection surveillance and monitoring are essential. Nonetheless, ethical, legal and technical questions abound regarding the amount and types of personal data being collected, processed, shared and used in the name of public health, as well as the concurrent or subsequent use of this data. These challenges demonstrate the need for new models of responsible and transparent data and technology governance in efforts to control SARS-CoV-2, as well as in future public health emergencies.

The growing production and use of data, made possible by increasingly powerful and specialized digital technologies, has enabled new forms of knowledge production through sophisticated computational modeling and algorithms. In this new setting, data becomes ever more important and valuable across social, political and economic spheres 1 . During the COVID-19 pandemic, the introduction of a previously unidentified etiological agent and the peculiarities of its accompanying disease present challenges and pose risks to the lives and health of populations worldwide, necessitating an urgent response. As a result, personal data from diverse sources, along with data from laboratories and hospitals, among others, has been requisitioned under the presumption of ethical and legal usage to investigate scientific questions based on population characteristics.
A worldwide effort by scientists, organizations and health practitioners is being undertaken to close gaps in knowledge as quickly as possible to enable health authorities to introduce efficient clinical management and prevention measures to address the pandemic, including the agile implementation of improved diagnostic capacity and the rehabilitation of COVID-19 cases in a timely manner. These actions require coordination between governmental measures and different segments of society in order to maximize disease control efforts.
The WHO has advised that each country, in accordance with respective risk assessments, be prepared to respond to possible scenarios and rapidly implement necessary measures to reduce viral transmission and minimize economic and social impacts 2 . As a result, high-quality data is needed to assess basic epidemiological patterns. Unfortunately, uncertainties surrounding COVID-19 also extend to the quality of data available to researchers, not only with respect to understanding underlying epidemiological patterns of disease, but also in the construction of mathematical models aimed at providing evidence to support decision making at diverse levels.
Considering the enormous burden posed by diagnosing infection in the general population, technological initiatives have been developed to enable the tracking of citizens' symptoms, contacts and movements, elements considered essential to the design of infection surveillance strategies by governments. Great hope lies in the development of applications that collect data on individuals, including their geolocation information and movements 3 . These practices raise questions regarding the type and amount of data required, and ethical, legal and technical challenges permeate related data collection, access, sharing and usage issues 4,5 .
Apple and Google recently announced the joint development of a tool to track COVID-19 infection in a partnership aimed at ensuring interoperability between iOS and Android operating systems. According to the companies, users can opt in at their discretion, but there has been no mention of an option to subsequently withdraw consent. The tool, according to published specifications 6 , bears similarities to other contact tracing solutions, broadly inspired by those already in operation in Singapore and proposals under development in Europe, such as DP-3T (Decentralized Privacy-Preserving Proximity Tracing) 7 or the PEPP-PT (Pan-European Privacy-Preserving Proximity Tracing) 8 project and MIT Safe Paths Platform, which seek to maximize privacy 9 .
These mobile system solutions, which can be broadly classified as contact tracing systems, generally function through the short-range exchange of anonymous identifiers via Bluetooth technology. Depending on the solution, an application made available by national health authorities can be installed, or the technology may eventually be "baked into" operating systems. Users who receive positive coronavirus test results register their status in the application, which then communicates this to respective health authorities; others with whom the user came into close contact during the previous 14 days will also receive alerts 10 . As these technologies are still under development and maturing, differences in implementation could, over time, prove very significant; as an example, consider what appears to be the centralized focus of the PEPP-PT compared to the decentralized approach of DP-3T.
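The mechanism described above can be illustrated with a minimal sketch of decentralized matching. It is not the Apple/Google, DP-3T or PEPP-PT protocol: the class `Phone`, the 15-minute rotation (`SLOTS_PER_DAY`), the HMAC-based derivation and the 16-byte identifier length are all assumptions made for illustration. Only the 14-day window and the idea of exchanging anonymous identifiers come from the text.

```python
from __future__ import annotations

import hashlib
import hmac
import os
from datetime import date, timedelta

RETENTION_DAYS = 14  # contact window mentioned in the text
SLOTS_PER_DAY = 96   # assumption: identifiers rotate every 15 minutes


def ephemeral_ids(daily_key: bytes, day: date) -> list[bytes]:
    """Derive the rotating identifiers a phone would broadcast on one day."""
    return [
        hmac.new(daily_key, f"{day.isoformat()}:{slot}".encode(),
                 hashlib.sha256).digest()[:16]
        for slot in range(SLOTS_PER_DAY)
    ]


class Phone:
    """Hypothetical device: keeps its keys and observations locally."""

    def __init__(self) -> None:
        self.daily_keys: dict[date, bytes] = {}  # secret until the user reports
        self.heard: dict[date, set[bytes]] = {}  # identifiers observed nearby

    def key_for(self, day: date) -> bytes:
        return self.daily_keys.setdefault(day, os.urandom(32))

    def broadcast(self, day: date, slot: int) -> bytes:
        return ephemeral_ids(self.key_for(day), day)[slot]

    def observe(self, day: date, identifier: bytes) -> None:
        self.heard.setdefault(day, set()).add(identifier)

    def report_positive(self, today: date) -> dict[date, bytes]:
        """Daily keys the user agrees to publish after a positive test."""
        cutoff = today - timedelta(days=RETENTION_DAYS)
        return {d: k for d, k in self.daily_keys.items() if cutoff < d <= today}

    def exposed(self, published: dict[date, bytes], today: date) -> bool:
        """Re-derive identifiers from published keys and match them locally."""
        cutoff = today - timedelta(days=RETENTION_DAYS)
        for day, key in published.items():
            if not (cutoff < day <= today):
                continue
            if self.heard.get(day, set()) & set(ephemeral_ids(key, day)):
                return True
        return False


# Usage: Bob was near Alice three days before her positive test; Carol was not.
today = date(2020, 4, 20)
met = today - timedelta(days=3)
alice, bob, carol = Phone(), Phone(), Phone()
bob.observe(met, alice.broadcast(met, slot=10))
published = alice.report_positive(today)
print(bob.exposed(published, today), carol.exposed(published, today))
```

In this sketch the matching happens on each phone, so the party that distributes `published` never learns who met whom. A centralized design would instead perform the matching on a server, which is the kind of implementation difference the paragraph above refers to.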
The current outlook for the coronavirus epidemic indicates that, during the next phases in which society will continue adapting to living with the virus, personal data and applications or devices will play a prominent role not only in gauging contact, but also in verifying compliance with isolation or quarantine measures, which may extend to probabilistic contagion verification or managing permissions for citizens to go out in public, among many other uses.
It is important to remember that data collection through applications and smartphones requires access to these technologies, and users must be familiar with them; this implies that the data collected will be representative only of certain population groups. Accordingly, adopted measures must consider health inequalities and accommodate differences in the impacts of solutions on diverse segments of populations.
In addition to location tracking, encouraging users to self-report symptoms and automatically sending alerts about possible contact with infected individuals, personal data, such as patient health information, is being used in other ways. In the UK, for example, government agencies have been working with technology companies to build a COVID-19 repository containing patient data. These companies were hired by the National Health Service (NHS) to assist with the development of predictive models using artificial intelligence and patient data. The initiative was justified by the need for real-time information on the burden on health services, drawing on hospitalization data and intensive care bed availability, as well as equipment and supply needs.
The NHS has declared that the data in this repository is confidential, anonymized and stored in a government database, and that it will remain under its control and subject to severe restrictions under data protection legislation; nonetheless, the initiative has aroused the public's mistrust regarding ethical, privacy and data protection aspects of these citizens' private information 11 .
Questions and challenges have been raised regarding the public's trust in the institutions, whether governmental or private, responsible for processing personal data.
This wariness and questioning does not aim to prevent the use of data in the response to the pandemic, but highlights the need to establish safeguards to ensure a balance between individual and collective interests as well as to increase societal confidence in the institutions processing data for public health purposes 12,13 . In Brazil, the General Data Protection Law (LGPD), which was approved and sanctioned in 2018, is scheduled to take effect in August 2020; however, this could change as bills currently under consideration by the Brazilian congress seek to delay adoption until 2021.
LGPD represents a milestone in the regulation of personal data, since it applies to all personal data handling operations, including in the arena of digital media, whether by individuals or public and private companies. This law was devised to protect the fundamental rights of freedom and privacy 14 .
Informed self-determination is undoubtedly a fundamental aspect to be taken into consideration regarding the use of personal data, together with guarantees of transparency, security and the minimization of data usage. However, in the case of emergency situations and others in the public interest, such as a public health crisis, the use of personal data is allowed in the absence of citizens' consent, provided that safeguards are put in place, the data is used precisely to achieve specified purposes and the agencies authorized to process data are qualified in accordance with regulations. Compliance with general data protection laws, therefore, requires technology, infrastructure and specialized personnel to ensure that personal data are processed in a lawful, fair and responsible manner. Moreover, accountability must be guaranteed through the monitoring of data processing activities by designated authorities authorized to apply sanctions in the case of transgressions. In some countries, partnerships between government, universities and research institutes have created data centers to process and provide access to anonymized data in a secure and controlled manner to support investigative research in the public interest 19 .
Anonymized or aggregated data are not considered personal data by data protection laws, since the identification of individuals is protected. However, even without referring to any specific individuals, groups could still be harmed due to the aggregation of information on locale, ethnicity, health situations and socioeconomic conditions, necessitating ethical scrutiny regarding the potential benefits generated by such evidence.
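As a hedged illustration of the point above, the sketch below aggregates individual records into group counts and suppresses small groups. The function name, the field names and the threshold of 10 are assumptions chosen for the example, not requirements of any data protection law.

```python
from collections import Counter

MIN_CELL_SIZE = 10  # assumption: illustrative threshold, not a legal standard


def aggregate_with_suppression(records, keys, min_cell_size=MIN_CELL_SIZE):
    """Count records per group and hide counts below the threshold."""
    counts = Counter(tuple(record[k] for k in keys) for record in records)
    return {
        group: (n if n >= min_cell_size else None)  # None marks a suppressed cell
        for group, n in counts.items()
    }


# Hypothetical case records: direct identifiers are never read by the function.
records = (
    [{"name": f"p{i}", "district": "District A", "age_band": "60+"} for i in range(12)]
    + [{"name": f"q{i}", "district": "District B", "age_band": "60+"} for i in range(3)]
)
table = aggregate_with_suppression(records, keys=("district", "age_band"))
print(table)
```

Suppression of small cells lowers the risk of singling out individuals, but the published table still links a locale and an age band to an infection count, which is precisely the kind of group-level information that calls for the ethical scrutiny described above.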
Linnet Taylor has called attention to the fact that no protection exists against irresponsible technologies, as data protection laws focus exclusively on the protection of personal data, yet do not cover the freedoms and political rights of collective groups. Civil society groups must be allowed to participate in the governance of technologies. Taylor argues that technology companies must be transparent and accountable to society in order to validate their legitimacy as actors on behalf of the government and population, at least those companies that, in light of the pandemic, have partnered with governments and now participate in the governance of citizens' data 20 .
Considering that data can be used and shared by different people and organizations simultaneously, the main issues that need to be addressed pertain to responsible data governance based on transparency and citizen empowerment to fortify trust and establish balanced and fair relationships between individuals and organizations 21 .
The legitimacy of collecting, processing, sharing and using personal data does not come from access to this data, but rather from trust in whoever possesses it, treats it with transparency and operates within legal parameters. From this perspective, the use of personal data to face COVID-19 and future public health emergencies must be guided by transparency, verification and accountability, beginning with collection and extending onwards to processing operations and the purported use of data, as well as by whom and for how long 22 .
Clear and transparent terms and conditions must be applied regarding the access, sharing and use of the data collected in the name of public health, especially by private companies or through public-private partnerships. How and by whom will this data be accessed, processed and used? Will the data be stored, reused or discarded after the initial objectives are achieved? How will the data be protected? In the case of abuse or neglect, who will be held responsible? These and other questions should be asked and answered explicitly.
Another regulatory aspect that deserves attention pertains to intellectual property rights, as the selection, organization or availability of data stored inside databases is protected by intellectual property rights 23 . Databases that can be integrated with data from other sources to support the development of new technologies, including treatment and prevention technologies for COVID-19, will be subject to ownership rights and could potentially incur costs related to access.
Partnerships between governments, technology companies and universities are necessary to enable the extraction of reliable knowledge from large volumes of data.
The agreements covering these ventures must clearly specify the roles of the parties involved, as well as usage of the presumed and achieved results. The establishment of protocols with guiding principles providing for the agile and practical application of data processing in cases of collective interest, such as the current health emergency, is urgently needed, especially considering the national and transnational utilization of personal data collected by companies around the world.
Responsible data governance also entails the description of data processing and analysis methodologies, as data can be provided as proof, as evidence, in decision making for both public policy and science 24 . Importantly, any machine learning-based algorithm is representative of the pattern or regularity of what it was intended to measure.
Algorithms are powerful and important resources that cannot be separated from causal explanations due to the risks of making decisions based only on automated results and predictions. The predominant role of the scientific method is thus to validate and increase the reliability and usefulness of results. Indeed, science is concerned with the questioning of assumptions, values and biases in order to distinguish opinions from evidence.
Regulations are the only mechanism capable of establishing limits on the processing of personal data by governments and private corporations, even in a health crisis, and of avoiding the negative impacts of temporary relaxations that have the potential to become permanent, as was seen in the United States following the September 11th attacks in 2001. Public surveillance strategies were introduced using existing and emerging technologies at the time, justified by the need to monitor suspicious individuals and prevent future terrorist attacks, resulting in legal changes driven by the fear instilled in society 25 .
The adoption of more just, responsible and sustainable data governance models, designed to protect and defend ethical and regulatory principles, serves to increase the confidence of individuals and society in the use of personal data to respond to situations of legitimate public interest. Aspects related to privacy rights, the protection of personal data and the rights of groups do not preclude the use of personal data, especially in response to a pandemic. The public health emergency provoked by SARS-CoV-2 highlights the pressing need for new forms of personal data governance that include civil society, with the goal of promoting equitable benefits for society as a whole.

Matta, E. Rabello and F. Gouveia supported the writing of the original article and were also responsible for editing and revising the text. The authors would like to thank Andris K. Walter for English language revision and copyediting services.