The correspondence between the structure of the terrestrial mobility network and the spreading of COVID-19 in Brazil

The inter-cities mobility network is of great importance in understanding outbreaks, especially in Brazil, a continental-dimension country. We adopt the data from the Brazilian Ministry of Health and the terrestrial flow of people between cities from the Brazilian Institute of Geography and Statistics database in two scales: cities from Brazil, without the North region, and from the São Paulo State. Grounded on the complex networks approach, and considering that the mobility network serves as a proxy for the SARS-CoV-2 spreading, the nodes and edges represent cities and flows, respectively. Network centrality measures such as strength and degree are ranked and compared to the list of cities, ordered according to the day that they confirmed the first case of COVID-19. The strength measure captures the cities with a higher vulnerability of receiving new cases. Besides, it follows the interiorization process of SARS-CoV-2 in the São Paulo State when the network flows are above specific thresholds. Some countryside cities such as Feira de Santana (Bahia State), Ribeirão Preto (São Paulo State), and Caruaru (Pernambuco State) have strength comparable to states' capitals. Our analysis offers additional tools for understanding and decision support to inter-cities mobility interventions regarding the SARS-CoV-2 and other epidemics.

Keywords: COVID-19; Public Health Surveillance; Epidemics

This article is published in Open Access under the Creative Commons Attribution license, which allows use, distribution, and reproduction in any medium, without restrictions, as long as the original work is correctly cited.

Freitas VLS et al. Cad. Saúde Pública 2020; 36(9):e00184820

Introduction
The world is currently facing a global public health emergency due to the COVID-19 pandemic, declared on March 11th, 2020 by the World Health Organization (WHO) 1. As of June 4th, 2020, more than 6.7 million cases have been confirmed worldwide, with almost 400,000 deaths.
In Brazil, the first documented case was in the city of São Paulo on February 26th, 2020. Since then, about 615,000 cases and 34,000 deaths have been confirmed in the national territory 2 (Worldometers COVID-19: coronavirus pandemic. https://www.worldometers.info/coronavirus/, accessed on 15/May/2020; Ministério da Saúde. Painel coronavírus. https://covid.saude.gov.br/, accessed on 14/May/2020).

The inter-cities mobility network serves as a proxy for the transmission network, vital for understanding outbreaks, especially in Brazil, a continental-dimension country 3,4,5,6,7. The complex network approach 8 emerges as a natural mechanism to handle mobility data computationally, taking areas as nodes (fixed) and movements between origins and destinations as connections (flows). Some network measures can be used to find the structurally more vulnerable areas in the context of the current study. The degree of a node is the number of cities that it is connected to, showing the number of possible destinations. The strength captures the total number of people that travel to (or come from) such places in each time frame. From a probability perspective, the cities that receive more people are more vulnerable to SARS-CoV-2. The betweenness centrality, on the other hand, considers the entire network to depict the topological importance of a city in the routes that are more likely to be used.
In this context, this work aims to investigate the correspondence between network measures and the emergence of cities with confirmed cases of COVID-19 in two scales: Brazil and the State of São Paulo. Specifically, we analyze (i) the Brazilian inter-cities mobility networks under different flow thresholds to neglect the lowest-frequency trips, especially in the beginning of the outbreak, when the interiorization of the disease was not yet in progress; and (ii) the correspondence between the network statistics and the spreading of COVID-19 in Brazil.

Methods
The most common mobility data used in studies of this nature in Brazil are the pendular (commuting) travels from the 2010 national census 9. In this research, we use the Brazilian Institute of Geography and Statistics (IBGE) roads data from 2016 10, which contain the flows between cities considering terrestrial vehicles for which it is possible to buy a ticket (mainly buses and vans). This information seeks to quantify the interconnection between cities, the movement of attraction that urban centers carry out for the consumption of goods and services, and the long-distance connectivity of Brazilian cities. The North region is not included in this paper, because neither the fluvial nor the air modes are covered, and their roles are crucial to understanding the spreading process there, especially in the Amazon region. According to an investigation of the seroprevalence of antibodies to SARS-CoV-2 11, Northern cities are among the ones with the highest values, and six of them are located along a 2,000 km stretch of the Amazon river.
The above-cited IBGE data 10 contain the travel frequency (flow) between pairs of Brazilian cities/districts in a general/typical week, considering only origins and destinations, without any information about possible connections between them. The frequencies are aggregated within the round trip, which means that the number of travels from city A to city B is the same as from B to A. We produce two types of undirected networks with a different number N of nodes to capture actions in two scales (country and state):
(1) N = 4,987, Brazil without the North region: nodes are cities and edges are the flow of direct travels between them.
(2) N = 620, São Paulo State: a subset of the previous network, containing only cities within the São Paulo State. For simplicity, no further analysis is performed to evaluate the dependency of the network in relation to the state neighboring cities.
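The construction of the two networks can be illustrated with a minimal sketch. It assumes a hypothetical record layout (city, state, city, state, weekly flow) and invented flow values; the names `RECORDS`, `build_network`, and `state_subnetwork` are illustrative and do not come from the IBGE data or from the original analysis.

```python
from collections import defaultdict

# Hypothetical records: (city_a, state_a, city_b, state_b, weekly_flow).
# Flows are aggregated over the round trip, so each pair appears only once.
# The numbers are invented for illustration.
RECORDS = [
    ("São Paulo", "SP", "Campinas", "SP", 900.0),
    ("São Paulo", "SP", "Santos", "SP", 650.0),
    ("Campinas", "SP", "Ribeirão Preto", "SP", 300.0),
    ("São Paulo", "SP", "Belo Horizonte", "MG", 420.0),
    ("Belo Horizonte", "MG", "Salvador", "BA", 150.0),
]


def build_network(records):
    """Return (adjacency, state_of) for an undirected weighted graph."""
    adj = defaultdict(dict)
    state_of = {}
    for a, state_a, b, state_b, flow in records:
        state_of[a], state_of[b] = state_a, state_b
        # Symmetric edge: the flow from A to B equals the flow from B to A.
        adj[a][b] = adj[a].get(b, 0.0) + flow
        adj[b][a] = adj[b].get(a, 0.0) + flow
    return {u: dict(nbrs) for u, nbrs in adj.items()}, state_of


def state_subnetwork(adj, state_of, state):
    """Keep only the nodes of one state and the edges between them."""
    keep = {u for u in adj if state_of[u] == state}
    return {u: {v: w for v, w in adj[u].items() if v in keep} for u in keep}


brazil, state_of = build_network(RECORDS)
sao_paulo = state_subnetwork(brazil, state_of, "SP")
print(len(brazil), len(sao_paulo))
```

As in the text, the state network simply drops every edge that crosses the state border, so the dependency on neighboring cities outside the state is not represented.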
Some cities are not present in our networks, due to a simplification that IBGE does: it groups small neighboring municipalities with almost no flow into single nodes. For simplicity, and considering that such places do not contain cases in the first days of the outbreak, they are not individually accounted for in our analysis.
We focus on two versions of each network for certain flow thresholds η: η0 (η = 0), which is the original network from the IBGE data, and ηd (η = d). The value d corresponds to the highest flow threshold that produces the network with the largest diameter. The motivation behind ηd is to get a threshold high enough to not consider the least frequent connections and to not disregard the most frequent ones 6,12.
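The choice of ηd can be sketched as a scan over candidate thresholds. The sketch below rests on several assumptions that are not stated in the text: an edge is kept when its flow is strictly greater than η, the candidate thresholds are the observed flow values plus zero, and the diameter of a disconnected network is taken as the longest finite shortest path. The toy `EDGES` and all function names are hypothetical.

```python
from collections import deque

# Toy flows (invented): a strong chain A-B-C-D-E plus two weak shortcuts.
EDGES = {
    ("A", "B"): 100, ("B", "C"): 90, ("C", "D"): 80, ("D", "E"): 70,
    ("A", "E"): 5, ("A", "C"): 10,
}


def threshold_graph(edges, eta):
    """Binary adjacency keeping only edges whose flow exceeds eta."""
    adj = {}
    for (u, v), flow in edges.items():
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if flow > eta:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def diameter(adj):
    """Longest finite shortest path (in hops), via BFS from every node."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


def find_eta_d(edges):
    """Highest candidate threshold whose network has the largest diameter."""
    candidates = sorted({0.0, *edges.values()})
    diameters = {eta: diameter(threshold_graph(edges, eta)) for eta in candidates}
    d_max = max(diameters.values())
    eta_d = max(eta for eta, d in diameters.items() if d == d_max)
    return eta_d, d_max


eta_d, d_max = find_eta_d(EDGES)
print(eta_d, d_max)
```

In the toy example, removing the two weak shortcuts stretches the network into a chain, which is where the diameter peaks; raising the threshold further starts to cut the chain itself and the diameter shrinks again.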
We also use COVID-19 data from the state daily bulletins and the Brazilian Ministry of Health 2, which are reported by place of residence and notification date. They show that, as of June 4th, 2020, the number of cities with at least one confirmed patient with COVID-19 is 3,851 in the Brazil without the North region network, which corresponds to 77% of the nodes. The corresponding figure for São Paulo State is 535 (86% of the nodes). With these data, we track the response of each measure in detecting vulnerable cities according to the evolution of the virus spreading process, as each city notified its first case.

Complex network measures
The topological degree (k) 8 of a node is the number of links it has to other nodes. As here the networks are undirected, there is no distinction between incoming and outgoing edges.
In a connected graph, there is at least one shortest path between any pair of nodes v and w. The betweenness centrality 8 (b) of a node i is the rate of those shortest paths that pass through i:

$$b_i = \sum_{v \neq i \neq w} \frac{\sigma_{vw}(i)}{\sigma_{vw}} \qquad (1)$$

where σvw is the number of shortest paths between v and w, and σvw(i) is the number of those paths that pass through i. Although it is a pointwise measure, it considers non-local information related to all shortest paths on the network. It is worth highlighting that in this context this centrality index is not a transportation (physical) measure but a mobility (process) one. Besides, both degree and betweenness do not account for the network flows here, but for the binary (weightless) networks. The diameter of a network is the distance between the farthest nodes, given by the maximum shortest path.
The strength (s) 8 of a node, on the other hand, is the accumulated flow from incident edges:

$$s_i = \sum_{j} F_{ij} \qquad (2)$$

in which Fij is the flow between nodes i and j.
Figure 1 presents illustrations of quite simple networks with the aforementioned measures calculated for each node. The bigger and redder the nodes, the higher the values associated with them. In Figure 1a, node F has the highest strength, meaning that it receives more flow than any other. Nodes B, D and E are the most connected, each with exactly four incident edges (Figure 1b). Lastly, node F of Figure 1c has the highest betweenness value, since all shortest paths ending at G pass through it. Nodes D and E share the same load of shortest paths and have intermediate values.
We assess which of the computed measures (s, k, and b) of the mobility networks better approximates the spreading of COVID-19 in Brazil. We compare the top-ranked n cities of each measure with the n cities that contain confirmed cases. We vary n from 1 to the number of cities with confirmed cases to follow the transmission dynamics. In order to verify whether the rate of correspondence between the top-ranked cities from the network measures and the cities with COVID-19 cases has statistical significance, we check the results of picking cities at random, instead of under the guidance of the measures, via a hypothesis test with simulated distributions 13. We perform 100,000 simulations for each n, choosing n nodes by chance and monitoring the rate of positive cases.
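The comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the city names, the ranking, and the order of first cases are synthetic, the random draw is assumed to be uniform without replacement, and the sketch uses 2,000 simulations instead of the 100,000 reported in the text to keep it fast. The function names `matching_rate` and `random_band` are hypothetical.

```python
import random


def matching_rate(ranked, case_order, n):
    """Fraction of the n top-ranked cities found among the first n cities with cases."""
    infected = set(case_order[:n])
    return len(infected.intersection(ranked[:n])) / n


def random_band(nodes, case_order, n, simulations=2000, seed=0):
    """Central 95% interval of the matching rate when n cities are picked at random."""
    rng = random.Random(seed)
    infected = set(case_order[:n])
    rates = sorted(
        len(infected.intersection(rng.sample(nodes, n))) / n
        for _ in range(simulations)
    )
    low = rates[int(0.025 * simulations)]
    high = rates[int(0.975 * simulations) - 1]
    return low, high


# Synthetic example: 100 cities already sorted by a network measure
# (c0 is the top-ranked one).
ranked = [f"c{i}" for i in range(100)]
# Synthetic order of first confirmed cases: neighbors in the ranking are swapped.
case_order = [ranked[i ^ 1] for i in range(100)]

rate = matching_rate(ranked, case_order, 10)
low, high = random_band(ranked, case_order, 10)
print(rate, (low, high), rate > high)
```

A measure is informative at a given n when its matching rate lies above the simulated band, which is the reading applied to the gray region of Figure 3.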

Geographical visualization
A geographical approach for complex systems analysis is especially important for mobility phenomena. Santos et al. 14 proposed a graph where the nodes have a known geographical location, and the edges have spatial dependence, the (geo)graph. It provides a simple tool to manage, represent, and analyze geographical complex networks in different domains 6,12 and it is used in this work. The geographical manipulation is performed with the PostgreSQL Database Management System (https://www.postgresql.org/) and its spatial extension PostGIS. Lastly, the maps are produced using the Geographical Information System ArcGIS (http://www.esri.com/software/arcgis/index.html).

Results
This section presents the results of the topological analysis for the previously mentioned networks. Two nodes are connected when there is a nonzero flow between them after thresholding, which means that the number of connections |E| decreases as the threshold η increases. The resulting networks are undirected and, throughout the paper, both the degree and the betweenness measures do not account for the flows, but for weightless edges instead. The diameter of the networks for varying η is computed and the highest threshold with maximum diameter is found for both networks: ηd = 507.55 for Brazil without North region and ηd = 169.9 for São Paulo State.
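To make the distinction between weighted and weightless measures concrete, the following sketch computes the three quantities on a toy network. The graph and its flows are invented; the betweenness routine is a plain implementation of Brandes' algorithm on the binary graph, left unnormalized, which is an assumption since the text does not state a normalization.

```python
from collections import deque

# Toy undirected network with invented flows: FLOW[u][v] == FLOW[v][u].
FLOW = {
    "A": {"B": 10.0},
    "B": {"A": 10.0, "C": 20.0, "D": 5.0},
    "C": {"B": 20.0},
    "D": {"B": 5.0},
}


def degree(adj):
    """Number of neighbors of each node (flows ignored)."""
    return {u: len(nbrs) for u, nbrs in adj.items()}


def strength(adj):
    """Accumulated flow over the edges incident to each node."""
    return {u: sum(nbrs.values()) for u, nbrs in adj.items()}


def betweenness(adj):
    """Unnormalized betweenness on the binary graph (Brandes' algorithm)."""
    b = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0.0)  # number of shortest paths from s
        sigma[s] = 1.0
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:  # accumulate dependencies from the farthest nodes back to s
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                b[w] += delta[w]
    # Each unordered pair (v, w) was visited from both endpoints.
    return {v: value / 2.0 for v, value in b.items()}


k, s, b = degree(FLOW), strength(FLOW), betweenness(FLOW)
print(k["B"], s["B"], b["B"])
```

Only the strength changes if the flows are rescaled; the degree and the betweenness depend solely on which edges survive the threshold.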
Following the (geo)graphs approach, it is possible to visualize nodes and edges of the Brazilian mobility network in the geographical space for ηd in Figure 2a. The edges for η0 are not plotted, because there are more than 59,000 and the visualization is not clear.
Figure 2c shows the map of the topological degree related to each node/city, considering all original flows (η0), and in Figure 2d there is the equivalent for ηd = 207.55. Key cities are labeled in the maps.
Figures 2e and 2f present the strength for the São Paulo State network, with η0 and ηd, respectively. Some cities with high strength also appear in a report 15 of most vulnerable cities to COVID-19 due to their intense traffic of people, namely São Paulo, Campinas, São José do Rio Preto, São José dos Campos, Ribeirão Preto, Santos, Sorocaba, Jaboticabal, Bragança Paulista, Presidente Prudente, Bauru, and many others. Currently, they all have a significant number of confirmed cases.
Figure 3 presents the correspondence of the first n cities with COVID-19 documented cases with both the simulated data and the top-ranked nodes under s, k, and b. The n varies from 1 to 3,851 in Figure 3a and from 1 to 535 in Figure 3b. The gray region represents 95% of the occurrences of rates in the simulations for each n.
According to Figure 3, on June 4th, about 95% of the simulations have matching rates within 0.77 ± 0.01 for the Brazil without North region network, and the same share is within 0.86 ± 0.03 for the São Paulo State. The results for node selection during the first days via the network indexes all lie above the gray region, which means that all indexes are a better heuristic than picking nodes at random in the beginning. However, immediately after May 5th, b with ηd starts to touch the region in São Paulo, thus yielding results comparable to the simulations. It also has the worst results for Brazil without North region, after a transient.
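The centers of these intervals can be checked by hand. Assuming that the random selection is uniform and without replacement (an assumption about the simulation, not a statement from the text), the number of matches among n randomly chosen nodes follows a hypergeometric distribution with N nodes and K cities with cases, so the expected matching rate is K/N regardless of n:

```latex
\mathbb{E}[p] = \frac{1}{n}\cdot\frac{nK}{N} = \frac{K}{N},
\qquad
\frac{3851}{4987} \approx 0.77 \ \text{(Brazil without North region)},
\qquad
\frac{535}{620} \approx 0.86 \ \text{(São Paulo State)}.
```

These values agree with the share of nodes with confirmed cases reported in the Methods (77% and 86%).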
High-frequency oscillations are perceived in Figure 3 for small n, but they stabilize afterwards and follow a tendency. In Figure 3a they are pronounced until March 24th (n = 150). The matching rate p is at its maximum in the beginning, because the first documented case was in the city of São Paulo, which is the first ranked city in all measures. The curve then decreases until reaching a region where the oscillations take place. Furthermore, the network quantifiers show good correspondence already in the beginning of the spreading process, as the gray regions are not touched until n approaches the number of cities with at least one confirmed patient with COVID-19.

Figure 3
Correspondence (rate) between the n top ranked cities for different network criteria: s, k, and b, and cities that have at least one patient with COVID-19 in Brazil without the North region and São Paulo State.
Notes: the gray region represents the correspondences with randomly selected cities. The inset is a zoomed area of the last days until June 4th. In both panels, the s and k with ηd are overlapping in the end.
Interestingly, on March 31st, the high-frequency oscillations start to diminish in São Paulo State. A few days later, after April 7th, the betweenness centrality with ηd starts to be a bad predictor for Brazil without North region and then for São Paulo State. Table 1 enumerates the first twenty ordered cities according to the best-evaluated measures and compares them side-by-side with the first twenty cities with COVID-19 cases in the Brazil without North region network. The best measures for São Paulo State are compared with each other in Table 2 as well. Although in this case s with η0 presents good correspondences, we present the ones with ηd due to its importance until the end of April. In both networks, although p produces high-frequency oscillations in the beginning, as shown in Figure 3, the metrics still show some correspondence with the first confirmed cases.

Discussion
We present a complex network-based analysis in the Brazilian inter-cities mobility networks towards the identification of cities that are vulnerable to the SARS-CoV-2 spreading. The networks are built with the IBGE terrestrial mobility data from 2016 that have the flow of people between cities in a general/typical week. The cities are modeled as nodes and the flows as weighted edges. The geographical graphs, (geo)graphs, are visualized within Geographical Information Systems.
Table 1
Cities with at least one case of COVID-19 in Brazil (Brazil without the North region) in the order they were documented 8, side-by-side with the top-ranked cities regarding s, k and b for η0 and ηd.
Notes: the best combination is s with η0 (second column). Matching cities are colored alike.
Two scales are investigated, the Brazilian cities without the North region, and the State of São Paulo. The former does not account for the North due to the high number of fluvial routes and some intrinsic local characteristics that are not represented with the terrestrial data. The State of São Paulo is crucial in the ongoing pandemic since the first documented case was in the state capital, and it is currently one of the main focuses of the virus spreading.
Three network measures are studied, namely the strength, degree, and betweenness centrality, under two flow thresholds to account for different mobility intensities: the original flow data and networks with only the edges with higher weights. Other network measures were preliminarily tested, including the weighted version of the betweenness centrality. However, the integrals of the correspondence curves of Figure 3 related to the strength, degree, and betweenness are higher than theirs. Besides, the three chosen measures capture the network properties from two different viewpoints: a local (strength and degree) and a global (betweenness) perspective. We verified that the strength has the best matching to the cities with COVID-19 confirmed cases. Moreover, the strength measure with the original flows proved to be the best option for Brazil. In contrast, a more restrictive threshold leads to better correspondences at the beginning of the pandemic in São Paulo. Probably due to the interiorization of the spreading process, a transition is observed after a certain point, around the first days of May 2020, when the original flows have better results, as the connections to smaller cities are only present when they are accounted for.
Regarding Table 1, the three measures capture some cities that do not appear in the first column, namely Fortaleza (Ceará State), Salvador (Bahia State), Campinas (São Paulo State), Ribeirão Preto (São Paulo State) and Belo Horizonte (Minas Gerais State). They soon had patients with COVID-19, though. Interestingly, the city of Feira de Santana (Bahia State) appears in all columns: it is the second-largest city of the state and connects the capital to the countryside of Bahia 16. Conversely, the city of João Pessoa, capital of Paraíba State, does not appear in the top 20 of the second column (best measure; see Figure 3), but two other cities from the state do, namely Campina Grande and Patos. Campina Grande and Patos are among the five richest cities of Paraíba 17. It should be noted that within the context of an epidemic, such cities are potential super spreaders along with the states' capitals. Five cities of Pernambuco State appear in the second column (best measure; see Figure 3), namely Caruaru, Carpina, Limoeiro, Paudalho, and Recife. Pernambuco is currently ranked as the second state in the number of confirmed cases of the Northeast region 2.

Table 2
Cities with at least one case of COVID-19 in the State of São Paulo in the order they were documented 8, side-by-side with the top-ranked cities regarding s, k and b for η0 and ηd.

Table 2, as in Table 1, also displays cities that are captured by the three rightmost columns that do not appear in the first, showing their high level of vulnerability: Ribeirão Preto, Jundiaí, Sorocaba, Piracicaba, and Presidente Prudente. They all have documented cases before June 4th. Our study also captured the most influential cities that had cases already in the beginning, like São Paulo, Campinas, São José do Rio Preto, São José dos Campos and Taubaté.
Other cities appear in the second column (best metric) but not in the first: Praia Grande, São Vicente, São Carlos, Registro, Sertãozinho.
Due to their importance in mobility, many cities of Table 2, especially in the second column, appear in the report 15 on the vulnerability of microregions of São Paulo State to the SARS-CoV-2 pandemic of April 5th either as potential spreaders or places with a high probability of receiving new cases. They all have notified cases by June 4th and some have the highest numbers of the São Paulo State 18 .
Both s and b with η0 show good results at the beginning of the pandemic for the Brazil without North region network, but s alone started to be the best predictor from the end of April. The most important cities, due to their high flow of travelers and their role in the most used routes, are reached first, followed by those with smaller flows, probably because of the interiorization of the virus, that is, the outbreak reaching the countryside cities. This behavior is even more pronounced in São Paulo State, in which s under ηd is the best option at first, neglecting lower flow venues, especially in April, but η0 started to be the best option from May onwards.
In the ongoing pandemic, from May 1st, the s index with η0 is the best predictor and may help to figure out which countryside cities are about to receive new cases. Moreover, it may help in the following waves of the disease. In the case of another pandemic, one could first compute the strength of the networks according to the last updated data from IBGE and identify the top-ranked cities. In Brazil, it is enough to check the strength on the original data, as we presented, since it produces results similar to those of the betweenness centrality and is computationally cheaper to obtain. Regarding the State of São Paulo, it is better to check the strength index with threshold ηd in the first weeks and only then switch to η0. As our results show, the correspondence has statistical significance and, along with other information about the regions, such as where the first notified cases are, the pandemic could be closely traced.
It is worth mentioning that the COVID-19 data come from the Brazilian Health Surveillance System, which is fed with data provided by each city in the country. The information update is a complex and dynamic process and there may be delays or errors in the data transfer. Moreover, considering the size and heterogeneity of Brazil, it is important to highlight that there are differences in the capacity to detect cases in a timely manner and in the quality of the information 19. On the other hand, in late January 2020, almost a month before the first Brazilian COVID-19 case, the epidemiological surveillance guidelines and the National Contingency Plan for COVID-19 were published. One of the main objectives of these documents was to provide early guidance to the Brazilian Unified National Health System (SUS) service network, to act in the identification of COVID-19 cases 20. Besides the limitations in the health surveillance system, there is a lack of information about possible intermediate stops between origins and destinations in the IBGE data, as they give only the travelers' initial and final positions.
As future work, we intend to analyze fluvial and aerial mobility data as well, as they include valuable information about the transport of people and goods. The former is fundamental to the discussion of the dynamics for the Brazilian North region, especially the Amazon, and the latter captures long-range connections which are relevant in a possible future moment of re-emergence of the disease in the country, especially by foreign travelers. Lastly, one could check for correspondences between the measures of networks and data from other epidemics, and analyze control measures based on topological properties associated with the mobility network 21 .