Sampling design for the Birth in Brazil: National Survey into Labor and Birth

METHODOLOGICAL ISSUES
Vasconcellos MTL et al. Cad. Saúde Pública, Rio de Janeiro, 30 Sup:S1-S10, 2014. http://dx.doi.org/10.1590/0102-311X00176013

This paper describes the sample design for the National Survey into Labor and Birth in Brazil. Hospitals with 500 or more live births in 2007 were stratified by the five Brazilian macro-regions, location (state capital or not) and type of governance, and were then selected with probability proportional to the number of live births in 2007. An inverse sampling method was used to select as many days (minimum of 7) as necessary to reach 90 interviews in each hospital. Postnatal women were sampled with equal probability from the set of eligible women who had entered the hospital on the sampled days. Initial sample weights were computed as the reciprocals of the sample inclusion probabilities and were calibrated so that the estimated totals of live births from the survey matched the known figures obtained from the Brazilian Information System on Live Births (SINASC). For the two telephone follow-up waves (6 and 12 months later), each postnatal woman's response probability was modelled using baseline covariate information in order to adjust the sample weights for nonresponse in each follow-up wave.

Sampling Studies; Stratified Sampling; Statistical Models; Parturition

Introduction
According to do Carmo Leal et al. 1, the objectives of the National Survey into Labor and Birth were: (1) to describe the incidence of excessive caesarean section (according to Robson's groups) and examine the consequences on women's and newborns' health; (2) to investigate the relationship between excessive caesarean section and late preterm birth and low birth weight; and (3) to investigate the relationship between excessive caesarean section and the use of technological procedures after birth.
This article describes the sample design used in the survey, including the definition of the survey population, the stratification of primary sampling units, the criteria for selection of hospitals, days and postnatal women, and the calculation and calibration of the base sample weights. It also describes the strategy used to estimate the response probabilities of respondents in the two additional telephone follow-up waves, six and 12 months after the interview in the hospital, in order to calculate the sampling weights for the respondents in each follow-up wave.

Survey population, first stage sampling frame and stratification
The survey population 2 corresponds to the set of postnatal women who gave birth in 2011 in hospitals with 500 or more live births in 2007, according to the Information System on Live Births (SINASC; http://portal.saude.gov.br/portal/saude/visualizar_texto.cfm?idtxt=21379). The SINASC was created by the Brazilian Department of Health in 1990 to gather epidemiological information on live births in hospitals and households all over the country.
For operational reasons, a number of groups were excluded from the survey population: postnatal women with severe mental health disorders, homeless women, foreigners who did not understand Portuguese, deaf or mute women, and women sectioned by court order. Given the survey population definition, only hospitals with 500 or more live births in 2007 were included in the first stage sampling frame. In the end, 1,403 of the 3,961 hospitals registered in 2007 were eligible for the study, accounting for 2,228,534 (77.1%) of the 2,891,328 live births that year.
The hospitals in the first stage sampling frame were stratified by the combination of macro-region, location (state capital or not) and type of hospital governance (public, private or mixed), defining the strata presented in Table 1. This ensured that the different types of governance were represented both in the state capitals and in the other cities of each of the five macro-regions, since capitals and other cities differ markedly in the size and kinds of health services. Mixed governance refers to private hospitals that had beds contracted by the public sector.

Sample size and its allocation by stratum
According to do Carmo Leal et al. 1, the sample size in each stratum was calculated based on the caesarean section rate in Brazil in 2007 of 46.6%, to detect differences of 14% between public, mixed and private hospitals with a significance level of 5% and power of 95%. The minimum sample per stratum was 341 postnatal women. Since the sample was clustered by hospital, a design effect of approximately 1.3 was used to inflate the initial sample sizes, leading to a minimum sample size of 450 postnatal women per stratum.
Although not usual in sample surveys, this way of determining the sample size is common in clinical trials and randomized experiments. It derives from a two-tailed test of the hypothesis of equality between the proportions within treatment and control groups 3. For this calculation, expression 3.14 from Fleiss 4 was used.
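The figure of 341 women can be approximately retraced with a short calculation. The sketch below is not taken from the paper: it assumes the usual two-proportion formula with a continuity correction, and it assumes that the 14% difference is added to the baseline rate (46.6% against 60.6%). Whether this is exactly expression 3.14 of Fleiss 4 is an assumption; the function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_two_proportions(p1, p2, alpha=0.05, power=0.95):
    """Per-group sample size for a two-tailed test of p1 == p2.

    Assumed formula: normal approximation plus continuity correction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    d = abs(p1 - p2)
    # Uncorrected sample size per group.
    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / d ** 2
    # Continuity correction.
    n_cc = n / 4 * (1 + sqrt(1 + 4 / (n * d))) ** 2
    return ceil(n_cc)


n_min = n_two_proportions(0.466, 0.606)   # assumed: 46.6% versus 46.6% + 14%
n_clustered = ceil(n_min * 1.3)           # design effect of about 1.3
print(n_min, n_clustered)
```

Under these assumptions the first value is 341, and multiplying by the design effect gives a figure slightly below the 450 women per stratum adopted in the survey, which is consistent with the design effect being described as approximate.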
According to do Carmo Leal et al. 1, the sample size has a power of 80% to detect adverse outcomes in the order of 3%, and differences of at least 1.5% among large geographic regions or types of hospital governance (public/private/mixed).
Considering the minimum size of 450 postnatal women per stratum, it was decided to select at least five hospitals per stratum, leading to a sample size of 90 postnatal women per hospital. If an equal allocation among the strata were used, these parameters would lead to a sample size of 210 hospitals. However, an allocation proportional to the number of hospitals was used, which led to a sample size of 266 hospitals, since in every stratum with an allocated sample size smaller than five hospitals the sample size was increased to five, in order to ensure a minimum of five hospitals and 450 postnatal women, as indicated in Table 1.
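The allocation rule can be illustrated with a minimal sketch. The stratum counts below are invented for illustration and do not come from Table 1; the rounding rule and the cap at the number of hospitals available in the stratum are assumptions, and the function name is hypothetical.

```python
def allocate_hospitals(frame_counts, total, minimum=5):
    """Proportional allocation of hospitals to strata with a floor.

    frame_counts: dict mapping stratum -> number of hospitals in the frame.
    total: number of hospitals to allocate before applying the floor.
    """
    n_frame = sum(frame_counts.values())
    allocation = {}
    for stratum, n_hospitals in frame_counts.items():
        proportional = round(total * n_hospitals / n_frame)
        # Raise small allocations to the floor (assumed rule), but never
        # allocate more hospitals than the stratum contains.
        allocation[stratum] = min(n_hospitals, max(minimum, proportional))
    return allocation


# Hypothetical frame: three strata with 400, 60 and 4 hospitals.
example = allocate_hospitals({"A": 400, "B": 60, "C": 4}, total=40)
print(example)
```

Because the floor only ever increases the allocation of small strata, the final number of hospitals exceeds the starting total, which is the mechanism by which the survey moved from 210 to 266 hospitals.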

Hospital selection
In the first stage, the hospitals were selected with probability proportional to size (PPS), with size defined as the number of live births in the hospital according to SINASC 2007. As usual in PPS selection, the hospitals with large numbers of live births (more than 13 per day on average, in this case) were included in the sample with certainty and treated as selection strata for sampling days and postnatal women. In strata with five or fewer hospitals, a take-all procedure was used and each hospital was also treated as a selection stratum for the subsequent sampling stages.
The hospital selection was done systematically 5, after sorting the hospitals in each stratum in ascending order by number of live births in 2007. The sample inclusion probabilities of hospitals are provided in expressions (1a) and (1b) of Figure 1.
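Systematic PPS selection within one stratum can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually run for the survey: certainty units are identified here as those whose PPS probability would reach 1, whereas the survey used a threshold of more than 13 live births per day, and the sizes in the usage example are invented.

```python
import random


def pps_systematic(sizes, n, rng=random):
    """Systematic PPS sample of n units; returns (sample, inclusion_probs).

    sizes: size measure of each unit (e.g. live births); list indices
    identify the units.
    """
    prob = [0.0] * len(sizes)
    # Sort ascending by size, as described for each stratum.
    remaining = sorted(range(len(sizes)), key=lambda i: sizes[i])
    sample, n_left = [], n
    # Units whose PPS probability would reach 1 are taken with certainty
    # (assumed rule, applied repeatedly until no such unit is left).
    while n_left > 0:
        total = sum(sizes[i] for i in remaining)
        big = [i for i in remaining if n_left * sizes[i] >= total]
        if not big:
            break
        for i in big:
            prob[i] = 1.0
        sample.extend(big)
        remaining = [i for i in remaining if i not in big]
        n_left -= len(big)
    if n_left > 0:
        total = sum(sizes[i] for i in remaining)
        step = total / n_left
        point = rng.random() * step          # random start in [0, step)
        cumulative = 0
        for i in remaining:
            prob[i] = n_left * sizes[i] / total
            cumulative += sizes[i]
            # Select the unit whose cumulated size interval contains the point.
            while point < cumulative and len(sample) < n:
                sample.append(i)
                point += step
    return sorted(sample), prob


# Hypothetical stratum with seven hospitals, three to be selected.
births = [500, 600, 700, 800, 900, 1000, 6000]
chosen, probs = pps_systematic(births, 3, random.Random(2011))
print(chosen, [round(p, 3) for p in probs])
```

In this example the largest hospital enters with certainty and the other two are drawn with probability proportional to their live births, so the inclusion probabilities add up to the sample size of three.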

Selection of survey days
In the second stage of sampling, an inverse sampling method 2,6 was used to select as many days as necessary to reach 90 postnatal women interviewed in the hospital. This method, originally proposed by Haldane 6 to estimate frequencies and proportions, samples as many units (in this case, days) as needed to obtain a prespecified number of successes (in this case, 90 interviews performed with postnatal women in the hospital).
It is called inverse sampling because, rather than fixing in advance a number of days expected to yield 90 interviews, as done by Veloso et al. 7, it uses the number of interviews performed as the stopping rule for the consecutive sample of survey days. The first survey day in each hospital was always selected with equal probability during the year, as indicated by expression (2) of Figure 1. The -1 in the numerator and in the denominator of expression (2) is explained by the loss of one degree of freedom due to the stopping rule, as defined by Haldane 6.
To account for the difference in the number of live births between weekends and working days, a minimum of seven consecutive days was mandatory, and the size of the field team was set to ensure this rule.

Figure 1
Sample probability scheme.
Denoting the selection stratum by h, the hospital by i, the survey day by j and the postnatal woman by k, the inclusion probability of any postnatal woman is equal to the product of the inclusion probability of hospital i, represented by $P(H_{hi})$; the conditional probability of inclusion of survey day j given the selection of hospital i, represented by $P(D_{hij} \mid H_{hi})$; and the conditional probability of inclusion of postnatal woman k in day j and hospital i, represented by $P(W_{hijk} \mid D_{hij})$. These probabilities are expressed as follows:

$$P(H_{hi}) = 1, \quad \text{if } n_h = N_h \tag{1a}$$

$$P(H_{hi}) = n_h \, \frac{T_{hi}}{T_h}, \quad \text{otherwise} \tag{1b}$$

$$P(D_{hij} \mid H_{hi}) = \frac{n_{hi} - 1}{365 - 1} \tag{2}$$

$$P(W_{hijk} \mid D_{hij}) = \frac{n_{hij}}{N_{hij}} \tag{3}$$

where $N_h$ is the number of hospitals with 500 or more live births in 2007 in selection stratum h, as indicated in Table 1; $n_h$ is the sample size of hospitals in selection stratum h; $T_{hi}$ and $T_h$ are the size measures (live births) of the hospital and of the selection stratum; $n_{hi}$ is the number of survey days in the hospital; and $n_{hij}$ and $N_{hij}$ are the numbers of women interviewed and of live births on survey day j.
Thus, the inclusion probability of any postnatal woman, represented by $P(W_{hijk})$, is given by expression (4):

$$P(W_{hijk}) = P(H_{hi}) \times P(D_{hij} \mid H_{hi}) \times P(W_{hijk} \mid D_{hij}) \tag{4}$$

The base sample weight to be used in postnatal woman estimation, represented by $w_{hijk}$, corresponds to the reciprocal of the probability given in (4), as shown in expression (5):

$$w_{hijk} = \frac{1}{P(W_{hijk})} \tag{5}$$

The calibrated sample weights, represented by $w^{c}_{hijk}$, are given by:

$$w^{c}_{hijk} = w_{hijk} \times \frac{TB_h}{EB_h} \tag{6}$$

where $TB_h$ is the total number of live births in selection stratum h, observed in SINASC 2011; and $EB_h$ is the estimated number of live births in selection stratum h, using the base sample weights.
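The stopping rule for survey days described in this section can be sketched as follows. The daily interview counts and the function name are hypothetical; the sketch assumes that fieldwork stops as soon as the 90th interview is completed and that enough days are available after the random start.

```python
import random


def survey_days(daily_interviews, first_day, target=90):
    """Consecutive survey days needed to reach `target` interviews.

    daily_interviews: interviews achievable on each calendar day (hypothetical
    input); first_day: index of the randomly selected starting day.
    """
    days, total = [], 0
    for day in range(first_day, len(daily_interviews)):
        # On the last day only the interviews still missing are taken.
        total += min(daily_interviews[day], target - total)
        days.append(day)
        if total == target:
            return days
    raise ValueError("target not reached within the available days")


start = random.randrange(300)                 # equal-probability first day
busy = survey_days([12] * 365, start)         # twelve interviews per day
quiet = survey_days([4] * 365, start)         # four interviews per day
print(len(busy), len(quiet))
```

With at most twelve interviews per day, 90 interviews cannot be reached in fewer than eight days, so in this sketch the minimum of seven consecutive days is met without a separate check.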

Selection of postnatal women
The number of postnatal women to be selected per day and hospital depended on the number of live births and on the numbers of interview shifts and interviewers per day in the hospital. To establish the number of shifts and interviewers, the mean number of live births per day per hospital in 2007 was used and four combinations were defined: (1) one interviewer and one shift for four interviews; (2) one interviewer and two shifts for six interviews; (3) two interviewers and one shift for eight interviews; and (4) two interviewers and two shifts for twelve interviews.
To ensure a random selection of postnatal women, the survey central office prepared tables giving the order numbers of the women to be interviewed according to the number of live births (up to 40) and the number of interviews per day and hospital (4, 6, 8 and 12). The order number of each postnatal woman was defined by her order of entrance in the hospital. Some additional order numbers were selected for the replacement of non-responses.
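A selection table of this kind could be generated along the following lines. This is a minimal sketch, not the tables actually used: the number of reserve order numbers (two) and the function name are assumptions.

```python
import random


def order_numbers(n_births, n_interviews, n_reserve=2, rng=random):
    """Equal-probability draw of admission order numbers 1..n_births.

    Returns (selected, reserve): the order numbers to interview and the
    additional ones kept for replacement of non-responses (assumed size).
    """
    k = min(n_births, n_interviews)
    drawn = rng.sample(range(1, n_births + 1), min(n_births, k + n_reserve))
    return sorted(drawn[:k]), drawn[k:]


# One table row for every combination of live births (1 to 40) and
# interviews per day (4, 6, 8 or 12).
rng = random.Random(7)
table = {(b, m): order_numbers(b, m, rng=rng)
         for b in range(1, 41) for m in (4, 6, 8, 12)}
print(table[(40, 12)])
```

Drawing without replacement gives every eligible woman admitted on the day the same chance of selection, which is the property the prepared tables were meant to guarantee.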
Unfortunately, the number of live births per hospital and survey day was not recorded during the field work. To overcome this problem, the SINASC 2011 and 2012 files were processed to determine the number of live births in each hospital and survey day, as required to calculate the inclusion probabilities described in expression (3) of Figure 1.

Treatment of non-responses
Nine sampled hospitals refused to take part in the survey, and three had closed their maternity service before the start of the fieldwork. The established replacement procedure for hospital non-response consisted of replacing the non-responding hospital with the next hospital in the stratum, according to the sort order of hospitals in the first stage sampling frame. Despite this, it was not possible to replace two non-responding private hospitals located in non-capital cities in the Northeast region, as indicated in Table 1.
Postnatal women's non-response was treated, if possible, by replacement according to the selection tables prepared for each hospital or by the inverse sampling procedure used in survey day selection (more days were added to the sample until 90 complete interviews were achieved per hospital). If the maternity service closed during the field work, the inverse sampling procedure was interrupted and restarted as soon as the maternity service reopened.
A total of 1,356 (5.7%) of the postnatal women selected were replaced, 15% due to early hospital discharge and 85% due to refusal to participate. The sample comprised 23,940 postnatal women interviewed in 266 hospitals. During processing, records with no data from the woman or with no newborn medical records were excluded, and the final sample size was 23,894 postnatal women (Table 1).

Sample weighting and calibration of sample weights
As indicated in Figure 1, the base sample weights were calculated as the reciprocals of the product of the inclusion probabilities in each sampling stage.
As usual in official statistical surveys (according to Silva 8), calibration of the base sample weights was performed to enforce coherence between sample estimates and known population totals obtained from an external source. In addition, up to a point, calibration helps to compensate for potential sampling and nonresponse biases.
Since the field work was conducted in 2011 (and at the beginning of 2012 for a few hospitals), it seemed appropriate to keep the coherence between sample-based estimates and the total number of live births obtained from the SINASC 2011 for the hospitals in the sampling frame, i.e. those with 500 or more live births in 2007.
For this reason, a ratio-type calibration of the base sample weights was performed within each of the selection strata, as indicated in expression (6) of Figure 1.
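The weighting steps can be sketched in a few lines. The sketch rests on assumptions that should be checked against Figure 1: the day-selection probability is taken as (number of survey days - 1)/(365 - 1), following the remark on expression (2); the estimated number of live births is taken as the weighted sum of the live births reported for each woman; and all function names and numbers are hypothetical.

```python
def base_weight(p_hospital, n_days, n_interviewed, n_births_day, days_in_year=365):
    """Reciprocal of the product of the three stage inclusion probabilities."""
    p_day = (n_days - 1) / (days_in_year - 1)   # assumed form of expression (2)
    p_woman = n_interviewed / n_births_day      # women interviewed / live births
    return 1.0 / (p_hospital * p_day * p_woman)


def calibrate(weights, births, known_total):
    """Ratio calibration within one selection stratum.

    births: live births of each interviewed woman (assumption: 1 for a
    singleton delivery, 2 for twins, and so on).
    """
    estimated = sum(w * b for w, b in zip(weights, births))
    factor = known_total / estimated
    return [w * factor for w in weights]


# Hypothetical woman: hospital probability 0.5, nine survey days,
# eight interviews out of sixteen live births on her survey day.
w = base_weight(0.5, 9, 8, 16)
calibrated = calibrate([100.0, 200.0], [1, 2], known_total=1000)
print(w, calibrated)
```

After calibration the weighted number of live births in the stratum reproduces the known total exactly, which is the coherence property sought with the SINASC 2011 figures.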
Results comparing population data with estimates obtained using both the base and the calibrated sample weights are presented in Table 2. These results show the coherence between estimates based on calibrated weights and the known population totals, as expected. Also as expected, calibration leads to a slight increase in the variation of the sample weights, as shown in Table 3. This increase in sample weight variation is the price of ensuring coherent estimates.

Sample weights for the two telephone follow-up waves
As expected, it was not possible to contact all postnatal women interviewed in the baseline survey during the two telephone follow-up waves. Several approaches could be used to correct for nonresponse: (1) probabilistic imputation of non-respondents' data; (2) treating the responding sample as a subsample of the baseline sample; or (3) modelling the probability of response in each follow-up wave as a function of covariates obtained in the baseline survey and using it to derive nonresponse weight adjustments for responding women in each follow-up wave.
According to the information on responses achieved in each follow-up wave provided in Table 3, 67.4% and 49.9% of the women interviewed in the baseline survey responded in the first and second follow-up waves, respectively. Due to the high nonresponse rates, the first two options were not considered suitable alternatives for nonresponse compensation.
Thus, the solution adopted was to model the response probabilities using the covariate information available from the baseline survey. The procedure used was proposed by Little 9, and is also described in Lepkowski 10 and Brick & Montaquila 11. The general idea behind the procedure used to obtain the sample weights in each telephone follow-up wave can be described in four steps, as presented in Figure 2.
In the first step, a model was fitted to explain the probability of responding to each follow-up wave for each postnatal woman in the baseline sample, using the baseline covariate information as well as the follow-up wave response indicator. This procedure was applied independently for each follow-up wave.
In the second step, the predicted values of the response probabilities in each follow-up wave were estimated using the model fitted in step one.
In the third step, for each follow-up wave the quintiles of the predicted response probabilities were used to define five weight adjustment classes, in each of which a response rate was estimated as the ratio of the sum of the respondents' baseline calibrated sample weights to the sum of the baseline calibrated sample weights of all postnatal women in the class, as indicated by expression (9) of Figure 2.
In the last step, the reciprocals of the response rates estimated by follow-up wave and weight adjustment class were used to adjust the baseline calibrated sample weights of the postnatal women interviewed in each follow-up wave.
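Steps three and four can be sketched as follows, taking the predicted response probabilities from steps one and two as given. The sketch is illustrative only: the function name and the toy data are hypothetical, and the quintile cut points are computed with the default method of Python's `statistics.quantiles`, which may differ slightly from the method used by the authors.

```python
from statistics import quantiles


def followup_weights(pred_prob, base_weights, responded):
    """Adjust baseline calibrated weights for nonresponse in one wave.

    pred_prob: predicted response probabilities from the fitted model;
    base_weights: baseline calibrated weights; responded: True/False flags.
    Returns the wave-specific weights (0.0 for non-respondents).
    """
    cuts = quantiles(pred_prob, n=5)            # four quintile cut points
    classes = [sum(p > c for c in cuts) for p in pred_prob]   # class 0..4
    total = [0.0] * 5
    resp = [0.0] * 5
    for q, w, r in zip(classes, base_weights, responded):
        total[q] += w
        if r:
            resp[q] += w
    # Weighted response rate in each adjustment class.
    rate = [resp[q] / total[q] if total[q] > 0 else 0.0 for q in range(5)]
    return [w / rate[q] if r else 0.0
            for q, w, r in zip(classes, base_weights, responded)]


# Toy data: ten women, equal baseline weights, every second one responding.
pred = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
base = [10.0] * 10
flags = [False, True] * 5
adjusted = followup_weights(pred, base, flags)
print(adjusted)
```

Within each class the adjusted weights of the respondents add up to the baseline weights of all women in the class, so the respondents also represent the women who could not be reached.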
For the models of response probability, the set of potential predictor variables initially considered included: macro-region; located in a capital city or not; type of hospital governance; postnatal woman's socioeconomic class (A+B, C, or D+E); delivery payment (public, private health insurance, or directly out of pocket); postnatal woman's age class (12-19 years, 20-34 years, or 35 years or more); "Have you got any work where you get paid?" (yes or no); "Were you satisfied with your pregnancy at its beginning?" (yes or no); "Still birth or neonatal death of child?" (yes or no); race or skin color (white, black, brown, yellow, or indigenous); "Were there obstetric complications during gestation leading to negative perinatal outcomes?" (yes or no); and, for the second follow-up wave only, whether the woman had responded to the first follow-up wave (yes or no).
For the first follow-up wave, the significant predictor variables were the three variables that defined the sample strata (macro-region, capital or not, and type of hospital governance), the postnatal woman's socioeconomic class and the postnatal woman's age class.
For the second follow-up wave, the significant variables were the same five variables listed above plus "Have you got any work where you get paid?", "Were you satisfied with your pregnancy at its beginning?" and "Still birth or neonatal death of child?".
Figure 2
Modelling response probabilities to calculate adjustments to the weights for the two follow-up waves.
Denote by $P_{hijk}$ the response probability of any postnatal woman in a follow-up wave. The logistic model in expression (7) was fitted using the baseline calibrated sample weights:

$$\ln\left(\frac{P_{hijk}}{1 - P_{hijk}}\right) = \mathbf{x}_{hijk}'\,\boldsymbol{\beta} \tag{7}$$

where $\mathbf{x}_{hijk}$ is the vector of values of the relevant baseline predictor variables for postnatal woman k in survey day j in hospital i from selection stratum h; and $\boldsymbol{\beta}$ is the vector of model parameters. The predicted value of the response probability is provided in expression (8):

$$\hat{P}_{hijk} = \frac{e^{\mathbf{x}_{hijk}'\hat{\boldsymbol{\beta}}}}{1 + e^{\mathbf{x}_{hijk}'\hat{\boldsymbol{\beta}}}} \tag{8}$$

Since using the reciprocal of the predicted response probability as the nonresponse adjustment weight in the follow-up wave might lead to a large variation in the final follow-up wave weights, a more robust strategy was applied. First, the four quintiles of the predicted response probabilities were used to create five weight adjustment classes. Within each weight adjustment class q, a response rate was estimated as the ratio of the total of the respondents' baseline calibrated sample weights to the total of the baseline calibrated sample weights of the class, as indicated in expression (9):

$$\hat{r}_q = \frac{\sum_{hijk \in q} w^{c}_{hijk}\, I(M_{hijk} = 1)}{\sum_{hijk \in q} w^{c}_{hijk}} \tag{9}$$

where $w^{c}_{hijk}$ is the calibrated sample weight of postnatal woman k in survey day j in hospital i from selection stratum h; and $I(M_{hijk} = 1)$ is the dummy variable that indicates whether postnatal woman k in survey day j in hospital i from selection stratum h responded to the telephone interview in the follow-up wave. Finally, the reciprocal of this estimated response rate for the weight adjustment class is used as a multiplicative adjustment factor for the baseline calibrated sample weight of each responding woman in the class, leading to a final wave-specific weight represented by $w^{q}_{hijk}$, as indicated in expression (10):

$$w^{q}_{hijk} = \frac{w^{c}_{hijk}}{\hat{r}_q} \tag{10}$$

These steps were performed independently for each follow-up wave, always using the baseline information available.

In the correction of the follow-up sample weights (third step), the predicted response probabilities were not used directly to adjust the baseline calibrated sample weights in each follow-up wave, in order to avoid undesirable variation in the final weights. In fact, Kish 12 demonstrates that sample weights may reduce bias but often increase the variance of weighted estimators, since the ratio between the variance of the weighted estimator and the variance of the corresponding unweighted estimator is equal to 1 plus the square of the coefficient of variation of the sample weights. Thus, the approach in the third and fourth steps gives a better correction of the follow-up sample weights for nonresponse, while keeping the increase in weight variation to a minimum (Table 3).
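The variance ratio attributed to Kish 12 is easy to compute for any set of weights. The following sketch uses hypothetical weights and a hypothetical function name, and takes the coefficient of variation with the population (divide by n) variance.

```python
from statistics import mean, pvariance


def weight_variance_inflation(weights):
    """1 plus the squared coefficient of variation of the weights."""
    m = mean(weights)
    return 1 + pvariance(weights, m) / m ** 2


print(weight_variance_inflation([1, 1, 1, 1]))   # equal weights: no inflation
print(weight_variance_inflation([1, 3]))         # unequal weights
```

Equal weights give a ratio of exactly 1, and any spread in the weights raises it, which is why grouping the predicted probabilities into five classes was preferred to using each woman's own predicted probability.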

Notes to Figure 1: $T_{hi}$ is the size measure associated with hospital i from selection stratum h, defined as the number of live births in hospital i from selection stratum h; $T_h$ is the sum of the sizes of all hospitals from selection stratum h, i.e., $T_h = \sum_i T_{hi}$; $n_{hi}$ is the sample size of survey days in hospital i from selection stratum h, which was predefined as at least seven days per hospital; $n_{hij}$ is the effective sample size of postnatal women in survey day j in hospital i from selection stratum h; and $N_{hij}$ is the total number of live births in survey day j in hospital i from selection stratum h, observed in SINASC 2011 and 2012.

Table 1
Number of live births and hospitals in survey population and sample size, according to strata.

Table 2
Number of live births in the survey population and estimated number of live births obtained by base and calibrated weights, according to macro-regions and type of hospital governance.

Table 3
Summary statistics of base and calibrated sample weight distributions.

M. T. L. Vasconcellos and P. L. N. Silva prepared the sample weighting procedures and the first version of the manuscript, which was modified and approved by all authors. A. P. E. Pereira and A. O. C. Schilithz selected the sample, calculated and calibrated the sample weights, and approved the manuscript. P. R. B. Souza Junior and C. L. Szwarcwald designed the sample and approved the manuscript.