## Cadernos de Saúde Pública

*Print version* ISSN 0102-311X

### Cad. Saúde Pública vol.30 supl.1 Rio de Janeiro 2014

#### http://dx.doi.org/10.1590/0102-311X00176013

METHODOLOGICAL ISSUES

Sampling design for the Birth in Brazil: National Survey into Labor and Birth

^{1}Escola Nacional de Ciências Estatísticas,
Instituto Brasileiro de Geografia e Estatística, Rio de Janeiro,
Brasil.

^{2}Escola Nacional de Saúde Pública Sergio Arouca,
Fundação Oswaldo Cruz, Rio de Janeiro, Brasil.

^{3}Instituto de Comunicação e Informação
Científica e Tecnológica em Saúde, Fundação Oswaldo Cruz, Rio de Janeiro,
Brasil.

This paper describes the sample design for the National Survey into Labor and Birth in Brazil. The hospitals with 500 or more live births in 2007 were stratified by the five Brazilian macro-regions, location (state capital or not) and type of governance. They were then selected with probability proportional to the number of live births in 2007. An inverse sampling method was used to select as many days (minimum of 7) as necessary to reach 90 interviews in the hospital. Postnatal women were sampled with equal probability from the set of eligible women who had entered the hospital on the sampled days. Initial sample weights were computed as the reciprocals of the sample inclusion probabilities and were calibrated to ensure that total estimates of the number of live births from the survey matched the known figures obtained from the Brazilian System of Information on Live Births. For the two telephone follow-up waves (6 and 12 months later), the postnatal woman’s response probability was modelled using baseline covariate information in order to adjust the sample weights for nonresponse in each follow-up wave.

**Key words:** Sampling Studies; Stratified Sampling; Statistical Models; Parturition

Introduction

According to do Carmo Leal et al. ^{1} the objectives of the National Survey into Labour and
Birth were: (1) to describe the incidence of excessive caesarean section
(according to Robson’s groups) and examine the consequences on women’s and
new-borns’ health; (2) to investigate the relationship between excessive
caesarean section and late preterm birth and low birth weight; and (3) to
investigate the relationship between excessive caesarean section and the use of
technological procedures after birth.

This article describes the sample design used in the survey including the definition of the survey population, the stratification of primary sampling units, the criteria for selection of hospitals, days and postnatal women, the base sample weights calculation and their calibration. It also describes the strategy used for estimating the response probabilities of respondents in the two additional telephone follow-up waves six and 12 months after the interview in the hospital, in order to calculate the sampling weights for the respondents in each follow-up wave.

Survey population, first stage sampling frame and stratification

The survey population ^{2} corresponds to the set of postnatal women who gave birth in 2011 in hospitals with 500 or more live births in 2007, according to the Information System on Live Births (SINASC; http://portal.saude.gov.br/portal/saude/visualizar_texto.cfm?idtxt=21379). The SINASC was created by the Brazilian Ministry of Health in 1990 to gather epidemiological information on live births in hospitals and households all over the country.

For operational reasons, a number of groups were excluded from the survey population, including postnatal women with severe mental health disorders, those who were homeless or were foreigners who did not understand Portuguese, deaf/mutes, and women sectioned by court order. Given the survey population definition, only hospitals with 500 live births or more in 2007 were included in the first stage sampling frame. In the end, 1,403 of the 3,961 hospitals registered in 2007 were eligible for the study, accounting for 2,228,534 (77.1%) of the 2,891,328 live births that year.

State capitals and other cities differ considerably in the size and type of their health services. To ensure that the different types of hospital governance (public, private and mixed) were represented in all five macro-regions of the country, separately for state capitals and other cities, the hospitals in the first stage sampling frame were stratified by the combination of macro-region, location (capital or not) and type of hospital governance, defining the strata presented in Table 1. Private hospitals that had beds contracted by the public sector were classified as mixed governance.

Table 1

| Macro-region and type of hospital governance | Total: live births in 2007 | Total: hospitals in 2007 | Total: hospital sample size | Total: effective sample size of women | State capitals: live births in 2007 | State capitals: hospitals in 2007 | State capitals: hospital sample size | State capitals: effective sample size of women | Non-capitals: live births in 2007 | Non-capitals: hospitals in 2007 | Non-capitals: hospital sample size | Non-capitals: effective sample size of women |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **Total** | 2,228,534 | 1,403 | 266 | 23,894 | 802,543 | 308 | 84 | 7,551 | 1,425,991 | 1,095 | 182 | 16,343 |
| Public | 932,617 | 531 | 95 | 8,537 | 412,069 | 137 | 30 | 2,699 | 520,548 | 394 | 65 | 5,838 |
| Mixed | 966,190 | 649 | 115 | 10,330 | 186,580 | 61 | 24 | 2,157 | 779,610 | 588 | 91 | 8,173 |
| Private | 329,727 | 223 | 56 | 5,027 | 203,894 | 110 | 30 | 2,695 | 125,833 | 113 | 26 | 2,332 |
| **North** | | | | | | | | | | | | |
| Public | 136,987 | 91 | 17 | 1,531 | 57,320 | 14 | 5 | 448 | 79,667 | 77 | 12 | 1,083 |
| Mixed | 74,641 | 47 | 10 | 899 | 31,366 | 12 | 5 | 450 | 43,275 | 35 | 5 | 449 |
| Private | 10,721 | 9 | 5 | 450 | 10,721 | 9 | 5 | 450 | 0 | 0 | 0 | 0 |
| **Northeast** | | | | | | | | | | | | |
| Public | 341,638 | 211 | 31 | 2,779 | 141,079 | 44 | 6 | 538 | 200,559 | 167 | 25 | 2,241 |
| Mixed | 273,815 | 160 | 28 | 2,516 | 51,892 | 17 | 5 | 450 | 221,923 | 143 | 23 | 2,066 |
| Private * | 46,213 | 31 | 9 | 801 | 42,502 | 26 | 6 | 539 | 3,711 | 5 | 3 | 262 |
| **Southeast** | | | | | | | | | | | | |
| Public | 313,853 | 155 | 26 | 2,341 | 141,235 | 53 | 8 | 722 | 172,618 | 102 | 18 | 1,619 |
| Mixed | 402,730 | 273 | 42 | 3,776 | 61,976 | 14 | 5 | 452 | 340,754 | 259 | 37 | 3,324 |
| Private | 213,047 | 136 | 21 | 1,888 | 113,219 | 51 | 8 | 718 | 99,828 | 85 | 13 | 1,170 |
| **South** | | | | | | | | | | | | |
| Public | 74,770 | 36 | 11 | 991 | 31,126 | 10 | 6 | 541 | 43,644 | 26 | 5 | 450 |
| Mixed | 156,559 | 130 | 24 | 2,159 | 15,384 | 4 | 4 | 360 | 141,175 | 126 | 20 | 1,799 |
| Private | 40,141 | 31 | 11 | 989 | 22,947 | 13 | 6 | 539 | 17,194 | 18 | 5 | 450 |
| **Central** | | | | | | | | | | | | |
| Public | 65,369 | 38 | 10 | 895 | 41,309 | 16 | 5 | 450 | 24,060 | 22 | 5 | 445 |
| Mixed | 58,445 | 39 | 11 | 980 | 25,962 | 14 | 5 | 445 | 32,483 | 25 | 6 | 535 |
| Private | 19,605 | 16 | 10 | 899 | 14,505 | 11 | 5 | 449 | 5,100 | 5 | 5 | 450 |

^{*}Two private hospitals sampled in non-capital cities of the
Northeast region could not take part in the study and could not
be replaced.

Sample size and its allocation by stratum

According to do Carmo Leal et al. ^{1}, the sample size in each stratum was calculated based
on the caesarean section rate in Brazil in 2007 of 46.6%, with 5% significance
to detect differences of 14% between public, mixed and private hospitals and
power of 95%. The minimum sample per stratum was 341 postnatal women. Since the
sample was clustered by hospital, a design effect of approximately 1.3 was used
to inflate the initial sample sizes, leading to a minimum sample size of 450
postnatal women per stratum.

Although unusual in sample surveys, this approach to determining sample size is common in clinical trials and randomized experiments. It derives from a two-tailed test of the hypothesis of equality between the proportions within treatment and control groups ^{3}. For this calculation, expression 3.14 from Fleiss ^{4} was used.
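The calculation described above can be approximated with the textbook sample size formula for a two-tailed comparison of two proportions with equal group sizes. The sketch below is illustrative only: the text does not state the direction of the 14-point difference or whether a continuity correction was applied, so taking the alternative proportion as 46.6% + 14% and applying the continuity correction are assumptions, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_sample_size(p1, p2, alpha=0.05, power=0.95, continuity=True):
    """Per-group sample size for a two-tailed test of H0: p1 == p2 (equal group sizes)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    n = numerator ** 2 / (p1 - p2) ** 2
    if continuity:
        # Continuity correction for equal group sizes (assumed, not stated in the text).
        n = n / 4 * (1 + math.sqrt(1 + 4 / (n * abs(p1 - p2)))) ** 2
    return n


# Assumed reading of the text: baseline rate 46.6%, alternative 46.6% + 14 points.
n_simple = two_proportion_sample_size(0.466, 0.466 + 0.14)
n_clustered = n_simple * 1.3  # design effect quoted in the text
print(f"simple random sampling: {n_simple:.1f}; with design effect: {n_clustered:.1f}")
```

Under these assumptions the first value comes out close to the 341 women quoted in the text, and the inflated value lies a little below 450, the figure that corresponds to five hospitals of 90 women each.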

According to do Carmo Leal et al. ^{1}, the sample size has a power of 80% to detect adverse
outcomes in the order of 3%, and differences of at least 1.5% among large
geographic regions or type of hospital governance (public/private/mixed).

Considering the minimum size of 450 postnatal women per stratum, it was decided to select at least five hospitals per stratum, leading to a sample size of 90 postnatal women per hospital. If an equal allocation among the strata were used, these parameters would lead to a sample size of 210 hospitals. However, the sample was allocated proportionally to the number of hospitals in each stratum, and in all strata with an allocated sample size smaller than five hospitals the sample size was increased to five, in order to ensure a minimum of five hospitals and 450 postnatal women. This led to a sample size of 266 hospitals, as indicated in Table 1.

Hospital selection

In the first stage, the hospitals were selected with probability proportional to size (PPS), with size defined as the number of live births in the hospital according to SINASC 2007. As usual in PPS selection, the hospitals with large numbers of live births (more than 13 per day on average, in this case) were included with certainty in the sample and treated as selection strata for sampling days and postnatal women. In strata with five or fewer hospitals, a take-all procedure was used and each hospital was also treated as a selection stratum for the subsequent sampling stages.

The hospital selection was done systematically ^{5}, after sorting the hospitals in each stratum in
ascending order by number of live births in 2007. The sample inclusion
probabilities of hospitals are provided in expressions (1a) and (1b) of Figure 1.
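The selection described above can be illustrated with a generic systematic PPS routine. This is a minimal sketch, not the survey's own procedure: the certainty rule used here (a unit is taken with certainty when its PPS probability would reach 1) is the textbook rule rather than the 13-births-per-day threshold quoted in the text, the inclusion probability `m * size / total` is the standard PPS form and not necessarily expressions (1a) and (1b), and the function name and toy sizes are hypothetical.

```python
import random


def pps_systematic(sizes, n, rng):
    """Systematic PPS selection of n units; returns (selected indices, inclusion probabilities)."""
    units = sorted(range(len(sizes)), key=lambda i: sizes[i])  # ascending size order
    certainty = []
    # Units whose PPS probability would reach 1 are taken with certainty.
    while units and len(certainty) < n:
        m = n - len(certainty)
        total = sum(sizes[i] for i in units)
        big = [i for i in units if m * sizes[i] >= total]
        if not big:
            break
        certainty.extend(big)
        units = [i for i in units if i not in big]

    probs = {i: 1.0 for i in certainty}
    selected = list(certainty)
    m = n - len(certainty)
    if m > 0 and units:
        total = sum(sizes[i] for i in units)
        step = total / m
        point = rng.random() * step  # random start in [0, step)
        cumulative = 0.0
        for i in units:
            probs[i] = m * sizes[i] / total
            cumulative += sizes[i]
            if point < cumulative:  # every remaining unit is smaller than step
                selected.append(i)
                point += step
    return selected, probs


# Hypothetical stratum of ten hospitals, sizes = live births in the reference year.
births = [520, 610, 700, 850, 990, 1200, 1500, 2100, 4000, 9000]
selected, probs = pps_systematic(births, n=4, rng=random.Random(1))
print(sorted(selected), round(sum(probs.values()), 6))
```

In this toy stratum the largest hospital is taken with certainty and the other three are drawn systematically from the sorted list, so the inclusion probabilities add up to the sample size.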

Selection of survey days

In the second stage of sampling, an inverse sampling method ^{2,6} was used to select as many days as necessary to reach 90 postnatal women interviewed in the hospital. This method, originally proposed by Haldane ^{6} to estimate frequencies and proportions, can be defined as a technique to sample as many units (in this case, days) as needed to be observed in order to obtain a pre-specified number of successes or, in this case, 90 interviews performed with postnatal women in the hospital.

It is called inverse sampling because, rather than defining a fixed number of days sufficient to have an expected sample size of 90 interviews, as done by Veloso et al. ^{7}, it uses the number of interviews performed as the stopping rule for the consecutive sample of survey days. The first survey day in each hospital was always selected with equal probability during the year, as indicated by expression (2) of Figure 1. The -1 in the numerator and denominator of expression (2) is explained by the loss of one degree of freedom due to the stopping rule, as defined by Haldane ^{6}.

To account for differences in the number of live births between weekends and weekdays, a minimum of seven consecutive days was mandatory, and the size of the field team was determined to ensure this rule.
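The stopping rule described in this section can be sketched as a short routine. It is an illustration under stated assumptions, not the fieldwork protocol: the function name, the fixed daily interview capacity and the wrap-around at the end of the year are all hypothetical choices made to keep the example self-contained.

```python
import random


def inverse_sample_days(interviews_per_day, first_day, target=90, min_days=7):
    """Add consecutive days from first_day until `target` interviews and `min_days` days."""
    n_days = len(interviews_per_day)
    days, total = [], 0
    while (total < target or len(days) < min_days) and len(days) < n_days:
        day = (first_day + len(days)) % n_days  # wrap-around is an assumption
        # Never take more interviews than are still needed to reach the target.
        total += min(interviews_per_day[day], max(target - total, 0))
        days.append(day)
    return days, total


rng = random.Random(2011)    # arbitrary seed, for reproducibility only
capacity = [12] * 365        # hypothetical hospital: 12 interviews possible every day
first = rng.randrange(365)   # first survey day drawn with equal probability
days, total = inverse_sample_days(capacity, first)
print(first, len(days), total)
```

With 12 interviews per day the target of 90 is reached on the eighth day, and with 4 interviews per day it takes 23 days, which shows how the number of survey days, and not the number of interviews, is the random quantity in this design.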

Selection of postnatal women

The number of postnatal women to be selected per day and hospital depended on the number of live births and the numbers of interview shifts and interviewers per day in the hospital. To establish the number of shifts and interviewers, the mean number of live births per day per hospital in 2007 was used and four combinations were defined: (1) one interviewer and one shift for four interviews; (2) one interviewer and two shifts for six interviews; (3) two interviewers and one shift for eight interviews; and (4) two interviewers and two shifts for twelve interviews.

To ensure a random selection of postnatal women, the survey central office prepared tables with the order numbers of the women to be interviewed according to the number of live births (up to 40) and the number of interviews per day and hospital (4, 6, 8 and 12). The order number of each postnatal woman was defined by her order of entrance in the hospital. Some additional order numbers were selected for replacement of non-responses.
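A selection table of this kind can be sketched as follows. This is a hypothetical reconstruction, not the table used in the survey: the function name, the number of reserve order numbers (`n_reserve`) and the seed are assumptions, and the only properties taken from the text are the range of daily live births (up to 40), the daily interview quotas (4, 6, 8 and 12) and the equal-probability selection of order numbers.

```python
import random


def build_selection_table(max_births=40, quotas=(4, 6, 8, 12), n_reserve=2, seed=2011):
    """Map (live births in the day, daily quota) to (selected order numbers, reserves)."""
    rng = random.Random(seed)
    table = {}
    for quota in quotas:
        for births in range(1, max_births + 1):
            order = list(range(1, births + 1))  # order of entrance in the hospital
            rng.shuffle(order)                  # equal-probability permutation
            main = sorted(order[:quota])        # women to be interviewed
            reserve = order[quota:quota + n_reserve]  # replacements for non-response
            table[(births, quota)] = (main, reserve)
    return table


table = build_selection_table()
main, reserve = table[(40, 12)]
print(main, reserve)
```

Because every order number has the same chance of being among the first `quota` positions of the shuffled list, a woman's selection probability within a day is `min(quota, births) / births` in this sketch.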

Unfortunately, the number of live births per hospital and survey day was not recorded during the field work. To overcome this problem, the SINASC 2011 and 2012 files were processed to determine the number of live births in each hospital and survey day, as required to calculate the inclusion probabilities described in expression (3) of Figure 1.

Treatment of non-responses

Nine sampled hospitals refused to take part in the survey, and three had closed their maternity service before the start of the fieldwork. The established procedure for hospital non-response consisted of replacing the non-responding hospital with the next hospital in the stratum, according to the sort order of hospitals in the first stage sampling frame. Despite this, it was not possible to replace two non-responding private hospitals located in non-capital cities in the Northeast region, as indicated in Table 1.

Postnatal women’s non-response was treated, if possible, by replacement according to selection tables prepared for each hospital or by the inverse sampling procedure used in survey day selection (more days added to the sample until 90 complete interviews were achieved per hospital). In the case of closure of the maternity service during the field work, the inverse sampling procedure was interrupted, restarting as soon as the maternity service was open.

A total of 1,356 (5.7%) of the postnatal women selected were replaced, 15% due to early hospital discharge and 85% due to refusal to participate. The sample comprised 23,940 postnatal women interviewed in 266 hospitals. During processing, records with no data from the woman or no new-born medical records were excluded, giving a final sample size of 23,894 postnatal women (Table 1).

Sample weighting and calibration of sample weights

As indicated in Figure 1, the base sample weights were calculated as the reciprocals of the product of the inclusion probabilities in each sampling stage.

As usual in official statistical surveys (according to Silva ^{8}), calibration of the base
sample weights was performed to enforce coherence between sample estimates and
known population totals obtained from an external source. In addition, up to a
point, calibration helps to compensate for potential sampling and nonresponse
biases.

Since the field work was conducted in 2011 (and at the beginning of 2012 for a few hospitals), it seemed appropriate to keep the coherence between sample based estimates and the total number of live births as obtained from the SINASC 2011 for the hospitals in the sampling frame, i.e. those with 500 or more live births in 2007.

For this reason, a ratio type calibration procedure of the base sample weights was performed within each of the selection strata, as indicated in expression (6) of Figure 1.
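The weighting and calibration steps described in this section can be sketched as below. This is an illustration under assumptions, since the expressions of Figure 1 are not reproduced in the text: the function names, the stratum labels and the toy data are hypothetical, and the calibration variable is taken to be the number of live births per woman.

```python
from collections import defaultdict


def base_weight(p_hospital, p_day, p_woman):
    """Reciprocal of the product of the inclusion probabilities of the three stages."""
    return 1.0 / (p_hospital * p_day * p_woman)


def ratio_calibrate(weights, strata, live_births, known_totals):
    """Scale weights so that weighted live births match the known total in each stratum."""
    estimated = defaultdict(float)
    for w, s, y in zip(weights, strata, live_births):
        estimated[s] += w * y
    factors = {s: known_totals[s] / estimated[s] for s in estimated}
    calibrated = [w * factors[s] for w, s in zip(weights, strata)]
    return calibrated, factors


# Toy data (hypothetical): three women in two selection strata.
weights = [100.0, 200.0, 50.0]
strata = ["a", "a", "b"]
live_births = [1, 2, 1]                  # the second woman had twins
known_totals = {"a": 400.0, "b": 60.0}   # stand-in for the external totals
calibrated, factors = ratio_calibrate(weights, strata, live_births, known_totals)
print(calibrated, factors)
```

After calibration the weighted number of live births in each stratum reproduces the external total exactly, which is the property that makes the relative errors of the calibrated estimates equal to zero by construction.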

Results comparing population data with estimates obtained using both the base and calibrated sample weights are presented in Table 2. These results show the coherence between estimates based on calibrated weights and the known population totals, as expected. Also as expected, calibration leads to a slight increase in the variation of the sample weights, as shown in Table 3. This increase in sample weight variation is the price paid to ensure coherence of the estimates.

Table 2

| Macro-region and type of hospital governance | Population data from SINASC 2011 | Base sample weight: estimate | Base sample weight: relative error (%) * | Calibrated sample weight: estimate | Calibrated sample weight: relative error (%) * |
|---|---|---|---|---|---|
| **Total** | 2,337,476 | 2,697,463 | 15.4 | 2,337,476 | 0.0 |
| Public | 962,273 | 1,058,939 | 10.0 | 962,273 | 0.0 |
| Mixed | 1,036,634 | 1,170,514 | 12.9 | 1,036,634 | 0.0 |
| Private | 338,569 | 468,010 | 38.2 | 338,569 | 0.0 |
| **North** | | | | | |
| Public | 154,305 | 161,788 | 4.8 | 154,305 | 0.0 |
| Mixed | 57,571 | 83,284 | 44.7 | 57,571 | 0.0 |
| Private | 12,690 | 13,430 | 5.8 | 12,690 | 0.0 |
| **Northeast** | | | | | |
| Public | 334,541 | 376,493 | 12.5 | 334,541 | 0.0 |
| Mixed | 230,107 | 360,287 | 56.6 | 230,107 | 0.0 |
| Private | 110,702 | 67,497 | -39.0 | 110,702 | 0.0 |
| **Southeast** | | | | | |
| Public | 337,772 | 362,600 | 7.4 | 337,772 | 0.0 |
| Mixed | 501,644 | 458,582 | -8.6 | 501,644 | 0.0 |
| Private | 154,042 | 296,744 | 92.6 | 154,042 | 0.0 |
| **South** | | | | | |
| Public | 66,793 | 75,919 | 13.7 | 66,793 | 0.0 |
| Mixed | 182,224 | 197,981 | 8.6 | 182,224 | 0.0 |
| Private | 42,932 | 67,762 | 57.8 | 42,932 | 0.0 |
| **Central** | | | | | |
| Public | 68,862 | 82,139 | 19.3 | 68,862 | 0.0 |
| Mixed | 65,088 | 70,381 | 8.1 | 65,088 | 0.0 |
| Private | 18,203 | 22,577 | 24.0 | 18,203 | 0.0 |

^{*}Relative error (%) = (Estimate – population data) × 100 / population data.

Table 3

| Summary statistic | Base sample weight | Calibrated sample weight | 1st follow-up wave sample weight | 2nd follow-up wave sample weight |
|---|---|---|---|---|
| Number of observations | 23,894 | 23,894 | 16,109 | 11,925 |
| Minimum | 7.4 | 4.5 | 6.0 | 7.0 |
| First quartile (Q1) | 69.4 | 55.3 | 76.8 | 103.3 |
| Median | 96.1 | 78.6 | 119.0 | 162.6 |
| Third quartile (Q3) | 132.6 | 114.8 | 175.5 | 255.2 |
| Maximum | 3,499.9 | 4,194.9 | 3,870.4 | 7,395.8 |
| Range (maximum – minimum) | 3,492.5 | 4,190.4 | 3,864.4 | 7,388.8 |
| Interquartile range (Q3 – Q1) | 63.2 | 59.5 | 98.7 | 151.9 |
| Mode | 19.3 | 14.9 | 29.6 | 39.5 |
| Mean | 112.9 | 97.8 | 149.1 | 211.0 |
| Standard deviation | 97.6 | 97.0 | 151.5 | 222.4 |
| Coefficient of variation (%) | 86.4 | 99.2 | 101.6 | 105.4 |

Sample weights for the two telephone follow-up waves

As expected, it was not possible to contact all postnatal women interviewed in the baseline survey during the two telephone interview follow-up waves. Several options were available to correct for nonresponse: (1) probabilistic imputation of non-respondents’ data; (2) treating the responding sample as a subsample of the baseline sample; or (3) modelling the probability of response in each follow-up wave as a function of some covariates obtained in the baseline survey and using these to derive nonresponse weight adjustments for responding women in each follow-up wave.

Considering the information on responses achieved in each follow-up wave as provided in Table 3, note that 67.4% and 49.9% of the women interviewed in the baseline survey responded in the first and second follow-up waves respectively. Due to the high nonresponse rates, the first two options were not considered suitable alternatives for nonresponse compensation.

Thus the solution adopted was to model the response probabilities using the
covariate information available from the baseline survey. The procedure used was
proposed by Little ^{9}, and is
also described in Lepkowski ^{10}
and Brick & Montaquila ^{11}.

The general idea behind the procedure used to obtain the sample weights in each telephone interview follow-up wave can be described in four steps, as presented in Figure 2.

In the first step, a model was fitted to explain the probability of responding to each follow-up wave for each postnatal woman in the baseline sample using the baseline covariate information as well as the follow-up wave response indicator. This procedure was applied independently for each follow-up wave.

In the second step, the predicted values of the response probabilities in each follow-up wave were estimated using the model fitted in step one.

In the third step, for each follow-up wave the quintiles of the predicted response probabilities were used to define five weight adjustment classes in which a response rate was estimated by the ratio of the sum of respondents’ baseline calibrated sample weights to the total of baseline calibrated sample weights of postnatal women of the class, as indicated by expression (9) of Figure 2.

In the last step, the reciprocals of the response rates estimated by follow-up wave and weight adjustment class were used to adjust the baseline calibrated sample weights of the postnatal women interviewed in each follow-up wave.
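Steps three and four above can be sketched as follows, taking the predicted response probabilities from steps one and two as given. This is a minimal illustration rather than the survey's code: the function name and toy data are hypothetical, the response model itself is not fitted here, and the quintile cut points use the default method of Python's `statistics.quantiles`, which may differ from the rule used by the authors.

```python
from statistics import quantiles


def followup_weights(pred_prob, cal_weight, responded):
    """Adjust baseline calibrated weights for nonresponse within propensity quintile classes."""
    cuts = quantiles(pred_prob, n=5)  # four cut points define five classes

    def weight_class(p):
        return sum(p > c for c in cuts)  # class index from 0 to 4

    total_w = [0.0] * 5  # calibrated weights of all baseline women in the class
    resp_w = [0.0] * 5   # calibrated weights of respondents in the class
    for p, w, r in zip(pred_prob, cal_weight, responded):
        k = weight_class(p)
        total_w[k] += w
        if r:
            resp_w[k] += w

    adjusted = []
    for p, w, r in zip(pred_prob, cal_weight, responded):
        if not r:
            adjusted.append(None)  # nonrespondents carry no follow-up weight
            continue
        k = weight_class(p)
        rate = resp_w[k] / total_w[k]  # weighted response rate of the class
        adjusted.append(w / rate)      # reciprocal of the rate inflates the weight
    return adjusted


# Toy data (hypothetical): ten women with equal calibrated weights.
pred = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
cal = [10.0] * 10
resp = [True, False] * 5
adj = followup_weights(pred, cal, resp)
print(adj)
```

As long as every class contains at least one respondent, the adjusted weights of the respondents add up to the total of the baseline calibrated weights, so the follow-up sample still represents the whole baseline population.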

For the models of response probability, the set of potential predictor variables initially considered included: macro-region; located in capital city or not; type of hospital governance; postnatal woman’s socioeconomic class (A+B, C, or D+E); delivery payment (public, private health insurance, or directly out of pocket); postnatal woman’s age class (12-19 years, 20-34 years, and 35 years or more); “*Have you got any work where you get paid?*” (yes or no); “*Were you satisfied with your pregnancy at its beginning?*” (yes or no); “*Still birth or neonatal death of child?*” (yes or no); race or skin color (white, black, brown, yellow, or indigenous); “*Were there obstetric complications during gestation leading to negative perinatal outcomes?*” (yes or no); and, for the second follow-up wave only, whether the woman had responded to the first follow-up wave (yes or no).

For the first follow-up wave, the significant predictor variables were the three variables that defined sample strata (macro-region, capital or not and type of hospital governance), postnatal woman’s socioeconomic class and postnatal woman’s age class.

For the second follow-up wave, the significant variables were the same five variables listed above plus “*Have you got any work where you get paid?*”, “*Were you satisfied with your pregnancy at its beginning?*” and “*Still birth or neonatal death of child?*”.

In the third step, the predicted response probabilities were not used directly to adjust the baseline calibrated sample weights in each follow-up wave, in order to avoid undesirable variation in the final weights. In fact, Kish ^{12} demonstrates that sample weights may reduce bias but often increase the variance of weighted estimators, since the ratio between the variance of the weighted estimator and the variance of the corresponding unweighted estimator is equal to 1 plus the square of the coefficient of variation of the sample weights. Thus the class-based adjustment of the third and fourth steps corrects the follow-up sample weights for nonresponse while keeping the increase in weight variation to a minimum (Table 3).
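The variance ratio attributed to Kish above can be applied directly to the coefficients of variation reported in Table 3. The short sketch below does only that arithmetic; the function name is hypothetical, and the result should be read as a rough indication of the variance inflation due to unequal weights, not as a design effect estimated from the survey data.

```python
def kish_variance_ratio(cv_percent):
    """1 plus the squared coefficient of variation of the weights (CV given in %)."""
    cv = cv_percent / 100.0
    return 1.0 + cv ** 2


# Coefficients of variation (%) taken from Table 3.
cv_table3 = {
    "base": 86.4,
    "calibrated": 99.2,
    "1st follow-up": 101.6,
    "2nd follow-up": 105.4,
}
for label, cv in cv_table3.items():
    print(f"{label}: {kish_variance_ratio(cv):.2f}")
```

The four ratios are about 1.75, 1.98, 2.03 and 2.11, so the nonresponse adjustments add comparatively little weight variation beyond that already present in the calibrated baseline weights.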

Acknowledgments

To the regional and state coordinators, supervisors, interviewers and crew of the study and the mothers who participated and made this study possible.

REFERENCES

1. do Carmo Leal M, da Silva AA, Dias MA, da Gama SG, Rattner D, Moreira ME, et al. Birth in Brazil: national survey into labour and birth. Reprod Health 2012; 9:15.

2. Cochran WG. Sampling techniques. 3rd Ed. New York: John Wiley & Sons; 1977.

3. Altman DG. Practical statistics for medical research. London: Chapman and Hall; 1991.

4. Fleiss JL. Statistical methods for rates and proportions. 2nd Ed. New York: John Wiley & Sons; 1981.

5. Madow WG. On the theory of systematic sampling, II. Annals of Mathematical Statistics 1949; 20:333-54.

6. Haldane JBS. On a method of estimating frequencies. Biometrika 1945; 33:222-5.

7. Veloso VG, Portela MC, Vasconcellos MTL, Matzenbacher LA, Vasconcelos ALR, Grinsztejn B, et al. HIV testing among pregnant women in Brazil: rates and predictors. Rev Saúde Pública 2008; 42:859-67.

8. Silva PLN. Calibration estimation: when and why, how much and how. Rio de Janeiro: Instituto Brasileiro de Geografia e Estatística; 2004. (Textos para Discussão da Diretoria de Pesquisas, 14).

9. Little RJ. Survey nonresponse adjustments. International Statistical Review 1986; 54:139-57.

10. Lepkowski J. Non-observation error in household surveys in developing countries. In: Department of Economic and Social Affairs, Statistics Division, editor. Household surveys in developing and transition countries. New York: United Nations; 2005. p. 149-69. (Series F, 96).

11. Brick JM, Montaquila JM. Nonresponse and weighting. In: Pfeffermann D, Rao CR, editors. Handbook of statistics 29A. Sample surveys: design, methods and applications. Philadelphia: Elsevier; 2009. p. 163-85.

12. Kish L. Weighting for unequal Pi. Journal of Official Statistics 1992; 8:183-200.

Funding

National Council for Scientific and Technological Development (CNPq); Science and Technology Department, Secretariat of Science, Technology, and Strategic Inputs, Brazilian Ministry of Health; National School of Public Health, Oswaldo Cruz Foundation (INOVA Project); and Foundation for Supporting Research in the State of Rio de Janeiro (Faperj).

Received: October 09, 2013; Revised: February 26, 2014; Accepted: March 24, 2014