ABSTRACT
In Brazil the challenge of meeting the needs of those living in deprived areas has generated discussions on replacing the existing approach to epidemiological surveillance with an integrated public health surveillance system. This new approach would supplant the traditional focus on high-risk individuals with a method for identifying high-risk populations and the areas where these persons live. Given the magnitude of the problem that tuberculosis (TB) poses for Brazil, we chose that disease as an example of how such a new, integrated public health surveillance system could be constructed. We integrated data from several sources with geographic information to create an indicator of tuberculosis risk for Olinda, a city in the Brazilian state of Pernambuco. In order to stratify the urban space in Olinda and to check for an association between the resulting TB risk gradient and the mean incidence of the disease between 1991 and 1996, we applied two different methods: 1) a "social deprivation index" and 2) principal component analysis followed by cluster analysis. Our results showed an association between social deprivation and the occurrence of TB. The results also highlighted priority groups and areas requiring intervention. We recommend follow-up that would include treating acid-fast bacilli smear-positive pulmonary TB cases, tracing of these persons' contacts, and monitoring of multidrug-resistant cases, all in coordination with local health services.
In Brazil the traditional epidemiological surveillance approach has been "vertical" in nature and has focused on individuals at high risk and those persons' individual characteristics. This approach has brought about an accumulation of data in different information systems but with no easy communication between the systems, making it impossible to perform an epidemiological analysis based on geography. This situation has led to a discussion of the type, quantity, and quality of information obtained from the traditional morbidity and "operational" indicators typical of disease control programs.
Health service and epidemiological surveillance systems should help identify and plan interventions to control epidemics and endemics, which are intrinsically related to the areas where they occur. To do this, it will be necessary to transcend the conventional epidemiological focus on high-risk individuals and their personal characteristics, and instead to redirect public health policies towards defining "high-risk situations," where environmental and socioeconomic conditions increase the risk of the spread of disease (1). This shift in the focus of the public health surveillance system is all the more urgent given the increase in emerging and reemerging diseases that has occurred in the world over the last two decades (2-6).
Since tuberculosis (TB) is a major concern for Brazil, we chose that disease as an example of how such a new, integrated public health surveillance system could be constructed. In Brazil for the last 15 years the TB incidence rate has been high, remaining above 50 per 100 000 inhabitants (7). The disease caused some 100 000 deaths in the period from 1979 through 1995. More than 85% of the TB cases have been of the pulmonary form, and, among these pulmonary cases, more than 60% have been acid-fast bacilli smear-positive, constituting the group of greatest epidemiological importance.
Data from the Health Ministry of Brazil (7) indicate that in the state of Pernambuco, in northeastern Brazil, there is a more serious TB problem than in the country as a whole. There are also deficiencies in Pernambuco in the implementation of state and local public health services for TB control and patient treatment. While TB incidence in the nation overall has leveled off in recent years, the rate has climbed in Pernambuco (Figure 1). Mortality rates in Pernambuco are even more worrisome; the state has values that are nearly 1.5 times the average for the country as a whole.
In Pernambuco during the period of 1991-1996 there were 28 091 TB cases notified. For 18 555 of them (66%) information was missing about any previous treatment those persons might have received (7). Among the remaining 9 536 cases, 1 623 (17%) were patients who had previously undergone treatment, including 1 430 (88%) who had begun the new course of therapy without a sputum culture or drug susceptibility test. Additionally, HIV testing was not done for 27 333 (97%) of the 28 091 notified cases.
For our study we investigated the situation with TB in one location in Pernambuco, the city of Olinda. We correlated socioeconomic indicators from the demographic census with TB data from the national Disease Surveillance Warning System. Our objective was to identify sociodemographic risk factors statistically associated with TB in the city, within "defined populational bases" (populations living in areas with well-defined boundaries), making it possible to construct different levels of aggregation, including census tracts, neighborhoods, and sanitary districts (6, 8).
MATERIAL AND METHODS
Area and population under study
Olinda is a city located in the general metropolitan area of Recife, the state capital of Pernambuco (Figure 2). According to the demographic census of 1991, Olinda had 341 394 inhabitants as of 1 September 1991, and, according to the 1996 population count (9), 349 380 inhabitants on 1 August 1996, all living in urban areas. The demographic census takes place in Brazil every 10 years and collects demographic, socioeconomic, and urban-services information that is aggregated at the census tract level. The population count is done in the middle of the period between consecutive census counts and collects only information about the number of persons, by age and sex. The Brazilian Institute of Geography and Statistics is responsible for both the demographic census and the population count, which are carried out across the country.
Population growth in Olinda over the 1991-1996 period was approximately 0.5% per year, giving a projected population of a little more than 356 000 as of September 2000. Olinda has an area of 40.83 km², and its population density of 8 719 inhabitants per square kilometer makes it one of the most densely populated areas of the country (1).
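The growth and density figures above can be checked with a short calculation. The sketch below is illustrative only: it assumes exponential growth between the two counts, which the text does not state, and the function names are hypothetical.

```python
import math

def annual_growth_rate(pop_start, pop_end, years):
    """Continuous annual growth rate implied by two population counts."""
    return math.log(pop_end / pop_start) / years

def project(pop, rate, years):
    """Population after `years` of growth at a constant continuous rate."""
    return pop * math.exp(rate * years)

# Figures quoted in the text.
POP_1991 = 341_394      # census, 1 September 1991
POP_1996 = 349_380      # population count, 1 August 1996
AREA_KM2 = 40.83

years_between_counts = 4 + 11 / 12          # September 1991 to August 1996
rate = annual_growth_rate(POP_1991, POP_1996, years_between_counts)

years_to_2000 = 4 + 1 / 12                  # August 1996 to September 2000
pop_2000 = project(POP_1996, rate, years_to_2000)
density_2000 = pop_2000 / AREA_KM2

print(f"annual growth: {rate:.2%}")
print(f"projected population, September 2000: {pop_2000:,.0f}")
print(f"density: {density_2000:,.0f} inhabitants per km2")
```

Under this assumption the projection lands slightly above 356 000, and the density comes out close to the published value, which suggests the published density was based on the projected 2000 population rather than on the 1991 or 1996 counts.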
Sources of data
In our study we worked with data from three sources. One source was the Disease Surveillance Warning System. Administered by the Ministry of Health of Brazil, this national system provides information on all diseases of compulsory notification, including tuberculosis. The information concerns the identification, evolution, and treatment of cases, and it is generated by the local public health services and aggregated at the state and national levels (7).
We also obtained primary variables on socioeconomic and demographic characteristics for each census tract in Olinda, using the database from the 1991 demographic census (9). To estimate the population of Olinda for every year of the study period we used the data from the 1991 census and from the 1996 population count database (9).
For our third data source, mapping, we started with a digital map supplied by the Department of Cartographic Engineering at Pernambuco Federal University. This map provided the street plan for Olinda, to which we added further layers: 243 census tracts, 40 neighborhoods, and 2 sanitary districts (with 5 corresponding municipal health authority management areas) (Figure 2). The local government defines a sanitary district as an assembly of management areas, each of which comprises a group of neighborhoods formed by a set of census tracts. This organizational structure, adopted by the Municipal Health Authority, aims to decentralize the decision-making process; each sanitary district becomes responsible for the health services activities within its boundaries. We built up the census tracts layer using the 1991 census database of the Brazilian Institute of Geography and Statistics. That database provided a description of the limits of each of the 243 census tracts of Olinda. The neighborhoods were developed from analog maps supplied by the Olinda Municipal Planning Authority, and the sanitary districts were developed from analog maps supplied by the Olinda Municipal Health Authority. To build up these layers we used MaxiCAD 32 image acquisition (digitalization) software (MaxiDATA Technology & Informatics Ltd., Curitiba, Paraná, Brazil).
Data analysis and procedures
For our data analysis we first carried out a descriptive analysis of the occurrence of all forms of tuberculosis in Olinda, using the annual incidence rate as an indicator. We chose the period of 1991 through 1996 since census and population information was available for those two end points, from the 1991 demographic census and from the 1996 population count. In addition, 1996 is the latest year with complete information on new TB cases from the Disease Surveillance Warning System.
We considered the census tracts as the basic unit of analysis for our next step, the georeferencing of incident tuberculosis cases, that is, identifying the census tract where the residence of each case was located, based on the address recorded in the Disease Surveillance Warning System database.
One problem we found in using the census tracts as a unit of analysis was the lack of stability that the incidence rates showed when calculated for small areas, due to the small number of cases per year per census tract. We tentatively overcame this difficulty by calculating the mean annual TB incidence rate for the 1991-1996 period for each census tract (10-12). We next conducted geoprocessing of the primary variables from the 1991 census, using ArcView 3.0 software (Environmental Systems Research Institute, Inc., Redlands, California, United States of America). These variables, describing socioeconomic, demographic, and urban-services conditions at the census tract level, were used to construct a "social deprivation index" and to perform the principal component analysis that is described below.
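The stabilizing effect of averaging over the period can be illustrated with a small sketch. The tract data below are hypothetical, and the choice to average the six yearly rates (rather than divide total cases by total person-years) is an assumption, since the text does not specify which was used.

```python
def annual_rates(cases_by_year, population_by_year, per=100_000):
    """Yearly incidence rates (cases per `per` inhabitants)."""
    if len(cases_by_year) != len(population_by_year):
        raise ValueError("need one population estimate per year of cases")
    return [c / p * per for c, p in zip(cases_by_year, population_by_year)]

def mean_annual_incidence(cases_by_year, population_by_year, per=100_000):
    """Mean of the yearly incidence rates over the whole period."""
    rates = annual_rates(cases_by_year, population_by_year, per)
    return sum(rates) / len(rates)

# Hypothetical census tract, 1991-1996: few cases per year, so single-year
# rates swing widely while the period mean is much more stable.
cases = [1, 0, 3, 0, 2, 0]
population = [1400, 1410, 1420, 1430, 1440, 1450]

yearly = annual_rates(cases, population)
print("yearly rates:", [round(r, 1) for r in yearly])
print("period mean: ", round(mean_annual_incidence(cases, population), 1))
```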
Our next step concerned the spatial analysis of the distribution of the disease in relation to the census tracts, using a Poisson distribution to check the probability that this distribution had occurred by chance (13-15).
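A chi-square goodness-of-fit test against the Poisson distribution can be sketched as follows. This is a minimal illustration under stated assumptions: the tract counts are hypothetical, and the way the classes are pooled (`max_class`) is a choice made here for the example, not the grouping used in the study.

```python
import math
from collections import Counter

def poisson_pmf(k, lam):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_gof(counts, max_class):
    """Chi-square goodness of fit of per-area case counts to a Poisson law.

    Classes are 0, 1, ..., max_class - 1 cases, plus a pooled class of
    'max_class or more'. Returns (chi-square statistic, degrees of freedom).
    """
    n = len(counts)
    lam = sum(counts) / n                      # mean estimated from the data
    observed = Counter(min(c, max_class) for c in counts)
    probs = [poisson_pmf(k, lam) for k in range(max_class)]
    probs.append(1.0 - sum(probs))             # pooled upper tail
    chi2 = sum((observed.get(k, 0) - n * p) ** 2 / (n * p)
               for k, p in enumerate(probs))
    df = len(probs) - 2                        # minus 1, minus 1 estimated mean
    return chi2, df

# Hypothetical data: most tracts have no cases, a few have many (aggregation).
tract_cases = [0] * 150 + [1] * 30 + [2] * 10 + [12] * 10
chi2, df = poisson_gof(tract_cases, max_class=2)
print(f"chi-square = {chi2:.1f} with {df} degree(s) of freedom")
```

A statistic well above the critical value for the given degrees of freedom (6.63 at the 1% level with 1 degree of freedom) indicates that the counts are more clustered than a random Poisson process would produce.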
We then moved on to creating a collective risk status indicator, for which two distinct methodologies were applied. The two methods were: 1) a social deprivation index and 2) factor analysis to extract principal components, followed by cluster analysis.
Social deprivation index. The social deprivation index (SDI) was based on secondary variables that had been derived from primary variables in the 1991 census database (16). We intentionally built up these secondary variables, at the census tract level, for either their relationship with social deprivation or their importance in TB transmission. For example, using two primary variables, "average number of bedrooms per house" and "average number of people per house," we constructed a secondary variable of "average number of people per bedroom," which is strongly related to TB transmission.
There were six secondary variables: percent of family heads with less than 1 year of schooling, percent of persons 10 to 14 years old who were illiterate, percent of family heads with income below 1 minimum wage, percent of households without an indoor piped water supply, percent of households in subnormal dwellings (in shantytowns), and average number of inhabitants per bedroom.
Using methodology described elsewhere (1, 16), we then calculated the value of the social deprivation index for each of 241 census tracts (2 of the 243 census tracts do not have inhabitants and were thus not analyzed). The lowest values of SDI indicated the least social deprivation, and the highest SDI values corresponded to the greatest level of deprivation or risk.
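The exact SDI formula is given in the cited references and is not reproduced here. As a rough sketch of one common way to build an equal-weight composite index, the example below standardizes each indicator and averages the results. The standardization step, the function names, and the data are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize one indicator across tracts (assumes it is not constant)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def deprivation_index(indicators):
    """Equal-weight composite score per tract.

    `indicators` maps an indicator name to a list of tract values, all
    oriented so that a higher value means more deprivation.
    """
    standardized = [zscores(col) for col in indicators.values()]
    return [mean(row) for row in zip(*standardized)]

def quintiles(scores):
    """Quintile per tract by rank: 1 = least deprived, 5 = most deprived."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, i in enumerate(order):
        result[i] = rank * 5 // len(scores) + 1
    return result

# Hypothetical values for ten tracts and three of the six indicators.
indicators = {
    "pct_heads_under_1yr_school": [3, 8, 5, 20, 35, 12, 28, 40, 15, 50],
    "pct_no_piped_water":         [1, 5, 2, 18, 30, 10, 25, 45, 12, 60],
    "persons_per_bedroom":        [1.1, 1.4, 1.2, 1.9, 2.4, 1.6, 2.1, 2.7, 1.7, 3.0],
}
sdi = deprivation_index(indicators)
print(quintiles(sdi))
```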
We applied an analysis of variance model to check for an association between risk gradient (expressed in quintiles of SDI distribution) and TB incidence, using mean incidences over the 1991-1996 period for the set of census tracts that made up each of the quintiles.
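For readers unfamiliar with the test, a one-way analysis of variance reduces to comparing between-group and within-group variability. The sketch below computes the F statistic only (obtaining a P value additionally requires the F distribution); the incidence values are hypothetical.

```python
def one_way_anova(groups):
    """One-way ANOVA: returns (F, df_between, df_within).

    `groups` is a list of lists, one list of observations per stratum
    (for example, mean tract incidences within each SDI quintile).
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical mean incidences (per 100 000) for tracts in three strata.
strata = [[40, 55, 62], [70, 88, 95], [120, 135, 150]]
f_stat, df_b, df_w = one_way_anova(strata)
print(f"F = {f_stat:.2f} on {df_b} and {df_w} degrees of freedom")
```

With 241 census tracts divided into five strata, the same formula gives 4 and 236 degrees of freedom.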
Factor analysis. Our second methodology, factor analysis, was not based on a limited number of primary variables intentionally chosen from the census database, as had been true with the SDI methodology. Instead, the factor analysis used a set of primary variables representing the whole spectrum of socioeconomic, demographic, and urban-services characteristics from the 1991 census. In this analysis we used 16 secondary variables, including the same 6 that were included in the SDI construction, for the 241 residential census tracts. From this data matrix we determined the initial factors through principal component analysis (PCA), whose objective is to explain as much as possible of the total variation in a set of data through a small number of factors, known as the principal components (17).
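To make the idea concrete, the sketch below extracts the first principal component of a correlation matrix by power iteration, using only the standard library. It is an illustration under assumptions: the three variables and their values are hypothetical, all variables are oriented so that higher means more deprived, and a statistical package would normally be used instead.

```python
from statistics import mean, pstdev

def standardize(columns):
    """Z-score each variable (one list of tract values per variable)."""
    out = []
    for col in columns:
        m, s = mean(col), pstdev(col)
        out.append([(v - m) / s for v in col])
    return out

def correlation_matrix(z):
    """Correlation matrix of already standardized variables."""
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def first_component(columns, iterations=1000):
    """First principal component via power iteration.

    Returns (share of total variance explained, loadings, tract scores).
    Loadings are the correlations between each variable and the component.
    """
    z = standardize(columns)
    corr = correlation_matrix(z)
    p = len(corr)
    v = [1.0 + 0.1 * i for i in range(p)]      # arbitrary starting vector
    for _ in range(iterations):
        w = matvec(corr, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(corr, v)))
    share = eigenvalue / p                     # total variance equals p
    loadings = [x * eigenvalue ** 0.5 for x in v]
    scores = [sum(vi * zi[t] for vi, zi in zip(v, z))
              for t in range(len(z[0]))]
    return share, loadings, scores

# Hypothetical values for six tracts and three variables.
illiteracy = [2, 5, 9, 14, 20, 31]
no_piped_water = [1, 4, 12, 10, 25, 40]
persons_per_bedroom = [1.1, 1.3, 1.6, 1.5, 2.2, 2.6]

share, loadings, scores = first_component(
    [illiteracy, no_piped_water, persons_per_bedroom])
print(f"variance explained by the first component: {share:.0%}")
print("loadings:", [round(x, 2) for x in loadings])
```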
Once we had used PCA to identify the principal components, we took the first principal component (PC1) as the marker of the social deprivation "dimension," given the chosen variables. We then calculated "factor scores" that allowed us to order the census tracts according to the studied characteristic (17-19). As a means of establishing cutoff points for the stratification of the census tracts, we employed the k-means statistical clustering method, taking k = 5 to identify five clusters of census tracts (20, 21). This was in keeping with the logic of ultimately being able to plan local health service interventions.
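The clustering step can be sketched for one-dimensional factor scores as follows. This is a minimal version of the standard k-means (Lloyd) procedure with a deterministic starting point chosen for the example; the scores are hypothetical, and the starting rule is an assumption rather than the one used in the study.

```python
def assign(values, centers):
    """Index of the nearest center for each value."""
    return [min(range(len(centers)), key=lambda c: abs(v - centers[c]))
            for v in values]

def kmeans_1d(values, k, iterations=100):
    """k-means on one-dimensional scores.

    Returns (labels, centers), with clusters numbered 0..k-1 in ascending
    order of their centers.
    """
    ordered = sorted(values)
    # Deterministic start: centers placed at evenly spaced quantiles.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)]
               for i in range(k)]
    for _ in range(iterations):
        labels = assign(values, centers)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # Keep the old center if a cluster happens to be empty.
            updated.append(sum(members) / len(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    centers = sorted(centers)
    return assign(values, centers), centers

# Hypothetical factor scores for fifteen census tracts, five strata.
factor_scores = [-1.9, -1.7, -1.6, -0.9, -0.8, -0.7, 0.0, 0.1, 0.2,
                 0.9, 1.0, 1.1, 1.8, 2.0, 2.3]
labels, centers = kmeans_1d(factor_scores, k=5)
print(labels)
```

Unlike quintiles, the clusters produced this way need not contain equal numbers of tracts; the cutoff points follow natural gaps in the scores.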
To assess the association between the risk gradient resulting from this stratification and the incidence of TB, we used an analysis of variance model, taking the mean incidences over the 1991-1996 period for the set of census tracts that make up each one of the clusters identified by cluster analysis.
In our final stage we analyzed the distribution of the census tracts according to their classifications by SDI (five quintiles) and by PCA (five clusters). We used the kappa test to assess the level of agreement between the results of the two classification approaches (21).
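The kappa statistic compares the observed proportion of identically classified tracts with the proportion expected by chance from the marginal distributions. A minimal sketch, using hypothetical classifications:

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two classifications of the same units."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both classifications must cover the same units")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement from the product of the marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical strata (1-5) assigned to ten tracts by the two methods.
by_sdi = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
by_pca = [1, 1, 2, 3, 3, 3, 4, 5, 5, 5]
print(round(cohen_kappa(by_sdi, by_pca), 3))
```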
RESULTS
In the period of 1991-1996 the Disease Surveillance Warning System recorded 2 102 incident cases of tuberculosis among residents of Olinda, resulting in an annual mean incidence rate of 101.5 cases per 100 000 inhabitants, approximately two times the average for Brazil as a whole. The annual incidence in Olinda ranged from a minimum of 79.3 per 100 000 to a maximum of 136.6 per 100 000.
Of the 2 102 incident cases of tuberculosis observed over the period, for 1 723 of them (82%) there was adequate information to georeference them by census tract; that georeferencing percentage exceeded 80% in every one of the 6 years of the period. Almost all the cases that lacked sufficient data to be georeferenced carried incomplete addresses, suggesting residence in a shantytown or a new settlement, rather than in a more-established area of Olinda. These 379 cases with insufficient data were not included in our study.
Analyzing the mean 1991-1996 incidences for each census tract, we found 11 tracts that averaged three or more TB cases per year. Although these 11 census tracts together corresponded to little more than 5% of the population of Olinda, they accounted for more than 15% of all the cases, with a mean incidence rate 2.3 times the average for Olinda. Testing these observed values against those expected according to the Poisson distribution, the chi-square test for goodness of fit rejected the hypothesis of no difference between observed and expected values (chi-square (χ²) = 852.066, with 1 degree of freedom, P < 0.01). The results of this test suggest that the distribution of disease at the census-tract level is aggregated and that this distribution is not random. These results also underline the need for additional ways of stratifying this population and expressing different collective levels of risk of falling ill from TB, which we try to do in the sections that follow.
Social deprivation index
Table 1 shows the incidence of TB cases during the 1991-1996 period, for the five social deprivation index (SDI) quintiles. The table shows that the incidence of tuberculosis rises along with the risk measured by SDI value for the first four quintiles. The analysis of variance for the mean incidence rates for the quintiles showed that these means differ (P < 0.001). Employing the Duncan test to identify homogeneous groups (22), we found a significant difference between the incidence means of quintiles 1 and 2 and the incidence means of quintiles 3, 4, and 5.
Figure 3 is a map of tuberculosis risk in Olinda, classified according to SDI distribution quintiles and indicating relative priority for the five health management areas.
Principal component analysis
The principal component analysis followed on from the matrix of secondary variables. Table 2 shows the mean values and standard deviations for those variables. The table also shows the factor loadings obtained in the PCA, that is, the correlation coefficients for each of these variables and the first principal component (PC1). These results show high percentages of households without satisfactory sanitary installations and without regular garbage collection, as well as noticeable percentages of illiteracy among persons aged 10-14 years old and of extremely limited schooling among heads of family. Along with standard deviations near or above the mean for most of the variables, these figures indicate the existence of both extreme disparities in the distribution of these variables and great social inequalities.
The first principal component was capable of explaining around 43% of the total variation of the data set analyzed. Eight of the variables were strongly correlated with the "dimension" of social deprivation as indicated by PC1, with factor loadings of over 70%.
The factor scores, normalized for each census tract, served as a basis for k-means cluster analysis, using five clusters. Using analysis of variance to compare the mean values of these factor scores for the five clusters, we found that the difference among the five means was statistically significant (F = 1 268, with 4 and 236 degrees of freedom, P < 0.01).
Table 3 shows TB incidence for the five social deprivation clusters defined in our analysis. The analysis of variance for the incidence rates found there were differences among the means that were statistically significant (P < 0.01). Using the Duncan test to identify homogeneous groups (22), we found a significant difference between the mean incidence rates of clusters 1 and 2 and the mean incidence rates of clusters 3 and 4.
Figure 4 shows the results of the cluster analysis, indicating the relative TB risk for different areas in Olinda. The layer containing the municipal health authority management areas has been superimposed on the map of risk in order to define the priority areas.
There was close agreement in the distributions of census tracts according to their deprivation level classification by SDI values and from using principal component analysis followed by cluster analysis; 175 out of 241 census tracts were identically classified (kappa = 0.689; z = 21.51; P < 0.01) (18).
DISCUSSION
Over the period studied, Olinda displayed high annual tuberculosis incidence rates, approximately two times the averages for Brazil as a whole. These high values underline the magnitude of the TB problem in Olinda, as well as the challenge of redefining the guiding principles of the existing epidemiological surveillance model.
As in other studies that have adopted the census tract as the basis of analysis (23, 24), this one has produced a stratified visualization of an urban area and has provided an overview of collective risk that can explain the differentials in the occurrence of tuberculosis in Olinda.
Our analysis shows that the distribution of tuberculosis cases in Olinda is not uniform and that there is an association between collective risk, as measured by two different methods, and tuberculosis incidence rates. The results of the kappa test showed there was close agreement between the two methodologies, with principal component analysis followed by cluster analysis validating the social deprivation index method, given the convergence between the areas of lowest and highest risk as indicated by the two approaches.
The correspondence between risk situation and incidence of tuberculosis was less than perfect. With the SDI approach, a higher TB incidence was observed in the fourth quintile than in the fifth quintile (Table 1). With principal component analysis and cluster analysis of the five strata, the cluster with the highest mean incidence was the middle one, with successively lower rates for the fourth and fifth clusters (Table 3). There are several possible explanations for these results. One is the fact that the great majority of the cases that could not be referenced to a particular census tract had incomplete addresses, suggesting location in the most deprived areas. Our results could also suggest that, beyond an intermediate level of deprivation, it may not be meaningful how much worse the social status becomes, since there are already conditions favoring the spread of TB. It is likely that the best model to explain the relation between social deprivation and TB is not a linear one. There appears to be a minimum incidence even when there is no social deprivation, as well as a high level of social deprivation beyond which there is no more increase in the incidence of TB.
Other issues of concern include the coverage of information systems and access to health services, both of which could be pointing to the underreporting of cases within the most deprived population (25).
The second methodology we used, principal component analysis followed by cluster analysis, appears better at focusing on the social deprivation "dimension," by using a larger number of variables and more refined techniques, thus showing itself to be more rigorous in several respects, including statistically.
On the other hand, the social deprivation index methodology works with variables chosen intentionally and that are also known to be related to poverty and transmission of TB. This may make the model easier to construct and easier for those planning and providing health service at any level to operate. Nevertheless, the fact that we constructed the social deprivation index with variables of the same weight, with no consideration of the relative importance of each one, could represent a methodological limitation. But while this may be important from an academic point of view, the SDI technique is simple and allows the synthesis of social deprivation situations that indicate collective risk.
If we analyze the risk strata within the management areas and sanitary districts, it becomes possible to define, or even redefine, control initiatives from a geographically based perspective (26). In the case of tuberculosis a new public health surveillance system will have to offer new guidelines based on the notion of collective risk. This is not a question of establishing new ways of treating TB, but rather of identifying high-risk populations and the areas where these persons live and establishing a more efficient surveillance system. Such a system would include strict control of acid-fast bacilli smear-positive carriers of pulmonary tuberculosis and their contacts, follow-up of drug-resistant cases, and treatment monitoring of cases registered by each health service, according to a geographic approach compatible with the organization of the health services in sanitary districts.
This work has proposed ending the lack of communication among traditional information systems and allowing the interactive analysis of health and sociodemographic data, such as we have done by integrating three data sources: the demographic census, the Disease Surveillance Warning System, and a geographical one (mapping). This type of integration could lead to organizing health services on the basis of the health needs of population groups in ways that encompass considerations of quality, quantity, and location. Additionally, the organization of health services should contemplate the different levels of complexity of health care within each sanitary district.
Acknowledgements. We are very grateful to the Pan American Health Organization for providing financial support to the development of this work.
REFERENCES
1. Ximenes RA, Martelli CM, Souza WV, Lapa TM, Albuquerque Md, Andrade AL, et al. Vigilância de doenças endêmicas em áreas urbanas: a interface entre mapas digitais censitários e indicadores epidemiológicos. Cad Saude Publica 1999;15(1):109-118.
2. The global challenge of tuberculosis [editorial]. Lancet 1994;344(8918):277-279.
3. Farmer P. Social inequalities and emerging infectious diseases. Emerg Infect Dis 1996; 2(4):259-269.
4. Enarson DA, Grosset J, Mwinga A, Hershfield ES, O'Brien R, Cole S, Reichman L. The challenge of tuberculosis: statements on global control and prevention. Lancet 1995;346(8978):809-819.
5. Gamundi R. The emergence of TB signals dangers [Internet page]. AIDS Project Los Angeles. Available from: http://www.apla.org/apla/9512/tuberculosis.html. Accessed July 1998.
6. Sabroza PC, Toledo LM, Osanai CH. A organização do espaço e os processos endêmico-epidêmicos. In: Leal MC, Sabroza PC, Rodrigues RH, Buss PM. Saúde, ambiente e desenvolvimento. Processos e consequências sobre as condições de vida. São Paulo, Hucitec, and Rio de Janeiro, Abrasco; 1992. pp. 57-77.
7. Brazil, Ministério da Saúde. Série histórica de casos de agravos e doenças infecciosas e parasitárias no Brasil, 1980 a 1996: tuberculose. Inf. epidemiológico SUS 1997;6(1):95-103.
8. Mendes EV, et al. Distritos sanitários: conceitos. In: Mendes EV. Distrito sanitário: o processo social de mudança das práticas sanitárias do Sistema Único de Saúde. São Paulo, Hucitec, and Rio de Janeiro, Abrasco; 1993. p. 150-185.
9. Fundação Instituto Brasileiro de Geografia e Estatística. Censos demográficos [Internet site]. Available at: http://www.ibge.gov.br. Accessed July 1998.
10. Dolk H, Mertens B, Kleinschmidt I, Walls P, Shaddick G, Elliott P. A standardisation approach to the control of socioeconomic confounding in small area studies of environment and health. J Epidemiol Community Health 1995;49 Suppl 2:S9-14.
11. Small-area variations: what are they and what do they mean? CMAJ 1992;146(4):467-470.
12. Carstairs V, Lowe M. Small area analysis: creating an area base for environmental monitoring and epidemiological analysis. Community Med 1986;8(1):15-28.
13. Elliott P, Cuzick J, English D, Stern R, eds. Geographical and environmental epidemiology: methods for small-area studies. Oxford: Oxford University Press; 1996.
14. Hays WL. Statistics. Fort Worth, Texas, United States of America: Holt, Rinehart and Winston; 1988.
15. Sokal RR. Testing statistical significance of geographic variation patterns. Systematic Zoology 1979;28:227-232.
16. Fundo das Nações Unidas para a Infância. Municípios brasileiros: crianças e suas condições de sobrevivência. Brasília: Fundação Instituto Brasileiro de Geografia e Estatística; 1994.
17. Kleinbaum DG, Kupper LL, Muller KE. Applied regression analysis and other multivariable methods. Boston: Duxbury Press; 1987.
18. Souza J. Métodos estatísticos nas ciências psicossociais. Brasília, Brasil: Thesaurus; 1987.
19. Chatfield C, Collins AJ. Introduction to multivariate analysis. London: Chapman and Hall; 1986.
20. United States of America, Centers for Disease Control and Prevention. Guidelines for investigating clusters of health events. MMWR Morb Mortal Wkly Rep 1990;39(RR-11):1-23.
21. Altman DG. Practical statistics for medical research. London: Chapman and Hall; 1995.
22. Bailar JC, Mosteller F. Medical uses of statistics. Boston: NEJM Books; 1992.
23. Carstairs V, Morris R. Deprivation: explaining differences in mortality between Scotland and England and Wales. BMJ 1989;299(6704): 886-889.
24. Patterson CC, Waugh NR. Urban/rural and deprivational differences in incidence and clustering of childhood diabetes in Scotland. Int J Epidemiol 1992;21(1):108-117.
25. Castelhanos PL. Sistemas nacionales de vigilancia de la situación de la salud según condiciones de vida y del impacto de las acciones de salud y bienestar. Washington, D.C.: Pan American Health Organization; 1993.
26. Albuquerque MFPM, Morais HMM. Decentralization of endemic disease control: intervention model for combating bancroftian filariasis. Rev Panam Salud Publica 1997;1(2): 155-163.
Manuscript received 3 December 1999. Revised version accepted for publication on 10 July 2000.
Use of socioeconomic factors in locating areas at risk of tuberculosis in a city in northeastern Brazil
In Brazil, the challenge of meeting the needs of residents of poor areas has generated discussion about replacing the current approach to epidemiological surveillance with an integrated public health surveillance system. This new approach should replace the traditional focus on high-risk individuals with a method designed to identify high-risk populations and the areas where they live. Given the magnitude of the tuberculosis problem in Brazil, we chose this disease as an example of how such a new integrated public health surveillance system could be designed. Data from several sources were combined with geographic information to create an indicator of tuberculosis risk in Olinda, a city in the state of Pernambuco. In order to stratify the urban space of Olinda and to look for an association between the tuberculosis risk gradient and the mean incidence of the disease between 1991 and 1996, two different methods were applied: 1) a "social deprivation index" and 2) principal component analysis followed by cluster analysis. The results revealed an association between deprivation and the occurrence of tuberculosis and also identified priority groups and areas requiring intervention. Follow-up is recommended that should include treatment of smear-positive pulmonary tuberculosis cases, identification of these individuals' contacts, and monitoring of multidrug-resistant cases, all in coordination with local health services.
1 Fundação Oswaldo Cruz, Centro de Pesquisas Aggeu Magalhães, Recife, Pernambuco, Brasil. Send correspondence to: Wayner V. Souza, Fundação Oswaldo Cruz, Centro de Pesquisas Aggeu Magalhães, Av. Moraes Rego s/n°, Cidade Universitária, Recife, Pernambuco, Brasil, 50670-420; e-mail: firstname.lastname@example.org
2 Universidade Federal de Pernambuco, Recife, Pernambuco, Brazil.
3 Universidade de Pernambuco, Recife, Pernambuco, Brazil.
4 Universidade Federal de Goiás, Recife, Pernambuco, Brazil.