Social epidemiology of a large outbreak of chickenpox in the Colombian sugar cane producer region: a set theory-based analysis





Alvaro J. Idrovo (I); Cidronio Albavera-Hernández (I, II); Jorge Martín Rodríguez-Hernández (I)

(I) Instituto Nacional de Salud Pública, Cuernavaca, México
(II) Hospital General Regional, Instituto Mexicano del Seguro Social, Cuernavaca, México





There are few social epidemiologic studies on chickenpox outbreaks, although previous findings suggest an important role for social determinants. This study describes the context of a large outbreak of chickenpox in the Cauca Valley region, Colombia (2003 to 2007), with an emphasis on macro-determinants. We explored the temporal trends in chickenpox incidence in 42 municipalities to identify the places with the highest occurrence. We analyzed municipal characteristics (education quality, vaccination coverage, performance of health care services, violence-related immigration, and area planted with sugar cane) using analyses based on set theory. Edwards-Venn diagrams were used to present the main findings. The results indicated that three municipalities had the highest incidences and that poor education quality was the attribute most often associated with higher incidence. The potential use of set theory for exploratory outbreak analyses is discussed; it may be a useful tool for contrasting units when only small sample sizes are available.

Chickenpox; Disease Outbreaks; Delivery of Health Care






Chickenpox, or varicella, is a generally benign disease caused by infection with the varicella zoster virus (VZV). Most cases appear among children between the ages of 1 and 14 [1]; when the infection occurs in adolescents or adults, it is more severe than in children. It is also potentially more frequent among immunosuppressed individuals [2]. Migration is an important risk factor for chickenpox: when adults not previously exposed to VZV migrate to regions where chickenpox is endemic, their risk of infection is high [3]. Such infection has been described as occurring when an individual interacts with others in schools, homes and shopping centers [4].

Studies from Serbia and Montenegro [5], Puerto Rico [6], St. Lucia, West Indies [7], India and Southeast Asia [8], and Somalia [9], and with US Navy and Marine Corps recruits in island territories [10], report that chickenpox is a serious disease among adults in tropical climates, where seroprevalence is lower. However, a recent study in Australia reported that social and cultural characteristics are more significant than climate for VZV transmission [4], suggesting overlaps with some of the determinants of chickenpox outbreaks, which unfortunately remain unknown.

In the Cauca Valley region of Colombia, a large outbreak of chickenpox was observed and documented between 2003 and 2007 [11]. This outbreak affected children and adults in similar proportions, but the causes of the epidemic were not explored. The region is located in western Colombia and includes part of the Pacific Ocean coast, between 3° 05' and 5° 01' N latitude and 75° 42' and 77° 33' W longitude. Its geography is varied, with coasts, mountains, jungles, and a very fertile valley. In the plains, sugar cane is the main crop, and it is highly important since this relatively small region produces roughly 1.7% of the world's sugar.

The objective of this study was to describe the context of the large outbreak of chickenpox in this region, with special emphasis on any macro-determinants potentially related to incidence. Our a priori hypothesis was that the chickenpox outbreak was related to problems with the performance of health care services, lower educational levels among the population, the migration of vulnerable people, and/or changes in sugar cane-related processes. Since only a small sample size was available, we decided to explore the use of set-theory methods to contrast municipalities. This approach allowed us to test a simple method for linking social epidemiology with field epidemiology. The usual methods in social epidemiology, such as multilevel analysis or complex multivariate analyses, generally require the analysis to be performed by experts.



A case series study was conducted, with the 42 Cauca Valley municipalities serving as observation units. Registries from the epidemiological surveillance system were used to obtain the number of clinical cases between 2003 and 2007. These data originated from the weekly reports of mandatory notification events. According to the Colombian epidemiological surveillance program, a case of chickenpox is clinically defined when a patient has mild to moderate fever with a few general symptoms associated with maculopapular and vesicular lesions that form granulose crusts [12].

Potential contextual determinants

To explore the social and physical environment, five robust indicators were used: (i) vaccination coverage, (ii) production function in a subsidized regime, (iii) education quality, (iv) area planted with sugar cane, and (v) violence-related immigration. The first three indicators were extracted from the Cauca Valley's 2006 Municipal Management Report [13]; the fourth was derived from agricultural data contained in Cauca Valley official statistics; and the last was taken from the Colombian registries of displaced individuals. These indicators were selected as proxy variables related to our study hypothesis.

Vaccination coverage is an indicator based on a mass immunization plan (Plan Ampliado de Inmunización, PAI, in Spanish) [13]. The outcome used for this indicator is coverage with the measles, mumps and rubella vaccine (MMR, "triple viral"), which is considered a good indicator because it requires only one dose at one year of age and is administered after the child has received all previous vaccines in the vaccination schedule. This indicator reflects the performance of primary health care services, since a high percentage of vaccinated children in a municipality indicates good health care coverage. Previous studies have reported a decrease in vaccination coverage over the last 15 years [14,15,16].

The production function in a subsidized regime is a composite indicator. It relates the economic resources from all financial sources to the expenditure on health care personnel dedicated to identifying and insuring the most vulnerable families, or to carrying out stewardship activities [13]. It was calculated as the ratio (individuals in the subsidized regime / total municipal population) × 100. The Colombian health system is based on managed care; the separation of financing from the provision of health care is therefore the principle used to promote cost-efficiency [17]. The system has two types of affiliation: the contributory regime and the subsidized regime. The former covers those who are able to pay (people with full or partial employment) and the latter provides services to those who are not able to pay the necessary contributions (indigent and unemployed people). In this context, good performance in the subsidized regime theoretically corresponds to a high percentage of enrollment [13].
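The ratio described above can be written compactly as follows. The symbols are introduced here only for illustration and do not appear in the source report: for municipality i, S_i denotes the number of individuals in the subsidized regime and P_i the total municipal population.

```latex
\[
  \text{Subsidized regime indicator}_i \;=\; \frac{S_i}{P_i} \times 100
\]
```

Under this reading, the indicator is a percentage of the municipal population, which is how the categories reported later in the text (for example, 65-75% or more than 90%) are expressed.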

Education quality is an indicator constructed using the percentage of students with medium, high or very high scores on the national exam administered by the Colombian Institute for the Development of Higher Education (Instituto Colombiano para el Fomento de la Educación Superior, ICFES, in Spanish) [13]. It is expressed as a percentage, and a municipality performs better when a higher percentage of its students obtain medium or higher scores.

The area (in hectares) planted with sugar cane is an indicator of the first stage in the production of sugar and of ethanol for fuel. Data used in this analysis were extracted from the Cauca Valley's 2007 official statistics for agricultural assessment. The original data are collected by the Regional Agricultural Planning Unit (Unidad Regional de Planificación Agropecuaria, URPA, in Spanish) and are available at the official webpage (Evaluaciones agricolas 2000-2009, accessed on 25/Apr/2009).

Violence-related migration in Colombia is a complex demographic process. It is characterized as protracted internal displacement, in which the process of finding lasting solutions has stalled and/or displaced individuals are marginalized as a consequence of violations or a lack of protection of their human rights, including economic, social and cultural rights [18]. According to some authors, threats by armed actors are the proximal cause of forced displacement and problems related to land possession are the distal cause [19]. Such displacement increases the demand for basic services and infrastructure to satisfy migrants' needs in the receiving municipality. Immigration data used for this study were extracted from the official government registries collected by Acción Social (Estadísticas de la población desplazada, accessed on 14/May/2009).

Table 1 summarizes these characteristics according to municipality. Note the heterogeneity among the municipalities. In general, the higher percentages for the attributes studied occur in Cali and other municipalities with greater population density and degree of urbanization.

Data analysis

First, a graphical description of the outbreak was developed based on incidence rates (per 100,000 inhabitants), using data from epidemiologic surveillance as the numerators and the official population estimates of each municipality as the denominators. This procedure identified the municipalities with the highest incidences. Afterwards, two basic set operations, intersection (∩) and union (∪), were used to identify specific municipal characteristics potentially related to the occurrence of the outbreak. Intersection is defined as the set whose elements belong to all the sets involved, and union as the set whose elements belong to at least one of the sets involved in the operation.
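As a minimal sketch of the first step, an incidence rate per 100,000 inhabitants can be computed from a case count and a population estimate, and the units exceeding a chosen threshold can then be collected into a set. The counts, the municipality labels M1 to M3, and the function names below are hypothetical and are not the surveillance data used in the study.

```python
def incidence_per_100k(cases: int, population: int) -> float:
    """Return cases per 100,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


def above_threshold(rates: dict, threshold: float) -> set:
    """Return the set of units whose rate exceeds the threshold."""
    return {name for name, rate in rates.items() if rate > threshold}


# Hypothetical (cases, population) pairs; NOT the study data.
example = {"M1": (120, 10_000), "M2": (30, 20_000), "M3": (45, 9_000)}

rates = {name: incidence_per_100k(c, p) for name, (c, p) in example.items()}
high = above_threshold(rates, 1_000)  # units above 1,000 per 100,000
```

The resulting set of high-incidence units is the starting point for the set operations described above.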

Certain sets were characterized using these operations. The interpretation was based on the rules described in Table 2. Sets were described by extension (a list of all their elements inside curly brackets) or by intension (using a notation that denotes the set of all elements satisfying a condition). In some cases, Venn or Edwards-Venn diagrams were used [20]. To facilitate interpretation, the concepts of "sufficient determinant" and "necessary determinant" were used in a manner similar to the causality frameworks of Susser and Rothman [21,22]. However, it is important to remember that our analysis does not establish causality but, rather, contrasts contexts, which is one of the three proposed uses of small-N studies [23].
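The two ways of describing a set, and the two operations, can be illustrated with a short sketch. The unit labels (m1 to m4) and the education-quality values are hypothetical; only the idea of a cut-off at 50% is borrowed from the categories reported in the results.

```python
# Hypothetical education-quality percentages per unit; NOT the study data.
education_quality = {"m1": 42.0, "m2": 95.0, "m3": 48.5, "m4": 71.0}

# Set described by extension: its elements are listed explicitly.
high_incidence = {"m1", "m2"}

# Set described by intension: all units satisfying a condition.
low_education = {m for m, q in education_quality.items() if q < 50}

# Intersection: units that belong to both sets.
both = high_incidence & low_education

# Union: units that belong to at least one of the sets.
either = high_incidence | low_education
```

Here `both` contains only m1 and `either` contains m1, m2 and m3, which is the kind of contrast between units that a Venn diagram displays graphically.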



In our case, a determinant whose presence is enough for the higher incidence to occur is a "sufficient determinant", and a determinant without which the higher incidence does not occur is a "necessary determinant". These methods were considered appropriate since too few observations were available for a formal multiple statistical analysis.
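One common set-based reading of these two concepts, in the spirit of Rothman's framework, is sketched below. This is an assumption for illustration and not a reproduction of the rules in Table 2: a determinant is treated as necessary when every unit with the outcome has it, and as sufficient when every unit that has it also has the outcome. The function names and units are hypothetical.

```python
def is_necessary(determinant: set, outcome: set) -> bool:
    """True when every unit with the outcome also has the determinant."""
    return outcome <= determinant


def is_sufficient(determinant: set, outcome: set) -> bool:
    """True when the determinant occurs and every unit with it has the outcome."""
    return bool(determinant) and determinant <= outcome


# Hypothetical units; NOT the study data.
outcome = {"u1", "u2"}        # units with higher incidence
det_a = {"u1", "u2", "u3"}    # present wherever the outcome occurs
det_b = {"u1"}                # present only where the outcome occurs
det_c = {"u3"}                # unrelated to the outcome
```

With these hypothetical units, `det_a` is necessary but not sufficient, `det_b` is sufficient but not necessary, and `det_c` is neither.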



The chickenpox outbreak is summarized in Figure 1. The higher incidences occurred in 2006 and 2007. During these years, Pradera, El Dovio and Ulloa were clearly the municipalities with the highest incidences (the peaks were higher than 1,000 cases per 100,000 inhabitants), though the first high peak occurred in Ulloa in 2006. Other municipalities with a high number of cases (> 500 cases per 100,000 inhabitants) at some point during the outbreak were Calima, La Cumbre, Obando, Restrepo, Riofrio, Roldanillo, Vijes and Zarzal. Note that in 2006, Pradera was the first municipality with incidences higher than 1,000 cases per 100,000 inhabitants. El Dovio and Ulloa did not experience this incidence level until 2007.



The following analysis is intended to identify the specific characteristics of Pradera, El Dovio and Ulloa, the municipalities with the highest incidences. The set analysis revealed some interesting aggregations (Figure 2). Pradera had a specific set characterized by education quality less than 50%, vaccination coverage more than 80%, inconsistent data on subsidized regime affiliation, more than 10,000 hectares planted with sugar cane, and violence-related immigration between 25 and 50 per 10,000 inhabitants (A ∩ H ∩ I ∩ R ∩ V). El Dovio was characterized by education quality less than 50%, vaccination coverage between 50% and 80%, subsidized regime affiliation between 65% and 75%, no sugar cane plantations, and violence-related immigration between 26 and 50 per 10,000 inhabitants (A ∩ G ∩ K ∩ N ∩ V). Ulloa was characterized by education quality higher than 90%, vaccination coverage more than 80%, subsidized regime affiliation more than 90%, no sugar cane plantations, and violence-related immigration higher than 75 per 10,000 inhabitants (D ∩ H ∩ M ∩ N ∩ X).
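The comparison of the three municipalities can be sketched by storing, for each one, the set labels listed in the text and intersecting these profiles pairwise. Representing each municipality as a set of labels is an assumption made here for illustration (the text instead treats each label as a set of municipalities); the variable names are hypothetical.

```python
from itertools import combinations

# Attribute labels per municipality, as listed in the text above.
profiles = {
    "Pradera": {"A", "H", "I", "R", "V"},
    "El Dovio": {"A", "G", "K", "N", "V"},
    "Ulloa": {"D", "H", "M", "N", "X"},
}

# Labels shared by each pair of municipalities.
shared = {
    (a, b): profiles[a] & profiles[b]
    for a, b in combinations(profiles, 2)
}

# Labels present in all three municipalities (none, in this case).
common_to_all = set.intersection(*profiles.values())
```

Pradera and El Dovio share A and V, Pradera and Ulloa share H, El Dovio and Ulloa share N, and no label is common to all three municipalities.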

Note that sets A, H, N, and V were each present in two municipalities, and the other sets were each present in one municipality exclusively. The set A ∩ V (lower level of education quality and an intermediate rate of violence-related immigration) is a particularly interesting combination of determinants, present in both Pradera and El Dovio. Finally, a complementary analysis was conducted with the other eight municipalities with higher incidences (Calima, La Cumbre, Obando, Restrepo, Riofrio, Roldanillo, Vijes and Zarzal). The most frequent unitary sets for each determinant were: A (3/8) and D (3/8); F (4/8) and G (3/8); J (3/8); N (4/8), and X (5/8). Moreover, two intersectional sets were identified: A ∩ J ∩ W, which was present in Obando and Zarzal, and M ∩ N ∩ X, which was present in Restrepo. This latter set is also characteristic of Ulloa, as described previously. Other sets (not shown) were identified, but they were not frequent or did not change the described findings. Therefore, in this analysis there were no "sufficient" or "necessary" determinants, although set A was the most important from an epidemiologic viewpoint.



The chickenpox outbreak described herein affected more than 26,000 individuals in the Cauca Valley region. The distribution of the outbreak was bimodal, with two peaks: the first in 2004 and the second in 2007. The set analysis identified specific municipal characteristics potentially related to the outbreak. The data suggest that municipalities with poorer education quality (set A) were more prone to higher incidences. This set and the adjacent higher education-quality set were present in 12 (28.6%) and 22 (52.4%) of the 42 municipalities, respectively.

Unfortunately, we do not know of similar studies with which to compare these results, since the classic approach to studying outbreaks tends to identify causes only in terms of the biological characteristics of individuals. The study by Pollock & Golding [24], although it included some socioeconomic variables, did not explore contextual variables as in the present study. These authors included 21,123 British children, and their main result was that social advantage was linked to patterns of susceptibility to VZV infection. However, this finding is not comparable with our results because the epidemiology of VZV in tropical climates is different. While in temperate climates children are the most affected population, in tropical climates chickenpox mainly affects young adults. This observation contrasts with the Cauca Valley data, where all age groups were affected, suggesting the importance of determinants other than biological or climatic attributes such as humidity or temperature.

Another important result was the usefulness of set theory to describe and contrast contextual attributes related to the outbreak. Due to the small sample size available, conventional statistical methods do not provide clear results. Moreover, statistical methods based on probabilities are useful for making inferences about a population from a sample; our study was not intended to make inferences about a "supra-population".

In addition, using a census of Cauca Valley's municipalities, with 100% of the available data, we attempted to identify the specific socio-historical attributes (or context) of the outbreak. This was all the more relevant because during 2008, one year after the latest data used in our study, there was a sugar cane workers' strike, for which one of the most important motivations was poor working conditions. The fact that health and labor problems occurred together calls for studying the context that allows both social phenomena to develop.

Set theory methods with different degrees of complexity have been used in several fields, such as diagnostic imaging, genetics, gerontology, homeopathy, immunology, pneumology, and pharmacology. Of these, the only epidemiological studies are those by Soriano et al. [25] and Viegi et al. [26]. To our knowledge, the present study is the first investigation of an outbreak using set theory. In our opinion, mathematical methods such as set theory can be complementary when used with an understanding of their rationale. With mathematics, it is possible to explore data in cases that do not fulfill the law of large numbers.

Although small-N studies are not frequently used in epidemiology (the best-known exception is the "N = 1 clinical trial") [27], they are an important methodological tool in the social sciences (for instance, in historical and comparative research) [23,28,29,30]. They are acceptable for describing differences, ascertaining determinants, and establishing causality in a few cases. According to Lieberson [30], causality in small-N studies is possible only if four assumptions are accepted: (i) a deterministic causal approach, (ii) no measurement errors, (iii) unicausality, and (iv) absence of interaction. With the use of set theory, however, the last two assumptions are not necessary. Union and intersection operations accommodate the multicausality of disease and allow interactions between variables to be explored; nevertheless, the first two assumptions must be met.

Some limitations of this study should be considered to understand the scope of the results described. The most important constraint is the lack of understanding of the whole causal web involved in the complexity of a single outbreak. We recognize that many unmeasured factors may be related to the occurrence of an outbreak. Nevertheless, this study was able to identify some of the contextual attributes related to the high incidence of VZV in certain municipalities. Additionally, measurement error was possibly present for all the attributes analyzed.

In conclusion, the study of outbreaks requires more than individual variables to fully understand causality. The contexts in which high incidences occur can be different (variables) or constant when analyzing certain populations, but conventional epidemiologic methods can only explore them in cases of heterogeneity. Thus, the description and contrast of contexts is important, even with small sample sizes. This study made such an analysis possible by using simple set theory operations. Future studies could adopt a similar approach to improve the understanding of the causes of outbreaks. Simple tools such as those described herein may better integrate field epidemiology and social epidemiology, and can be used to increase the number of studies on the social epidemiology of infectious diseases [31].



A. J. Idrovo participated in the study design, analysis and interpretation of data, article write-up and approval of the final version. C. Albavera-Hernández and J. M. Rodríguez-Hernández participated in the analysis and data interpretation, write-up and approval of the final version.



To the technical and professional teams of the Cauca Valley Secretary of Health, for providing us with part of the information used in this study.



1. Preblud SR, Orenstein WA, Bart KJ. Varicella: clinical manifestations, epidemiology and health impact in children. Pediatr Infect Dis 1984; 3:505-9.         

2. Shahbazian H, Ehsanpour A. An outbreak of chickenpox in adult renal transplant recipients. Exp Clin Transplant 2007; 5:604-6.         

3. Kjersem H, Jepsen S. Varicella among immigrants from the tropics, a health problem. Scand J Soc Med 1990; 18:171-4.         

4. O'Grady KA, Merianos A, Patel M, Gilbert L. High seroprevalence of antibodies to varicella zoster virus in adult women in a tropical climate. Trop Med Int Health 2000; 5:732-6.         

5. Maretic Z, Cooray MP. Comparisons between chickenpox in a tropical and a European country. J Trop Med Hyg 1963; 66:311-5.         

6. Longfield JN, Winn RE, Gibson RL, Juchau SV, Hoffman PV. Varicella outbreaks in Army recruits from Puerto Rico. Varicella susceptibility in a population from the tropics. Arch Intern Med 1990; 150:970-3.         

7. Garnett GP, Cox MJ, Bundy DA, Didier JM, St Catharine J. The age of infection with varicella-zoster virus in St Lucia, West Indies. Epidemiol Infect 1993; 110:361-72.         

8. Lee BW. Review of varicella zoster seroepidemiology in India and Southeast Asia. Trop Med Int Health 1998; 3:886-90.         

9. Nysse LJ, Pinsky NA, Bratberg JP, Babar-Weber AY, Samuel TT, Krych EH, et al. Seroprevalence of antibody to varicella among Somali refugees. Mayo Clin Proc 2007; 82:175-80.         

10. Struewing JP, Hyams KC, Tueller JE, Gray GC. The risk of measles, mumps, and varicella among young adults: a serosurvey of US Navy and Marine Corps recruits. Am J Public Health 1993; 83:1717-20.         

11. Albavera-Hernández C, Rodríguez-Hernández JM. Situación epidemiológica de varicela en el municipio de Pradera (Valle del Cauca, Colombia) entre 2003 a 2007. Salud UNINORTE 2010; 26:54-64.         

12. Ministerio de la Protección Social/Laboratorio Nacional de Referencia, Subdirección de Epidemiología, Instituto Nacional de Salud. Protocolo de vigilancia de varicela, versión actualizada. Bogotá DC: Instituto Nacional de Salud; 2007.         

13. Gobernación del Valle del Cauca. Evaluación de la gestión municipal 2006. Santiago de Cali: Secretaria de Planeación; 2007.         

14. Acosta-Ramírez N, Rodríguez-García J. Inequidad en las coberturas de vacunación infantil en Colombia, años 2000 y 2003. Rev Salud Pública 2006; 8 Suppl 1:102-15.         

15. de la Hoz F, Perez L, Wheeler JG, de Neira M, Hall AJ. Vaccine coverage with hepatitis B and other vaccines in the Colombian Amazon: do health worker knowledge and perception influence coverage? Trop Med Int Health 2005; 10:322-9.         

16. Ruiz-Rodríguez M, Vera-Cala LM, López-Barbosa N. Seguro de salud y cobertura de vacunación en población infantil con y sin experiencia de desplazamiento forzado en Colombia. Rev Salud Pública 2008; 10:49-61.         

17. Iriart C, Merhy E, Waitzkin H. Managed care in Latin America: the new common sense in health policy reform. Soc Sci Med 2001; 52:1243-53.         

18. Internal Displacement Monitoring Centre/Norwegian Refugee Council. Internal displacement: global overview of trends and developments in 2008. Geneva: Internal Displacement Monitoring Centre; 2009.         

19. Brittain JJ. A theory of accelerating rural violence: Lauchlin Currie's role in underdeveloping Colombia. J Peasant Stud 2005; 32:335-60.         

20. Edwards AWF, Stewart I, Pekonen O, Hamburger P. The story of Venn diagrams. Mathematical Intelligencer 2005; 27:36-8.         

21. Susser M. What is a cause and how do we know one? A grammar for pragmatic epidemiology. Am J Epidemiol 1991; 133:635-48.         

22. Rothman KJ. Causes. Am J Epidemiol 1976; 104:587-92.         

23. Skocpol T, Somers M. The uses of comparative history in macrosocial inquiry. Comp Stud Soc Hist 1980; 22:174-97.         

24. Pollock JI, Golding J. Social epidemiology of chickenpox in two British national cohorts. J Epidemiol Community Health 1993; 47:274-81.         

25. Soriano JB, Davis KJ, Coleman B, Visick G, Mannino D, Pride NB. The proportional Venn diagram of obstructive lung disease: two approximations from the United States and the United Kingdom. Chest 2003; 124:474-81.         

26. Viegi G, Matteelli G, Angino A, Scognamiglio A, Baldacci S, Soriano JB, Carrozzi L. The proportional Venn diagram of obstructive lung disease in the Italian general population. Chest 2004; 126:1093-101.         

27. Guyatt GH, Keller JL, Jaeschke R, Rosenbloom D, Adachi JD, Newhouse MT. The N-of-1 randomized controlled trial: clinical usefulness. Our three-year experience. Ann Intern Med 1990; 112:293-9.         

28. Mahoney J. Strategies of causal inference in small-N analysis. Sociol Methods Res 2000; 28:387-424.         

29. Goldthorpe JH. Causation, statistics, and sociology. Eur Sociol Rev 2001; 17:1-20.         

30. Lieberson S. Small N's and big conclusions: an examination of the reasoning in comparative studies based on a small number of cases. Soc Forces 1991; 70:307-20.         

31. Cohen JM, Wilson ML, Aiello AE. Analysis of social epidemiology research on infectious diseases: historical patterns and future opportunities. J Epidemiol Community Health 2007; 61:1021-7.         



A. J. Idrovo
Instituto Nacional de Salud Pública
Avenida Universidad 655, Col. Sta Ma. Ahuacatitlán
Cuernavaca, Morelos
62100, México

Submitted on 22/Jul/2010
Final version resubmitted on 19/Mar/2011
Approved on 29/Mar/2011

Escola Nacional de Saúde Pública Sergio Arouca, Fundação Oswaldo Cruz Rio de Janeiro - RJ - Brazil