Revista Panamericana de Salud Pública
Print version ISSN 1020-4989
CUMSILLE, Francisco and BANGDIWALA, Shrikant I. Categorizing variables in the statistical analysis of data: consequences for interpreting the results. Rev Panam Salud Publica [online]. 2000, vol.8, n.5, pp. 348-354. ISSN 1020-4989. http://dx.doi.org/10.1590/S1020-49892000001000005.
In epidemiological studies, the scale of one or more continuous variables is frequently changed during data analysis. The objective of this paper was to assess the consequences of categorizing variables during data analysis. We studied three situations with different scenarios for statistical analysis with regression models. The results show that dichotomizing continuous variables can substantially modify the relationships between dependent and independent variables. For example, in epidemiological studies that evaluate the effect of an exposure on a response, dichotomizing a variable can bias the magnitude and/or the direction of the estimated effect. We therefore recommend avoiding the categorization of variables in analyses as far as possible.
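The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' analysis and does not reproduce any of their three situations. It assumes a hypothetical standard normal exposure, a linear response with independent normal noise, and a split at the sample median, and it compares the correlation of the response with the continuous exposure against the correlation with its dichotomized version. The function names (`pearson`, `simulate`) and all parameter values are illustrative assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=20000, slope=1.0, noise_sd=1.0, seed=0):
    """Simulate a linear exposure-response relationship (assumed model).

    Returns the correlation of the response with the continuous exposure
    and with the exposure dichotomized at its sample median.
    """
    rng = random.Random(seed)
    # Hypothetical continuous exposure, standard normal.
    x = [rng.gauss(0, 1) for _ in range(n)]
    # Response depends linearly on the exposure, plus independent noise.
    y = [slope * xi + rng.gauss(0, noise_sd) for xi in x]
    # Dichotomize the exposure at the sample median ("high" vs "low").
    cutoff = sorted(x)[n // 2]
    x_bin = [1.0 if xi > cutoff else 0.0 for xi in x]
    return pearson(x, y), pearson(x_bin, y)


r_cont, r_bin = simulate()
print(f"correlation with continuous exposure:   {r_cont:.3f}")
print(f"correlation with dichotomized exposure: {r_bin:.3f}")
print(f"ratio (dichotomized / continuous):      {r_bin / r_cont:.3f}")
```

Under these assumptions, the ratio of the two correlations should settle near sqrt(2/pi), roughly 0.80, so about a fifth of the linear association is lost to the median split alone. This simple setting shows only attenuation of magnitude. The change of direction mentioned in the abstract would require a richer scenario, such as a model with additional covariates, which this sketch does not attempt.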