Multivariate estimate of eating patterns: is the whole different from the parts?

Iolanda Karla Santana dos Santos, Wolney Lisbôa Conde, Alicia Matijasevich Manitto

ABSTRACT:

Objective:

To describe the correlations between eating patterns estimated for the pooled set of years from 2007 to 2012 and those estimated for each year of the same period.

Method:

Cross-sectional study with data from the Surveillance System for Risk and Protection Factors for Chronic Diseases by Telephone Survey (Vigitel), from which 167,761 individuals aged 18 to 44 years were selected. Eating patterns were identified with Principal Component Analysis. To compare the effects of the extraction and the estimation of eating patterns among different surveys, we conducted the following analyses: in the first, we used the total data set for the years from 2007 to 2012; in the second, the patterns were estimated in each annual data set for the period from 2007 to 2012. Both analyses were performed with no rotation, with Varimax rotation and with Promax rotation. After extracting the patterns, standardized scores with zero mean were generated for each pattern. The association between the patterns generated in the two analyses was estimated by the Pearson correlation coefficient (r).

Results:

In the non-rotated analyses, the components retained in the pooled set presented correlations higher than 0.90 with the patterns retained in each year. In the rotated analyses, only the first component had correlations higher than 0.90.

Conclusion:

Eating patterns estimated year by year and those estimated for all of the years combined showed high correlation and consistency when derived from the same data pool.

Keywords:
Epidemiology; Surveillance; Eating behavior

INTRODUCTION

An analysis of dietary patterns is preferable to describing diets by type of food or nutrient, because food consumption is determined by multiple factors and the choice of foods and their nutrients does not occur at random (1). Considering that food consumption is not random and that foods and nutrients are correlated, the study of diets using patterns has become widespread (1).

In general, analyses that compare dietary patterns, estimated through multivariate analysis, between two or more surveys are conducted in each period separately, which makes it difficult to compare the patterns. This is because the composition of the patterns and their order of importance in explaining the variability change according to how the data set is treated. Alternatively, it is possible to estimate the patterns in the total set of surveys and then calculate the scores for each pattern according to the periods, or other strata in the data set.
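The alternative described above can be sketched as follows. This is a minimal illustration in Python with NumPy on simulated data, not the Stata code used in the study; the simulated sample, the four hypothetical food variables and the helper `pooled_pc1_scores` are assumptions made for the example.

```python
import numpy as np

def pooled_pc1_scores(X):
    """Score every row on the first principal component of the pooled data."""
    # Standardize each variable with the pooled mean and standard deviation.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Eigendecomposition of the pooled correlation matrix.
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    first = eigvecs[:, np.argmax(eigvals)]  # weights of the first component
    return Z @ first

rng = np.random.default_rng(0)
years = np.repeat(np.arange(2007, 2013), 500)      # hypothetical survey years
latent = rng.normal(size=years.size)                # one simulated shared pattern
weights = np.array([0.8, 0.7, 0.6, -0.5])           # hypothetical food variables
X = latent[:, None] * weights + rng.normal(size=(years.size, 4))

scores = pooled_pc1_scores(X)
# Every year is scored on the same axes, so the yearly means are comparable.
mean_by_year = {int(y): float(scores[years == y].mean()) for y in np.unique(years)}
```

Because the component is extracted once from the pooled data, the yearly summaries share a single definition of the pattern, which is the property that separate yearly extractions do not guarantee.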

The aim of this study was to describe the correlations between dietary patterns for the set of years from 2007 to 2012 and for each year in the same period.

METHODS

This was a cross-sectional study with data from the Surveillance System for Risk and Protection Factors for Chronic Diseases by Telephone Survey (Sistema de Vigilância de Fatores de Risco e Proteção para Doenças Crônicas por Inquérito Telefônico - Vigitel) from 2007 to 2012. Aspects related to Vigitel’s research methodology are available in official publications (2).

In this study, 167,761 individuals aged 18 to 44 years old were selected. The food consumption variables selected were: weekly frequency of consumption of beans, vegetables, raw vegetables, cooked vegetables, red meat, chicken, fruits, soft drinks or artificial juice, milk, daily vegetable consumption and consumption of visible fat.

Dietary patterns were identified with Principal Component Analysis (PCA). PCA is a factor-analytic technique that reduces the data to patterns based on the correlations between the variables (3). The first principal component corresponds to the direction of greatest variance, and each subsequent component is orthogonal to the previous ones (4). Rotations are used to improve the interpretation of the extracted components. Varimax, an orthogonal rotation, maximizes the variance of the factor loadings, and the components remain uncorrelated. Promax, an oblique rotation, rotates the axes so that they can form angles other than 90 degrees. In this type of rotation, some correlation between the components cannot be ruled out (5).
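The extraction and the orthogonal rotation described above can be illustrated with a short sketch. It is written in Python with NumPy on simulated data and is not the Stata procedure used in the study; the function names `pca_loadings` and `varimax`, the six hypothetical variables and the simulated loadings are assumptions. The Varimax routine follows the commonly used SVD-based iteration, and Promax is not shown.

```python
import numpy as np

def pca_loadings(X, n_components):
    """Unrotated loadings from the correlation matrix of X."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]            # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Loadings are eigenvectors scaled by the square root of their eigenvalues.
    loadings = eigvecs[:, :n_components] * np.sqrt(eigvals[:n_components])
    return eigvals, eigvecs, loadings

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation using the SVD-based iteration."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.diag((rotated**2).sum(axis=0))
        grad = loadings.T @ (rotated**3 - rotated @ col_ss / p)
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt                        # product of orthogonal matrices
        if criterion > 0 and s.sum() < criterion * (1 + tol):
            break                                # criterion stopped improving
        criterion = s.sum()
    return loadings @ rotation, rotation

rng = np.random.default_rng(1)
factors = rng.normal(size=(2000, 2))             # two simulated latent patterns
W = np.array([[0.8, 0.1], [0.7, 0.2], [0.6, 0.0],
              [0.1, 0.8], [0.2, 0.7], [0.0, 0.6]])
X = factors @ W.T + rng.normal(scale=0.6, size=(2000, 6))

eigvals, eigvecs, loadings = pca_loadings(X, n_components=2)
rotated, rotation = varimax(loadings)
```

The eigenvectors are mutually orthogonal and ordered by decreasing variance. The Varimax rotation matrix is itself orthogonal, so each variable's communality (the row sum of squared loadings) is unchanged, while the variance is redistributed among the rotated components.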

To compare the effects of extraction and the estimation of dietary patterns between different surveys, we conducted the following analyses:

  • in the first, we used Vigitel’s total data set for the years 2007 to 2012;

  • in the second, the patterns were estimated in each Vigitel annual data set for the period from 2007 to 2012.

The two analyses described above were performed with no rotation, with Varimax rotation and with Promax rotation. Components with eigenvalues > 1.0 were retained, according to the Kaiser rule (5). The number of patterns considered was the number retained in the first analysis. After extracting the patterns, standardized scores with a mean of zero were calculated for each one, so that each individual received a standardized value representing their adherence to each of the patterns analyzed. The patterns were named according to their order of retention, that is, the first pattern was named PC1, the second PC2, and so on. The association between the patterns generated in the two analyses was estimated by Pearson’s correlation coefficient (r). The analyses were conducted using the Stata program (Stata Corporation, College Station, United States).
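The comparison described above can be sketched as follows, without rotation. This is an illustration in Python with NumPy on simulated data, not the Stata code used in the study; the helpers `fit_pca` and `standardized_scores`, the seven hypothetical variables and the sample sizes are assumptions. Because the sign of a principal component is arbitrary, the sketch reports the absolute value of r, which is also an assumption of this example.

```python
import numpy as np

def fit_pca(X):
    """Return column means, column SDs, and sorted eigenvalues and eigenvectors."""
    mean, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    return mean, sd, eigvals[order], eigvecs[:, order]

def standardized_scores(X, mean, sd, eigvals, eigvecs, k):
    """Scores on the first k components, with unit variance in the fitting sample."""
    Z = (X - mean) / sd
    return (Z @ eigvecs[:, :k]) / np.sqrt(eigvals[:k])

rng = np.random.default_rng(2)
years = np.repeat(np.arange(2007, 2013), 1000)    # hypothetical survey years
factors = rng.normal(size=(years.size, 2))         # two simulated patterns
W = np.array([[0.9, 0.0], [0.8, 0.0], [0.8, 0.0], [0.7, 0.0],
              [0.0, 0.7], [0.0, 0.6], [0.0, 0.6]])
X = factors @ W.T + rng.normal(scale=0.5, size=(years.size, 7))

pooled = fit_pca(X)
k = int((pooled[2] > 1.0).sum())                   # Kaiser rule on the pooled set

correlations = {}
for year in np.unique(years):
    X_year = X[years == year]
    yearly = fit_pca(X_year)
    s_pooled = standardized_scores(X_year, *pooled, k)
    s_yearly = standardized_scores(X_year, *yearly, k)
    # Pearson r between matching components (absolute value, sign is arbitrary).
    correlations[int(year)] = [
        abs(float(np.corrcoef(s_pooled[:, j], s_yearly[:, j])[0, 1]))
        for j in range(k)
    ]
```

Each entry of `correlations` holds, for one year, the correlation between the scores obtained from the pooled extraction and those obtained from that year's own extraction, component by component. The number of components compared is fixed by the pooled analysis, mirroring the text above.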

Vigitel was approved by the National Human Research Ethics Commission of the Ministry of Health (2). For Vigitel, free and informed consent was obtained orally at the time of telephone contact with the interviewees. The present study was assessed and approved by the Research Ethics Committee of the School of Public Health of the Universidade de São Paulo under Report number 1,885,826 of January 5, 2017.

This article comes from the master’s dissertation of the author, Iolanda Karla Santana dos Santos. It is entitled Patterns of food consumption and physical activity based on data from VIGITEL and was presented to the Graduate Program in Nutrition in Public Health of the School of Public Health from the Universidade de São Paulo.

RESULTS

Table 1 shows the correlations between the patterns retained in the pooled 2007 to 2012 set and those retained for each year of the same period, with no rotation and with the Varimax and Promax rotations. In the analyses with no rotation, the components retained in the 2007 to 2012 set showed correlations greater than 0.90 with the patterns retained in each year separately. In the rotated analyses, only the first component showed correlations greater than 0.90 in all of the years.

Table 1.
Correlations between the non-rotated and rotated principal components (PC) retained for the set of years 2007-2012 and those retained in the analysis of each year. Surveillance System for Risk and Protection Factors for Chronic Diseases by Telephone Survey, 2007-2012.

DISCUSSION

Our results indicate that:

  • PCA can be used in time series data sets with the same sample structure;

  • depending on the purpose of the study, it is not advisable to use Varimax or Promax rotation after retaining the components.

In this study, with six years of monitoring and retention of patterns with eigenvalues > 1.0, the correlations between the patterns retained in the pooled set and those retained for each year, with no rotation, were greater than 0.90, showing high internal consistency. Among the patterns that were not retained in the comparative analyses, some pairs showed correlations below 0.90.

In an expanded analysis (data not shown; available from the authors on request), in which we included all of the years of monitoring, the correlations between some of the patterns extracted from the pooled set and the equivalent patterns extracted from the databases separated by year were less than 0.90. This lower association reflects changes in the consumption of diet components across the population, which are relevant to the interpretation of changes in dietary patterns. In this case, without analyzing the databases together, it would be impossible to interpret the changes in dietary patterns that occurred in the period.

CONCLUSION

PCA and other multivariate techniques contribute widely to time series studies, and their interpretation becomes more effective without the use of statistical adjustments such as vector rotation, which is more useful when pursuing other objectives.

References

  • 1
    Hu FB. Dietary pattern analysis: a new direction in nutritional epidemiology. Curr Opin Lipidol [Internet] 2002; 13(1): 3-9. Available from: http://journals.lww.com/co-lipidology/Fulltext/2002/02000/Dietary_pattern_analysis__a_new_direction_in.2.aspx
  • 2
    Brasil. Ministério da Saúde. VIGITEL Brasil 2012. Vigilância de fatores de risco e proteção para doenças crônicas por inquérito telefônico. Estimativas sobre frequência e distribuição sociodemográfica de fatores de risco e proteção para doenças crônicas nas capitais dos 26 estados brasileiros e no Distrito Federal em 2012. Brasília: Ministério da Saúde; 2013.
  • 3
    Olinto MTA. Padrões Alimentares. In: KAC G, SICHIERI R, GIGANTE DP, eds. Epidemiologia Nutricional. 20ª ed. Rio de Janeiro: Fiocruz/Atheneu; 2007. p. 213-25.
  • 4
    Lyra W da S, Silva EC da, Araújo MCU de, Fragoso WD, Veras G. Classificação periódica: um exemplo didático para ensinar análise de componentes principais. Quím Nova 2010; 33(7): 1594-7. https://doi.org/10.1590/S0100-40422010000700030
  • 5
    Jolliffe IT. Principal Component Analysis. 2nd ed. New York: Springer; 2002.

  • Financial support: none

Publication Dates

  • Publication in this collection
    08 July 2020
  • Date of issue
    2020

History

  • Received
    29 July 2019
  • Reviewed
    11 Nov 2019
  • Accepted
    18 Nov 2019
Associação Brasileira de Pós-Graduação em Saúde Coletiva, São Paulo - SP - Brazil
E-mail: revbrepi@usp.br