Reliability of screening tests for health-related problems among low-income elderly

Confiabilidade de testes de triagem para problemas de saúde de idosos pobres

Confiabilidad de las pruebas de detección de problemas de salud en ancianos pobres

Valéria Teresa Saraiva Lino, Margareth Crisóstomo Portela, Luiz Antônio Bastos Camacho, Nádia Cristina Pinheiro Rodrigues

Abstracts

Screening tests for health problems can identify elderly people who should undergo the Comprehensive Geriatric Assessment, enabling the planning of actions to prevent disability. The aim of this study was to analyze the inter-rater reliability (IRR) of self-assessment questions (SAQ) and performance tests (PT) recommended in Brazil in a sample of low-income elderly people. In this exploratory study, 165 elderly people were assessed by two professionals on different days. IRR was evaluated using the intraclass correlation coefficient (ICC) for continuous variables and the kappa statistic for categorical ones. The IRR for the PT (muscle strength, mobility, body mass index, vision) was excellent, with ICC values greater than 0.75. By contrast, the IRR for the SAQ (urinary incontinence, self-perceived health and hearing impairment) was intermediate. Only the fall-related item presented a good IRR. In this study, single SAQ had poor reliability compared with PT, suggesting that subjective self-assessment items with low reproducibility should be revised before implementation.

Triage; Self-Assessment; Aged


Testes de triagem de problemas de saúde podem identificar idosos aptos à avaliação geriátrica ampla, possibilitando o planejamento de intervenções, prevenindo-se o declínio funcional. O objetivo deste trabalho foi analisar a confiabilidade interaferidores (CI) de questões de autoavaliação (QAA) e testes de desempenho (TD) recomendados no Brasil, em uma amostra de idosos pobres. Realizou-se estudo exploratório em que dois profissionais aplicaram os testes em dias diferentes a 165 idosos. Utilizou-se o coeficiente de correlação intraclasse (CCI) e a estatística kappa para as variáveis contínuas e categóricas, respectivamente. O CCI para os TD (força, índice de massa corporal, mobilidade, visão) foi excelente (> 0,75). As QAA de incontinência urinária, autopercepção de saúde e deficiência auditiva apresentaram kappa com valores intermediários. Apenas a QAA relacionada à queda teve boa CI. Neste estudo, QAA tiveram baixa CI quando comparadas a TD, sugerindo que itens subjetivos com pouca reprodutibilidade deveriam ser revistos antes da sua implementação.

Triagem; Autoavaliação; Idoso


Las pruebas de detección de problemas de salud identifican a ancianos que deben someterse a una revisión médica completa en Geriatría, lo que permite la planificación de acciones de prevención de discapacidades. Se analizó la fiabilidad entre evaluadores (FEA) de preguntas de autoevaluación (PA) y pruebas de rendimiento (PR) que se recomiendan en Brasil, en un estudio exploratorio realizado con 165 ancianos de baja renta, evaluados por dos profesionales en diferentes días. La FEA se evaluó mediante el coeficiente de correlación intraclase (CCI) para las variables continuas y el índice kappa para las categóricas. La FEA para el PR (fuerza muscular, índice de masa corporal de movilidad, visión) era excelente y presentan CCI con valores superiores a 0,75. Por el contrario, la FEA para las PA (incontinencia urinaria, salud autopercibida y auditiva) fue intermedia. Las PA sobre caídas presentaron una buena FEA. La escasa fiabilidad de las PA sugiere la necesidad de una revisión de los elementos subjetivos de autoevaluación con baja reproducibilidad antes de la implementación.

Triaje; Autoevaluación; Anciano


Introduction

The increase in the proportion of elderly people in Brazil's population has contributed to the non-communicable disease epidemic, with concomitant increases in morbidity and mortality (1). The consequence of these demographic and epidemiological changes is an increase in the occurrence of disability. Older people are likely to have multiple conditions interacting in different ways, leading to difficulties in performing important tasks (2). The Comprehensive Geriatric Assessment (CGA), a time-demanding multidimensional diagnostic process, is aimed at detecting the biological, psychological and social disorders that affect this group (3), but because of the high demand for care it cannot be applied to all elderly people. The challenge is to establish a rapid examination that detects health problems, which could direct the application of the CGA and postpone disability.

In Brazil, the Ministry of Health recommends the use of the Rapid Multidimensional Evaluation of Elderly Persons in primary care, containing items related to multiple capabilities (4). However, most questions have not been validated, and some of them address matters whose appropriateness for a screening test is questionable.

This work is concerned with identifying rapid and reliable screening tests already applied in Brazil to select the elderly patients who should be referred for the CGA.

The reliability of measurements may vary depending on the context (5), differing among population subgroups. Furthermore, the more subjective the test, the greater the possibility of variation in the interpretation of the results by different observers (6).

Given the importance of psychometric and sociocultural considerations when administering an instrument, this study aimed to assess the inter-rater reliability (IRR) of health-related tests in a low-income elderly community.

Methods

Study and sample

This research was part of an exploratory study aimed at developing a strategy for rapid assessment of the elderly. It was performed at the primary healthcare unit of the Oswaldo Cruz Foundation (Fiocruz), in Manguinhos, in the city of Rio de Janeiro, Brazil. In Manguinhos, most houses have a single room, monthly family income is usually lower than the minimum wage, and more than 50% of residents have no more than an elementary school education (7).

The non-probabilistic sample was drawn from users aged 60 years or older who received care from the Family Health Team. Individuals with advanced cognitive or sensory deficits or impaired locomotion were excluded. The sample size was calculated using the prevalence of depression, estimated at 20% in the elderly, as a reference (8). A kappa coefficient of 0.6 with a 95% confidence interval was used to generate a conservative sample size of 180 individuals, estimated using the WinPepi application, version 2009 (http://www.brixtonhealth.com/pepi4windows.html).

Procedures

Data collection occurred from June to December 2010. Three health professionals were responsible for the standardization of the techniques used in the study during a three-hour meeting. Assessments were independently administered in two sessions. First, a geriatrician applied the CGA and, seven to 15 days later, either a psychomotor specialist or a social worker applied the tests whose IRR would be assessed. This interval was intended to avoid memory bias in favor of higher reliability and to avoid exhausting the patient after the long application of the CGA.

Both assessments included rapid performance tests (PT) and self-assessment questions (SAQ) recommended for screening of health problems in Brazil (9,10) (Figure 1). Self-perceived health (SPH) is a measure of health associated with disability in the elderly (11) and could indicate a need for referral to the CGA.

Figure 1
Performance tests (PT) and self-assessment questions (SAQ). * Grip strength was assessed with a Crown manual dynamometer (Oswaldo Filizola), using the criteria of Fried et al. (20) and taking the best result of two trials obtained with the dominant hand.

Statistical analysis

IRR was evaluated using the intraclass correlation coefficient (ICC) for continuous variables and the kappa statistic for categorical variables, adjusting for prevalence and asymmetries with the Prevalence-Adjusted Bias-Adjusted Kappa (PABAK) technique (12). The classification of IRR followed the recommendations of Landis and Koch: kappa values greater than 0.75 or below 0.40 represent excellent or poor agreement, respectively (13). Values between these levels denote intermediate to good agreement. We adopted the same rule for the ICC. SPSS version 13 (SPSS Inc., Chicago, U.S.A.) was used to perform statistical analyses.
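The agreement statistics described above can be illustrated with a minimal Python sketch. It is not the WinPepi or SPSS procedure used in the study: the function names and the ratings are hypothetical, and PABAK is computed with the standard formula (k x observed agreement - 1) / (k - 1) for k categories. The classification function simply applies the 0.40 and 0.75 cut-offs stated in the text.

```python
from collections import Counter


def observed_agreement(rater_a, rater_b):
    """Proportion of subjects on whom the two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(rater_a)
    p_obs = observed_agreement(rater_a, rater_b)
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal frequencies.
    p_exp = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (p_obs - p_exp) / (1 - p_exp)


def pabak(rater_a, rater_b, n_categories=2):
    """Prevalence-adjusted bias-adjusted kappa for n_categories categories."""
    p_obs = observed_agreement(rater_a, rater_b)
    return (n_categories * p_obs - 1) / (n_categories - 1)


def classify_agreement(value):
    """Apply the cut-offs stated in the text (above 0.75, below 0.40)."""
    if value > 0.75:
        return "excellent"
    if value < 0.40:
        return "poor"
    return "intermediate to good"


# Hypothetical yes/no answers (1 = problem reported) from two sessions.
session_1 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
session_2 = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

for name, value in (("kappa", cohen_kappa(session_1, session_2)),
                    ("PABAK", pabak(session_1, session_2))):
    print(f"{name}: {value:.3f} ({classify_agreement(value)})")
```

With these invented ratings the raters agree on 8 of 10 subjects, yet kappa falls in the poor range while PABAK falls in the intermediate range. This is the low-prevalence effect that the prevalence and bias adjustment is meant to correct.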

Study ethics

The Ethics Research Committee of the Sergio Arouca National Public Health School/Fiocruz (report number 126/10) approved the research. All participants signed an informed consent form, which pledges anonymity and confidentiality of the information.

Results

The first and second sessions evaluated 185 and 165 individuals, respectively. Five were excluded due to visual or cognitive impairment. No significant differences in sociodemographic characteristics were detected between the elderly who completed the study and those who did not. The majority of participants were female (73.0%) and there was a slight majority of single or widowed participants (Table 1).

Table 1
Sociodemographic data and health problems identified in the geriatric assessment (N = 180).

For the PT items, IRR was excellent (ICC > 0.75). By contrast, for the SAQ items IRR was intermediate (0.40 < kappa < 0.75), except for the fall-related item, for which IRR was good (Table 2).

Table 2
Reliability of screening items (N = 165).
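For continuous performance tests measured by two raters, an ICC can be obtained from a two-way analysis of variance. The sketch below is illustrative only: the article does not state which ICC form SPSS was asked to compute, so the choice of the two-way random-effects, absolute-agreement, single-measure form (often written ICC(2,1)) is an assumption, as are the function name and the grip-strength values.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: one row per subject, one column per rater.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need the same number (>= 2) of raters for every subject")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    rater_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects (rows), raters (columns) and residual error.
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand_mean) ** 2 for m in rater_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    return (ms_subjects - ms_error) / denominator


# Hypothetical grip strength (kg) recorded by two raters for five people.
grip_strength = [
    [22.0, 23.0],
    [30.5, 29.5],
    [18.0, 18.5],
    [26.0, 27.0],
    [35.0, 34.0],
]
print(f"ICC(2,1) = {icc_2_1(grip_strength):.3f}")
```

Because this form penalizes systematic differences between raters as well as random error, it is a conservative choice when the question is whether two professionals would obtain the same value for the same person.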

Discussion

The IRR of the PT items was excellent, unlike that of the measures assessed by the SAQ. Subjective measures can show low reliability despite adequate training of examiners. Sociocultural factors, psychological issues, memory lapses and lack of insight of informants lead to variation in how people communicate information about symptoms (14). In addition, the reliability of the information may be compromised when there is no socially acceptable environment in which to talk about intimate questions (15,16). The perception of "invasion of privacy" may lead respondents to admit a problem in a first interview but deny it in the second one (5).

All the above factors may have influenced our results. The SAQ are inherently subjective and susceptible to communication problems. The CGA is more likely to generate a context of confidence between the professional and the individual examined, allowing him or her to admit difficulties such as urinary incontinence. Additionally, it is important to underline that our study was conducted with elderly people with little or no formal education, among whom age and educational factors influence cognitive performance (11).

Screening for hearing impairment with a single question has been recommended in Brazil based on studies that evaluated the sensitivity and specificity of the single item (10,17), but only one study has examined its IRR, finding a kappa coefficient of 0.65 (18). In relation to the SPH, two studies have assessed only test-retest reliability; even without examining IRR, these studies identified discrepancies between assessments in racial and ethnic minorities, individuals with lower levels of education and the elderly (11,19). Regarding urinary incontinence, the social stigma related to the problem may have contributed to our results (15). Finally, falls lead to functional decline in the life of an elderly person, which may explain why they are reported more precisely, even by individuals with low levels of education. In our study, the fall-related item was the only subjective question with good IRR.

Our study has limitations. It is known that observers tend to change their approaches over time, so that periodic replication of the reliability assessment is necessary (5). No such replication was carried out here. Furthermore, we assessed elderly people with low levels of education in a primary care setting, and our results may be applicable only to similar populations.

Aging often results in insidious changes in functional capacity. Screening for health problems allows a large portion of the elderly to be examined, indicating those who should be referred for the CGA. Our study revealed that single SAQ addressing SPH, urinary incontinence, and hearing loss had poorer reliability than the PT in older adults. Although high item reproducibility does not guarantee high accuracy, subjective self-assessments with low reliability should be reviewed before implementation.

Acknowledgments

The authors wish to thank Maria José Barbosa Lima, Mônica Bastos de Lima Barros and Soraya Atiê for their contributions in the critical revision of the text prior to publication. We would also like to thank Flavio Henrique Lino for assistance in writing the article.

References

  • 1
    Schmidt MI, Duncan BB, Azevedo e Silva G, Menezes AM, Monteiro CA, Barreto SM, et al. Chronic non-communicable diseases in Brazil: burden and current challenges. Lancet 2011; 377:1949-61.
  • 2
    Sousa RM, Ferri CP, Acosta D, Guerra M, Huang Y, Jacob K, et al. The contribution of chronic diseases to the prevalence of dependence among older people in Latin America, China and India: a 10/66 Dementia Research Group population-based survey. BMC Geriatr 2010; 10:53.
  • 3
    Rubenstein LZ, Alessi CA, Josephson KR, Trinidad Hoyl M, Harker JO, Pietruszka FM. A randomized trial of a screening, case finding, and referral system for older veterans in primary care. J Am Geriatr Soc 2007; 55:166-74.
  • 4
    Ministério da Saúde. Avaliação multidimensional rápida da pessoa idosa. In: Departamento de Atenção Básica, Secretaria de Atenção à Saúde, Ministério da Saúde, organizador. Envelhecimento e saúde da pessoa idosa. Brasília: Ministério da Saúde; 2007. p. 48-9.
  • 5
    Hasselmann MH, Lopes CS, Reichenheim ME. Measurement reliability in a study on family violence and severe acute malnutrition. Rev Saúde Pública 1998; 32:437-46.
  • 6
    Gordis L. Epidemiologia. 4a Ed. Rio de Janeiro: Revinter; 2010.
  • 7
    Carvalho MAP, Pivetta F. The integrated territory of health care in Manguinhos: we are all apprentices. Rio de Janeiro: Fundação Oswaldo Cruz; 2012.
  • 8
    Li C, Friedman B, Conwell Y, Fiscella K. Validity of the Patient Health Questionnaire 2 (PHQ-2) in identifying major depression in older people. J Am Geriatr Soc 2007; 55:596-602.
  • 9
    Pereira LSM, Gomes G. Avaliação funcional. In: Guimarães RM, Cunha UGV, editors. Sinais e sintomas em geriatria. São Paulo: Atheneu; 2004. p. 17-30.
  • 10
    Valete-Rosalino CM, Rozenfeld S. Auditory screening in the elderly: comparison between self-report and audiometry. Braz J Otorhinolaryngol 2005; 71:193-200.
  • 11
    Zajacova A, Dowd JB. Reliability of self-rated health in US adults. Am J Epidemiol 2011; 174:977-83.
  • 12
    Abramson J. WINPEPI (PEPI-for-Windows): computer programs for epidemiologists. Epidemiol Perspect Innov 2009; 1:6.
  • 13
    Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977; 33:159-74.
  • 14
    Shrout PE. Measurement reliability and agreement in psychiatry. Stat Methods Med Res 1998; 7:301-17.
  • 15
    Shaw C, Tansey R, Jackson C, Hyde C, Allan R. Barriers to help seeking in people with urinary symptoms. Fam Pract 2001; 18:48-52.
  • 16
    Kraemer HC. Measurement of reliability for categorical data in medical research. Stat Methods Med Res 1992; 1:183-99.
  • 17
    Nondahl DM, Cruickshanks KJ, Wiley TL, Tweed TS, Klein R, Klein BEK. Accuracy of self-reported hearing loss. Audiology 1998; 37:295-301.
  • 18
    Tomioka K, Ikeda H, Hanaie K, Morikawa M, Iwamoto J, Okamoto N, et al. The Hearing Handicap Inventory for Elderly-Screening (HHIE-S) versus a single question: reliability, validity, and relations with quality of life measures in the elderly community, Japan. Qual Life Res 2013; 22:1151-9.
  • 19
    Crossley TF, Kennedy S. The reliability of self-assessed health status. J Health Econ 2002; 21:643-58.
  • 20
    Fried LP, Tangen CM, Walston J, Newman AB, Hirsch C, Gottdiener J, et al. Frailty in older adults: evidence for a phenotype. J Gerontol A Biol Sci Med Sci 2001; 56:M146-56.

Publication Dates

  • Publication in this collection
    Dec 2014

History

  • Received
    14 Feb 2014
  • Reviewed
    01 Sept 2014
  • Accepted
    22 Sept 2014
Escola Nacional de Saúde Pública Sergio Arouca, Fundação Oswaldo Cruz Rio de Janeiro - RJ - Brazil
E-mail: cadernos@ensp.fiocruz.br