Greg Fegan (I, *); Michael Moulsdale (I); Jim Todd (II)
I. Kenya Medical Research Institute, Centre for Geographic Medicine Research (Coast), PO Box 230, Kilifi, 80108, Kenya
II. National Institute of Medical Research, Tazama Project, Mwanza, United Republic of Tanzania
As professionals in data processing, analysis and information technology, we read with interest the Bulletin's coverage of the barriers to data sharing in public health.1 We contend that there are many existing solutions for low-cost, high-quality data collection, management and analysis. Many of these systems are built on open-source technologies and thus are more amenable to receiving input for their design and operation from information technologists and researchers in low-income countries. The information technology gap between rich and poor countries may not be as large as some may think. For example, a recent map of Facebook "friend" linkages shows areas of high internet connectivity in low-income countries; of particular interest to us is the connection into Rwanda from Mombasa, Kenya.2
In our institutions, after some consideration,3 we adopted a web-based, open-source clinical trials package called OpenClinica (Akaza Research, Waltham, MA, United States of America) for a large (n > 3000) multi-site trial (more information available at: http://www.feast-trial.org). For observational epidemiological and clinical studies that rely on, or are derived from, surveillance systems there are several free, easy-to-install systems such as REDCap,4 OpenMRS5 and OpenXdata. Indeed, there are novel technologies that have come from low-income countries that are now in use in high-income countries, e.g. those from the software developer Ushahidi.6 As Tom Smith of the Swiss Tropical Institute said at the Pan-African Malaria Conference in 2009 when introducing a presentation on a mobile phone-based system for malaria surveillance in Zanzibar, United Republic of Tanzania: "Africa is in the vanguard in the use of such technology." Indeed the Kenyan mobile phone-based money transfer system, M-Pesa, is highly regarded and has spread very quickly.7 Given the ubiquity of mobile phones, the use of developing technologies, such as lens-free microscopy,8 is likely to have a major impact soon on disease surveillance in low-income countries.
With such technologies comes the need for a tool to effectively analyse and disseminate the data generated. One such tool is the open-source software package, simply named R, arguably9 the most rapidly evolving and powerful data analytical engine with which the major statistics software packages can integrate. A most useful input from the World Health Organization and the Special Programme for Research and Training in Tropical Diseases has been sponsorship of the online R course at Thailand's Prince of Songkla University and publication of a free book. R's potential for data storage, use and analysis is well illustrated by Zack Almquist,10 who has created a freely available set of tools that interrogates data from the year 2000 census in the United States of America.
Pisani & AbouZahr discuss how collaboration allows researchers to stand on the shoulders of giants.1 We think this R package does just that. The field of genetics, which they cite as one that public health and epidemiological researchers should emulate, has been greatly assisted by R through the Bioconductor project. Robert Gentleman, who developed R with Ross Ihaka,11 is a notable contributor to this project.12
With powerful data programmes freely available, data can be shared more widely, but this also depends on skilled personnel. In east Africa alone, we need to train 100 new data managers and statisticians every year so that researchers, data managers and analysts can benefit from the wider availability of data and ensure its quality. Public health researchers need to be trained in good data collection methods, mentored by researchers and partnered with data analysis experts from developed countries. Systems are required that extend beyond research projects into the collection of routine data for public health systems that can be used to monitor the health of the whole population.
We think technology and shared data have the potential to radically transform the health systems of low-income countries within our lifetime. We believe that technology and data should be shared equitably across all countries and that everyone should be enabled to use the results from the acquired knowledge.
2. Smith D. Facebook's social network graph. R-Bloggers, 14 December 2010. Available from: http://www.r-bloggers.com/facebooks-social-network-graph/ [accessed 10 January 2011].
4. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap) - a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform 2009;42:377-81. doi:10.1016/j.jbi.2008.08.010 PMID:18929686
6. Ushahidi [corporate website]. Available from: http://ushahidi.com/ [accessed 10 January 2011].
7. Mas I, Radcliffe D. Mobile payments go viral: M-PESA in Kenya. Washington DC: The World Bank; 2010. Available from: http://www.microfinancegateway.org/gm/document-1.9.43376/Mobile%20Payments%20Go%20Viral_M-PESA%20in%20Kenya.pdf [accessed 10 January 2011].
9. Wilkinson L. The future of statistical computing. Technometrics 2008;50:418-35. doi:10.1198/004017008000000460
10. Almquist ZW. US Census spatial and demographic data in R: the UScensus2000 suite of packages. J Stat Softw 2010;37:1-31.
11. Vance A. Data analysts captivated by R's power. New York Times, 6 January 2009.
12. Gentleman R, Temple Lang D. Statistical analyses and reproducible research. J Comput Graph Statist 2007;16:1-23. doi:10.1198/106186007X178663