Development and external validation of a logistic regression derived algorithm to estimate a 12-month post open defecation free slippage risk

Keywords: Development and validation prognostic model, Risk algorithm, CLTS, ODF slippage risk, WASH sustainability, toilets


Appropriate open defecation free (ODF) sustainability interventions are key to further mobilising communities to consume sanitation and hygiene products and services that enhance households’ quality of life and embed household behavioural change for healthier communities. This study aims to develop a logistic regression derived risk algorithm to estimate 12-month ODF slippage risk and to externally validate the model in an independent data set. ODF slippage occurs when one or more toilet adequacy parameters are no longer present for one or more toilets in a community. Data from the Zambia district health information software for water, sanitation and hygiene management information system for Chungu and Chabula chiefdoms were used for the study. The data were retrieved from the date of Chief Chungu's and Chief Chabula's chiefdoms' attainment of ODF status in October 2016 for 12 months, until September 2017, for the development and validation data sets respectively. Data were assumed to be missing completely at random, and a complete case analysis approach was used. The events per variable were satisfactory for both the development and validation data sets. Multivariable regression with a backwards selection procedure was used to select candidate predictor variables, with p < 0.05 meriting inclusion. To correct for optimism, the study estimated the amount of heuristic shrinkage by comparing the model’s apparent C-statistic to the C-statistic computed by nonparametric bootstrap resampling. In the resulting model, increases in the covariates ‘months after ODF attainment’, ‘village population’ and ‘latrine built after CLTS’ were all associated with a higher probability of ODF slippage. Conversely, an increase in the covariate ‘presence of a handwashing station with soap’ was associated with a reduced probability of ODF slippage. The predictive performance of the model was improved by the heuristic shrinkage factor of 0.988.
The external validation confirmed good predictive performance, with an area under the receiver operating characteristic curve of 0.85 and no significant lack of fit (Hosmer-Lemeshow test: p = 0.246). The results must be interpreted with caution in regions where the ODF definitions, culture and other factors differ from those assumed in the study.
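
As a rough illustration of two ideas named above, the sketch below applies a uniform shrinkage factor to the slopes of a logistic risk model and computes a C-statistic by pairwise comparison. The intercept, coefficients and records are hypothetical placeholders and are not the study's estimates; only the shrinkage factor of 0.988 and the direction of each covariate's effect are taken from the text. The intercept is held fixed here for simplicity, although in practice it is usually re-estimated after shrinkage.

```python
import math


def predicted_risk(intercept, coefs, x, shrinkage=1.0):
    """Logistic risk with every slope multiplied by a uniform shrinkage factor."""
    linear_predictor = intercept + shrinkage * sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def c_statistic(outcomes, risks):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted risk; tied pairs count as one half."""
    events = [r for y, r in zip(outcomes, risks) if y == 1]
    non_events = [r for y, r in zip(outcomes, risks) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(non_events))


# Hypothetical values for illustration only (NOT the study's estimates).
# Covariate order: months after ODF attainment, village population,
# latrine built after CLTS (0/1), handwashing station with soap (0/1).
INTERCEPT = -2.0
COEFS = [0.10, 0.002, 0.50, -0.80]

# Hypothetical records: (covariates, observed slippage 0/1).
records = [
    ([2, 150, 0, 1], 0),
    ([10, 400, 1, 0], 1),
    ([6, 250, 1, 1], 0),
    ([12, 300, 0, 0], 1),
]

outcomes = [y for _, y in records]
risks = [predicted_risk(INTERCEPT, COEFS, x, shrinkage=0.988) for x, _ in records]
unshrunk = [predicted_risk(INTERCEPT, COEFS, x) for x, _ in records]
auc = c_statistic(outcomes, risks)
```

A shrinkage factor this close to 1 changes the predicted risks only slightly. Because uniform shrinkage preserves the ranking of the linear predictor, it leaves the C-statistic unchanged and acts mainly on calibration.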

Author Biography

Warren Mukelabai Simangolwa, SNV Netherlands Development Organisation
Warren M.S., co-author of the chapter “CLTS and sanitation marketing: aspects to consider for a better integrated approach” in the book Sustainable Sanitation for All: Experiences, Challenges, and Innovations, is a seasoned researcher and practitioner in innovative WASH systems strengthening design, rural, urban and peri-urban faecal sludge management value chains, supply chains, inclusive business, sanitation marketing, inclusive technology development, governance and policy, infrastructure, emergency and fragile regions, CLTS for rural and peri-urban settings, and last-mile approaches.

