Classification and regression trees for predicting the risk of a negative test result for tuberculosis infection in Brazilian healthcare workers: a cross-sectional study

Bibliographic Details
Published in	Revista brasileira de epidemiologia Vol. 24; p. e210035
Main Authors	Souza, Fernanda Mattos; Prado, Thiago Nascimento do; Werneck, Guilherme Loureiro; Luiz, Ronir Raggio; Maciel, Ethel Leonor Noia; Faerstein, Eduardo; Trajman, Anete
Format	Journal Article
Language	English, Portuguese
Published	Associação Brasileira de Saúde Coletiva / Associação Brasileira de Pós-Graduação em Saúde Coletiva, 2021
Subjects	Decision trees; Latent tuberculosis; Machine learning; Occupational risks; Public, environmental & occupational health
Summary:	Objectives: Healthcare workers (HCWs) are at high risk of acquiring tuberculosis infection (TBI), but annual testing is resource-intensive. We aimed to develop a predictive model to identify which HCWs are best targeted for TBI screening, by predicting who is most likely to test negative on two TBI tests. Methodology: We conducted a secondary analysis of previously published results from 708 HCWs working in primary care services in five Brazilian state capitals who underwent two TBI tests: the tuberculin skin test and the Quantiferon®-TB Gold in-tube test. We used a classification and regression tree (CART) model to predict which HCWs would have negative results on both tests. Model performance was evaluated with the receiver operating characteristic (ROC) curve and the area under the curve (AUC), and the model was cross-validated on the same dataset. Results: Of the 708 HCWs, 247 (34.9%) had negative results on both tests. Among HCWs who had worked in primary care for less than 5.5 years, CART identified physicians and community health agents as twice as likely to be uninfected (probability = 0.60) as registered nurses and nursing technicians/assistants (probability = 0.28). In cross-validation, predictive accuracy was 68% (95% confidence interval [95%CI]: 65-71), AUC was 62% (95%CI: 58-66), specificity was 78% (95%CI: 74-81), and sensitivity was 44% (95%CI: 38-50). Conclusion: Despite the model's low predictive power, CART made it possible to identify subgroups with a higher probability of testing negative on both tests. Including additional information related to TBI risk may help build a model with greater predictive power using the same CART technique.
ISSN:	1415-790X, 1980-5497
DOI:	10.1590/1980-549720210035
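
The figures quoted in the Summary can be checked with simple arithmetic. The sketch below is only an illustration, not the authors' analysis: it recomputes the both-negative proportion, the ratio between the two subgroup probabilities, and the overall accuracy implied by the reported sensitivity and specificity. It assumes that "both tests negative" is the class whose sensitivity is reported, and the function name `implied_accuracy` is hypothetical.

```python
# Arithmetic check of figures quoted in the Summary (illustrative only).
# Assumption: "both tests negative" is the class the model treats as positive,
# so sensitivity refers to correctly identified both-negative HCWs.

def implied_accuracy(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Overall accuracy implied by sensitivity, specificity, and class prevalence."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)

n_total = 708          # HCWs in the secondary analysis
n_both_negative = 247  # HCWs with negative results on both tests

# Share of HCWs negative on both tests (reported as 34.9%).
prevalence = n_both_negative / n_total

# Physicians / community health agents (0.60) versus nursing staff (0.28),
# both with less than 5.5 years in primary care.
probability_ratio = 0.60 / 0.28

# Accuracy implied by the reported sensitivity (44%) and specificity (78%).
accuracy = implied_accuracy(0.44, 0.78, prevalence)

print(f"both-negative proportion: {prevalence:.3f}")
print(f"probability ratio: {probability_ratio:.2f}")
print(f"implied accuracy: {accuracy:.3f}")
```

Under these assumptions the implied accuracy is about 66%, which falls inside the reported 95%CI of 65-71 for accuracy. The gap from the reported 68% may reflect rounding of the published percentages or the way the cross-validation results were aggregated; the record does not say which.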