TY - JOUR
T1 - Automating COVID-19 epidemiological situation reports based on multiple data sources, the Netherlands, 2020 to 2023
AU - de Oliveira Bressane Lima, Priscila
AU - van de Kassteele, Jan
AU - Schipper, Maarten
AU - Smorenburg, Naomi
AU - van Rooijen, Martijn S.
AU - Heijne, Janneke
AU - van Gaalen, Rolina D.
N1 - Publisher Copyright: © 2024
PY - 2024/12/1
Y1 - 2024/12/1
N2 - Background: During the COVID-19 pandemic, the National Institute for Public Health and the Environment (RIVM) in the Netherlands developed a pipeline of scripts to automate and streamline the production of epidemiological situation reports (epi‑sitreps). The pipeline was developed for the Automation of Data Import, Summarization, and Communication (hereafter called the A-DISC pipeline). Objective: This paper describes the A-DISC pipeline and provides a customizable script template that may be useful for other countries wanting to automate their infectious disease surveillance processes. Methods: The A-DISC pipeline was developed using the open-source statistical software R. It is organized in four modules: Prepare, Process data, Produce report, and Communicate. The Prepare scripts set up the working environment (e.g., load packages). The (data-specific) Process data scripts import, validate, verify, transform, save, analyze, and summarize data as tables and figures and store these data summaries. The Produce report scripts gather summaries from multiple data sources and integrate them into an R Markdown document, the epi‑sitrep. The Communicate scripts send e-mails containing the epi‑sitrep to stakeholders. Results: As of March 2023, up to ten data sources were automatically summarized into tables and figures by A-DISC. These data summaries were featured in routine extensive COVID-19 epi‑sitreps, shared as open data, plotted on RIVM's website, sent to stakeholders, and submitted to the European Centre for Disease Prevention and Control via The European Surveillance System (TESSy) [38]. Discussion: In the face of an unprecedentedly high number of cases being reported during the COVID-19 pandemic, the A-DISC pipeline was essential to produce frequent and comprehensive epi‑sitreps. A-DISC's modular and intuitive structure allowed for the integration of data sources of varying complexity, encouraged collaboration among people with varying R-scripting capabilities, and improved data lineage. The A-DISC pipeline remains under active development and is currently being used in modified form for the automation and professionalization of various other disease surveillance processes at the RIVM, with high acceptance among the participating epidemiologists. Conclusion: The A-DISC pipeline is an open-source, robust, and customizable tool for automating epi‑sitreps based on multiple data sources.
KW - Automated infectious diseases monitoring
KW - Automated infectious diseases surveillance
KW - COVID-19
KW - Epidemic response
KW - Epidemiological situation reports
KW - R scripts pipeline
UR - https://www.scopus.com/pages/publications/85204963128
U2 - 10.1016/j.cmpb.2024.108436
DO - 10.1016/j.cmpb.2024.108436
M3 - Article
C2 - 39342878
SN - 0169-2607
VL - 257
JO - Computer Methods and Programs in Biomedicine
JF - Computer Methods and Programs in Biomedicine
M1 - 108436
ER -