Artificial Intelligence and emergencies
An automated approach to extract information from the literature to tackle pathogen contamination events.
In a pathogen contamination event, protecting public health should be the top priority of every emergency response mechanism. Immediate control of such incidents is essential because the time available to act is usually short. Yet research and innovation are slow to penetrate into the protocols and tools used for waterborne pathogen contamination events, as seen in the current COVID-19 pandemic, which has raised concerns and questions about the level of preparedness and the risks to public health.
With the fast-growing rate of scientific publications, relevant information is often buried under the large volume of journal articles. In the field of environmental microbiology, although knowledge and research are abundant in the literature, scientists and experts cannot constantly stay informed of the latest developments. That is why the responsible authorities have expressed the need for a holistic approach to handling waterborne pathogen contamination events through immediate access to up-to-date, science-based information. For this purpose, the EU-funded PathoCERT project was launched with the aim of increasing the coordination capability of the responsible authorities. One of the sub-objectives of the project is to develop an Artificial Intelligence (AI) system that extracts information on pathogen characteristics from scientific publications. This automated approach will help the authorities to quickly obtain important information, enabling them to assess the health risk of a pathogen contamination event and identify potential control actions.
The objective of this research is twofold. Firstly, we want to determine whether it is feasible to extract both qualitative and quantitative information about a waterborne pathogen (Legionella) from scientific publications using Machine Learning (ML) and Text Mining (TM) techniques. Secondly, we want to assess the quality of the extracted information. Legionella was selected because it is a well-known waterborne pathogen that is frequently associated with outbreaks. A Proof of Concept (POC) was used to determine whether an Information Extraction (IE) task (a principal subfield of TM) can extract Information Keywords (IK) such as “Incubation period”, “Source of exposure”, “Route of transmission”, “Symptoms”, “Clinical manifestation”, “Species”, and “Environmental habitat” from scientific publications.
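To make the idea of extracting qualitative and quantitative Information Keywords concrete, the following is a minimal, rule-based sketch. It is not the ML/TM system described in this abstract: the regular expression, the small term lists, the function name `extract_keywords`, and the example sentence are all assumptions chosen purely for illustration.

```python
import re

# Hypothetical pattern for a quantitative keyword: a numeric range followed
# by a time unit, appearing in the same sentence as "incubation period".
INCUBATION_RANGE = re.compile(
    r"incubation period[^.]*?(\d+(?:\.\d+)?)\s*(?:-|–|to)\s*(\d+(?:\.\d+)?)\s*"
    r"(hours?|days?|weeks?)",
    re.IGNORECASE,
)

# Hypothetical term lists for two qualitative keywords (illustrative only).
QUALITATIVE_TERMS = {
    "Route of transmission": ["inhalation", "aspiration"],
    "Symptoms": ["fever", "cough", "headache"],
}


def extract_keywords(text):
    """Return a dict mapping each Information Keyword found to its value(s)."""
    found = {}

    # Quantitative information: lower bound, upper bound, and unit.
    match = INCUBATION_RANGE.search(text)
    if match:
        low, high, unit = match.groups()
        found["Incubation period"] = (float(low), float(high), unit.lower())

    # Qualitative information: whole-word dictionary lookup.
    lowered = text.lower()
    for keyword, terms in QUALITATIVE_TERMS.items():
        hits = [t for t in terms if re.search(r"\b" + re.escape(t) + r"\b", lowered)]
        if hits:
            found[keyword] = hits

    return found


# Illustrative input sentence (not taken from the study's corpus).
text = (
    "The incubation period of Legionnaires' disease is usually 2 to 10 days. "
    "Infection occurs mainly through inhalation of contaminated aerosols, "
    "and patients often present with fever and cough."
)
info = extract_keywords(text)
print(info)
```

A rule-based approach like this is brittle with respect to phrasing, which is one reason a learned extraction model may be preferable for a large and varied body of literature.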
The POC suggested that the system effectively extracted the desired IK. The POC was evaluated using the metrics of precision, recall, and F1-score, which returned scores of 0.91, 0.80, and 0.85, respectively. These high overall scores indicated that the system captures and predicts the IK correctly in most cases. A comparison of the system’s performance with manual extraction of information from 10 new scientific publications substantiated this conclusion: similar results were observed, indicating that the quality of the extracted information is adequate.
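The three reported scores are consistent with precision, recall, and their harmonic mean (the F1-score). The short sketch below shows the standard definitions and checks that arithmetic. The function names and the example counts are assumptions for illustration; the study's underlying counts are not given here.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Check the reported figures: precision 0.91 and recall 0.80.
reported_f1 = f1_score(0.91, 0.80)
print(round(reported_f1, 2))

# Hypothetical counts, only to show how the metrics are derived.
p, r = precision_recall(tp=8, fp=2, fn=2)
print(p, r)
```

Because the F1-score is a harmonic mean, it sits closer to the lower of the two values, so the lower recall pulls the combined score down relative to the precision.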
Overall, the proposed system showed that AI can reliably extract both qualitative and quantitative information keywords about Legionella from the scientific literature. Our study paves the way for a better understanding of the processes, specifics, and boundary conditions of the desired information. It is considered a first step towards extracting information on waterborne pathogens that can support experts' decision-making during an emergency event.
Source: Paraskevopoulos, S. (KWR Water Research Institute, The Netherlands). 2021. Artificial Intelligence and emergencies: An automated approach to extract information from the literature to tackle pathogen contamination events. Data-driven pathways to approach water-related disasters and challenges, Risk & Resilience. AIWW 2021.
In-person Conference