In the last two years, a period that feels like a lifetime to some of us, the COVID-19 pandemic has changed our world significantly and almost instantaneously. Hospitals, schools, businesses and countless other organizations across the globe, including our community here at BU, were forced to quickly find new ways to operate. As organizations turned to digital technologies and innovations for solutions, AI, with subfields such as machine learning and data science and neighboring fields like automation, came under the spotlight. After two years, let us take a look at the role AI played in the pandemic from various perspectives: from diagnosis and forecasting to monitoring and treatment, and from organizational and public health management to the ethical and human rights issues raised by AI applications. We should also examine how new AI technologies developed to cope with COVID-19 may affect the work environment in the future. In particular, we will focus on three main areas, each with its dominant AI approach: diagnostics and image-based AI, monitoring and text-based AI, and forecasting and model-based AI.
Image processing has been one of the most developed areas of machine learning. Its applications are ubiquitous: we have all seen an app that improves the quality of the photographs we take with our phones, or one that neatly separates selfies from holiday memories. We have witnessed swift progress in self-driving car technology, where on-the-go object recognition is a must. By the time we were introduced to SARS-CoV-2, several medical imaging tasks had already been tackled with machine learning; for example, radiologist-level pneumonia detection on chest x-rays with neural networks had been published in 2017 (Ref 1). Hence it is no surprise that when the pandemic hit, the AI community was ready to apply its image processing tools to COVID-19 diagnosis using radiology or genomic images. Pre-existing approaches were quickly adapted to detect signals of COVID-19, and many promising results were obtained that can aid diagnosis as well as prognosis.
One can easily expect that developments like these, which bolster AI’s success in image-based operations, have converted a few more skeptics among business leaders and convinced a few more workers that tomorrow, any job that depends on vision may rely on AI assistance. This AI assistance may cut both ways for society. It can make workspaces safer (for example, by automating helmet or mask detection) and make systems more fail-proof thanks to early detection (for example, by identifying a weak piece of equipment in factory operations or a deteriorating section of power grid lines before a snowstorm hits). In doing so, it would increase the productivity of workers who agree to work with AI, while also pushing businesses and workers who refuse to adapt, or cannot adapt, out of the market.
It is important to note that with COVID-19, what stood in the way of even faster progress in image-based diagnostics was mostly the availability of ‘labeled’ data from patients affected by SARS-CoV-2, rather than the capacity of the methods to process the images, or the time and know-how that it takes to build machine learning approaches. The medical community has since acknowledged that cross-field and cross-institution teamwork for the rapid acquisition of high-quality data is crucial to handling future health challenges successfully (Ref 2).
So we can say that just as AI changed our approach to COVID-19, COVID-19 will change our AI infrastructure, starting with faster and more reliable data systems.
Monitoring and Surveillance
An important victory for AI came in the area of monitoring, before the world knew about COVID-19: on December 31, 2019, an infectious disease intelligence company called “BlueDot”, founded by a University of Toronto researcher, alerted its customers to an outbreak of a new pneumonia-like disease, seven days earlier than the official alert from the US Centers for Disease Control and Prevention (CDC). The success of BlueDot and similar AI companies is based on natural language processing, a subfield of machine learning that identifies and analyzes patterns within unstructured text and language data. Specifically, BlueDot used real-time official health reports together with a vast number of unofficial reports on the internet (social media, health blogs, etc.) as well as air travel information, in multiple languages, to identify the name of a pathogen, its location and contextual data about its nature and spread.
Natural Language Processing (NLP) has long been of interest to the medical community due to the large volume of text records and the potential benefit of analyzing them to extract medical information. COVID-19 has significantly accelerated the progress of NLP applications in many areas where NLP, medicine and public health overlap. Pretrained, weakly supervised, self-supervised or unsupervised NLP methods that require much less data labelling, a crucial advantage in the midst of a pandemic, have garnered major attention among academics. Information retrieval by mining the scientific literature (Ref 3) or patient reports (Ref 4) has proven feasible. NLP of scientific literature has great potential in the task of finding treatments, a topic we will touch on later, while NLP of patient records can be instrumental in understanding changes in the disease as well as monitoring new variants before genetic sequencing data becomes available.
Other NLP approaches that are paramount to monitoring and surveillance concern public sentiment and the “infodemic”, the spread of misinformation. AI models such as COVID-Twitter-BERT (Ref 5) have been shown to analyze and classify public sentiment successfully, a technology that can help policy makers monitor the public’s response to health measures. Similarly successful NLP models built on databases such as SciFact (Ref 6) or Covid-HeRa (Ref 7) are trained to battle the infodemic, e.g. by classifying a claim as true or false, providing supporting or refuting evidence, or predicting the severity of the misinformation.
Finally, the discussion about what AI has done for COVID-19 monitoring and surveillance would not be complete without mentioning population-intensive techniques, for example massive-scale measurement of temperature in public spaces, or social distancing and travel monitoring based on mobile phone data. Unlike text on social media, where information is voluntarily shared and easy to anonymize, these population-scale techniques have understandably raised a high level of concern. Nevertheless, techniques using GPS to track population movement and determine proximity to someone who has been infected were adopted in some countries, such as Singapore and China. Alternative, privacy-sensitive approaches were taken in the US in partnership with companies such as Google and Apple (Ref 8). Both of these methods required robust processing of vast amounts of geospatial data from heterogeneous sources. They highlighted not only the technical challenges of such a task, but also many questions about the use of AI in population-scale applications. From the more obvious questions regarding privacy and the use of identifying data, to less covered ones, such as the impact on marginalized groups that do not have access to technology or have historically been over-surveilled, many issues are still subject to public debate.
Predicting the future of complex dynamic systems, whether it is the spread of an infectious disease, the demand for a product or movements in the financial markets, is of paramount interest to all modern-day organizations. Until the rise of machine learning, some of the most popular approaches relied on model building by experts: a simplified model, with at most a few tens of parameters, that captures the main factors of the dynamic system as conceived by the experts of the field. The relationship between these factors and the observables of the forecast is set such that the model reproduces some selected general trends of the available data. This kind of model building process is tightly connected to domain knowledge and expert insight. Once the model is built and its parameters are deduced by fitting to existing data, simulations of different scenarios can be used to make predictions. During COVID-19, many well-established epidemiological models were used to forecast the spread of the disease, and the results of simulations were used in planning and control by many organizations. Even the public has become more familiar with epidemiological models. In particular, one commonly used model parameter, the reproductive number R0, was brought up in many public discussions about the contagiousness of the disease.
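To make the expert-built approach concrete, here is a minimal sketch that assumes the textbook SIR (susceptible, infected, recovered) compartmental model. The article does not name a specific model, so the choice of SIR, the function names and the parameter values below are all illustrative assumptions. In this model, the transmission rate `beta` and the recovery rate `gamma` are the two expert-chosen parameters, and the reproductive number is their ratio, R0 = beta / gamma.

```python
def basic_reproduction_number(beta, gamma):
    """R0 for the SIR model: new infections caused per infected person."""
    return beta / gamma


def simulate_sir(beta, gamma, susceptible0, infected0, recovered0, days, dt=0.1):
    """Integrate the SIR equations with forward Euler steps.

    All compartments are population fractions. Returns a list with one
    (susceptible, infected, recovered) tuple per day, starting at day 0.
    """
    s, i, r = susceptible0, infected0, recovered0
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt   # S -> I
            new_recoveries = gamma * i * dt      # I -> R
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    beta, gamma = 0.3, 0.1
    history = simulate_sir(beta, gamma, 0.99, 0.01, 0.0, days=100)
    peak_day = max(range(len(history)), key=lambda d: history[d][1])
    print("R0:", basic_reproduction_number(beta, gamma))
    print("peak infected fraction:", history[peak_day][1], "on day", peak_day)
```

Running different scenarios then amounts to changing `beta` (for example, to mimic a public health measure) and comparing the simulated curves, which is the kind of manual, interpretable adjustment that such models allow.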
On the other hand, modern machine learning models are mostly domain agnostic and complex, and can have thousands or even millions of parameters, all serving the purpose of reproducing the existing data and making successful predictions, rather than modeling the dynamic system in a human-interpretable fashion or allowing insights to be drawn. Thanks to the increased number of parameters and freedom from a format that must allow human interpretation, ML models are significantly more flexible than classical models in describing complex systems, and have proved to be better forecasters for several use cases, e.g. in demand forecasting (Ref 9).
The disadvantage of ML-based forecasting models that became particularly apparent during the pandemic is twofold: ML requires a vast set of data to replace the expert intuition and heuristics needed to come up with a model that reproduces the real-world trend, and it is not robust against stark changes in the data. Both of these aspects of ML forecasting were an issue early in the pandemic. Data was scarce, and rapid changes due to public health measures or the emergence of variants meant that tomorrow could be very different from yesterday. ML models that did not provide easily adjustable, interpretable parameters had to wait for the shift to appear in the data in order to learn the new reality of the world after each major change.
Yet, as the pandemic continues, and data gathering and sharing frameworks are streamlined, the success of ML forecasting can be expected to improve, and its flexibility can be expected to better represent the real-world nuances in pandemic management (e.g. school closures, change of seasons and indoor activities), unlike a simplified, rigid model built on the main factors of the pandemic. Hence it is not surprising that today several ML-based models are finding their place in academic journals as well as in the forecast hubs of organizations such as the CDC.
What lies ahead appears to be the best of both worlds: model-embedded (also known as physics-informed) machine learning, in which the form of the classical model is embedded in the mathematical construct of the ML algorithm, often a neural network. There are multiple approaches to this unification of machine and expert wisdom. In some, the simulation results of the manual model are merely used as a guiding constraint while training the ML model, so that the resulting AI agrees with the expert where we know the expert predictions to be reasonably accurate, while still having some flexibility to make its own data-driven predictions in different scenarios. In other approaches, the embedding is tighter, for example, keeping the shape of the expert model as is but treating its parameters as ML-learnt functions. One such work has recently been published (Ref 10) and examines how different classical models can be expanded with this approach.
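The second, tighter form of embedding can be sketched as follows. This is only an illustration under stated assumptions, not the method of Ref 10: it assumes an SIR-type compartmental model as the expert model, replaces the constant transmission rate with a hypothetical smooth-step function `beta_fn` of time, and uses a plain grid search as a stand-in for the gradient-based training a neural network would use. All names and numbers are made up for the example.

```python
import math


def beta_fn(t, beta_start, beta_end, t_switch, width=3.0):
    """Hypothetical learnable transmission rate: a smooth step from
    beta_start to beta_end around day t_switch (e.g. after a new measure)."""
    weight = 1.0 / (1.0 + math.exp(-(t - t_switch) / width))
    return beta_start + (beta_end - beta_start) * weight


def simulate_infected(beta_of_t, gamma, s0, i0, days, dt=0.1):
    """The expert model's structure is kept as is; only beta becomes a
    function of time. Returns the infected fraction at the end of each day."""
    s, i = s0, i0
    steps_per_day = round(1 / dt)
    infected = []
    for day in range(days):
        for step in range(steps_per_day):
            t = day + step * dt
            new_infections = beta_of_t(t) * s * i * dt
            s -= new_infections
            i += new_infections - gamma * i * dt
        infected.append(i)
    return infected


def squared_error(predicted, observed):
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def fit_beta_end(observed, beta_start, t_switch, gamma, s0, i0, candidates):
    """Pick the beta_end candidate whose simulated curve best matches the data.
    A grid search stands in here for gradient-based training."""
    def loss(beta_end):
        curve = simulate_infected(
            lambda t: beta_fn(t, beta_start, beta_end, t_switch),
            gamma, s0, i0, days=len(observed))
        return squared_error(curve, observed)
    return min(candidates, key=loss)


if __name__ == "__main__":
    # Synthetic "observations" generated with a known beta_end, for illustration.
    observed = simulate_infected(
        lambda t: beta_fn(t, 0.3, 0.15, 20.0), 0.1, 0.99, 0.01, days=60)
    grid = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
    print("fitted beta_end:", fit_beta_end(observed, 0.3, 20.0, 0.1, 0.99, 0.01, grid))
```

The design point is that the data only has to teach the model how the transmission rate changes over time; the way infections and recoveries flow between compartments stays fixed by the expert model, so far less data is needed than for a fully domain-agnostic learner.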
In all these efforts of model-embedding, which is one of the fastest developing directions of ML, the central goal is to transfer some expert insight and domain knowledge to the domain-agnostic machine, hardwiring how the real world works into the algorithms, at the expense of limiting the machine’s unbounded imagination. After all, we know this strategy to have worked for nature’s own learning tools, i.e. neural networks such as the human brain. Despite all its marvelous range and flexibility, the structure of the human brain has been hardwired to function in this world, with this world’s constraints, and not everything is learnt through exposure to data and experiences. Hence it only makes sense to bring the same strategy to the neural networks that we build to understand the world, whether we are forecasting the spread of an infectious disease, forecasting possible genetic mutations that can occur in a pathogen, or looking for effective drug molecules that would bind to the correct receptors. All these phenomena occur within the real-world constraints that our expert models have intuitively accounted for. Embedding these constraints in the machines’ learning structure should increase their success and reduce their reliance on data, a much-desired property when dealing with sudden and novel phenomena such as a pandemic, or any other disruptive process, be it the emergence of a new viral marketing tool, a new product line or a financial crisis.
Above we have summarized the impact of AI on the COVID-19 pandemic and, inevitably, the impact of the pandemic on AI, within three main areas of pandemic management and the dominant AI technologies that fueled them. In the area of diagnostics, well-established image-based technologies were rapidly adapted to the new disease, slowed down initially only by the availability of data. Image-based AI is already used in many innovative ways, from facial recognition applications to self-driving vehicles, and recent developments during COVID-19 reinforce the idea that it will soon become a ubiquitous part of medicine as well.
In the area of pandemic monitoring and surveillance, we have described how the development of natural language processing (NLP) technologies accelerated to answer the call of the pandemic, starting early on with the first detection of the disease. The COVID-19 pandemic gave a significant boost to the use of AI on medical texts, as well as in public sentiment analysis and infodemic surveillance. We can expect that as analysis via NLP finds its place in medicine and public health management and as the relevant know-how increases, other recent successes of NLP, such as large-scale knowledge graph management and language generation, will also be applied within these fields. Considering the outsized footprint of text and language in human organizations, it is only a matter of time before these AI technologies find their way into businesses around the world, across languages and industries.
Finally, in the area of forecasting, we have reviewed how “classical” methods still emerged as the go-to tools, thanks to their manual adaptability by domain experts in times of rapid change, and to their interpretability, which played a significant role in pandemic collaboration and communication efforts. These desirable qualities of classical tools showed once again that it will be a long time until we can offload creative analytics tasks to machines. They also hinted that the future of ML (and of human intellectual work!) may lie in joining forces, as in physics-informed or model-embedded approaches, where human knowledge about the world is embedded into ML, reducing its reliance on data and increasing its efficiency in solving well-established real-world problems.
2. https://pubs.rsna.org/doi/10.1148/ryai.2021210011 and https://www.thelancet.com/journals/landig/article/PIIS2589-7500(20)30162-X/fulltext