2020 has been a year unlike any other. From the initial outbreak of COVID-19, to the effects of the pandemic sweeping across the world, to record-breaking wildfires blazing across the western U.S., major unanticipated large-scale events have defined the year. As those in the field of event detection and AI know, large-scale unexpected events are challenging even for the most advanced AI systems. Delivering the world’s leading event and information discovery capability to Dataminr’s clients during COVID-19 has served as a great example of how to address this challenge. During the pandemic, we have leveraged advanced AI approaches to rapidly adapt to an unprecedented, shifting global information landscape, providing a case study of how to effectively navigate large-scale unexpected event cascades in real time.
Dataminr delivered its first COVID-19 alerts to customers on Dec. 30, 2019, based on on-the-ground eyewitness public social media posts from Wuhan, China. The very first alert, delivered at 9:11 a.m. that day, provided the earliest digital warning, seven days in advance of the first U.S. government announcement. Dataminr’s initial alert on COVID-19 was created purely by Artificial Intelligence. This serves as an example of our AI Platform’s effective use of our 10-year data archive of all events that have occurred in the world and how those events were recorded in public digital data. In this case, our system was able to detect the initial outbreak because, to our AI models, the initial pattern resembled that of previous outbreaks, such as the Ebola outbreak in West Africa, the Zika outbreak in South America, and the H1N1 outbreak in Southeast Asia. Entities, past events, and relationships represented in our knowledge graph played a critical role in surfacing these early alerts that “Another SARS is spreading quietly in Hankou Wuhan” (as an on-the-ground eyewitness in Wuhan initially described it in Chinese).
However, in spite of our AI system’s ability to perform well at detecting the initial indications of COVID-19, we soon faced the fundamental challenge of tracking an outbreak at unprecedented speed and scale—one with global impact and meaning far beyond anything we’ve recorded in the past. When the historical data used to train detection models diverges most from the state of affairs of the present day, AI systems can often struggle to interpret what they are seeing.
Since Dataminr’s earliest alert on COVID-19 on Dec. 30, 2019, we’ve detected and alerted on millions of events across the globe, many of which can be traced back directly or indirectly to the cascade of events caused by the pandemic. These events have been wide-ranging, from product shortages, to pockets of outbreaks, to government policy changes. There were even signs of changes in the Earth’s seismic noise and air quality. Many events that have occurred since Dec. 30, 2019, take on meaning not just because they happened, but because they happened in the context of the pandemic.
At Dataminr, we have long researched how to model and characterize event cascades, particularly in large-scale emergencies, and deployed AI methods that allow our AI platform to adapt effectively. For example, in a paper we published in 2019 (Workshop on AI for Social Good at the 36th International Conference on Machine Learning, ICML), we studied the automatic detection of hypotheses for sub-event cascades. While we focused on wildfires at the time, the methodology generalizes well to many other geographically dispersed, large-scale cascades of related events. The relevance of such efforts has been highlighted during one of the worst fire seasons on the West Coast of the United States. And, in today’s context, the wildfires and many of the events they have caused (ranging from air quality issues to evacuations) are part of a large cascade of events that merges with the COVID-19 pandemic cascade.
When Dataminr’s AI platform detects events and delivers the right alerts to our clients, it must ingest thousands of “noisy” data points per second, processing and identifying relevant information from different sources, in different formats, and with varying levels of detail in real time, so that the right end users can be alerted in the right way as the event happens. Today, Dataminr’s AI platform ingests and processes 75,000+ distinct public data sources that include text in different formats and languages, images, video, audio, and machine-generated data streams.
One of the technologies our AI platform uses to identify relevant events is a Knowledge Graph (KG). A Knowledge Graph, in its simplest form, is a collection of things and relationships: “nodes” and “edges.” The use of Knowledge Graphs is on the rise, from companies such as Google, Microsoft, Airbnb, and Netflix to agencies such as the NSF and NIH, and they can be applied in any industry, from healthcare, to finance, to applications as far-reaching as disease tracking and crisis response. In part, the resurgence of Knowledge Graphs is meant to complement advances in, and address the limitations of, Deep Learning Neural Networks, which rely on large quantities of labeled data. At Dataminr, we apply numerous Deep Learning Neural Networks to the large historical archive of events we have accumulated over the course of our ten-year history. For Knowledge Graphs, a new paradigm of knowledge-infused mining and learning accounts for pieces of knowledge that accrue from domain expertise and guidance from models from physics and social science, among others, allowing us to expand to promising techniques in Graph Neural Networks (GNNs).
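To make the “nodes and edges” idea concrete, here is a minimal Python sketch of a Knowledge Graph stored as (subject, relation, object) triples. It is an illustration only, not Dataminr’s production system: the `KnowledgeGraph` class, the relation names, and the example triples are all hypothetical, chosen to echo the Wuhan example earlier in this post.

```python
from collections import defaultdict


class KnowledgeGraph:
    """A tiny in-memory graph of (subject, relation, object) triples."""

    def __init__(self):
        # Maps each node to the set of (relation, other_node) edges touching it.
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj):
        self._edges[subject].add((relation, obj))
        # Also store the reverse edge so the graph can be walked both ways.
        self._edges[obj].add(("inverse:" + relation, subject))

    def neighbors(self, node, relation=None):
        """Sorted neighbors of `node`, optionally restricted to one relation."""
        return sorted(
            {other for rel, other in self._edges.get(node, ())
             if relation is None or rel == relation}
        )

    def related(self, node, max_hops=2):
        """Breadth-first search: all nodes within `max_hops` edges of `node`."""
        seen = {node}
        frontier = [node]
        for _ in range(max_hops):
            next_frontier = []
            for current in frontier:
                for neighbor in self.neighbors(current):
                    if neighbor not in seen:
                        seen.add(neighbor)
                        next_frontier.append(neighbor)
            frontier = next_frontier
        seen.discard(node)
        return sorted(seen)


# Hypothetical triples, for illustration only.
kg = KnowledgeGraph()
kg.add("Hankou", "district_of", "Wuhan")
kg.add("Wuhan", "located_in", "China")
kg.add("eyewitness post", "posted_from", "Hankou")
kg.add("eyewitness post", "mentions", "SARS")
kg.add("SARS", "is_a", "past outbreak")

print(kg.neighbors("Hankou", "district_of"))
print(kg.related("eyewitness post", max_hops=2))
```

Walking two hops out from the post node reaches both a place (Wuhan) and a past event type (a past outbreak), which is the kind of context a graph can attach to a single new signal.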
Graph Neural Networks are Deep Learning-based methods that operate on the graph domain. The basic idea is that in GNNs you have graphs that represent knowledge as nodes and edges, on which you can perform tasks such as node classification, link prediction, and clustering. With Knowledge Graph structures, it’s also possible to “reason” about possible outcomes and find relationships between elements in the graph. For example, techniques that use KGs and GNNs help us identify how events are connected (link prediction), detect new events (node classification), and group related events (clustering) in the context of the new world and information landscape during COVID-19.
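The core mechanic behind most GNNs is message passing: each node updates its representation by combining it with those of its neighbors. The sketch below shows a single, heavily simplified round of that idea in plain Python, followed by a dot-product link score. It is an assumption-laden toy rather than a description of Dataminr’s models: a real GNN adds learned weight matrices and nonlinearities, and the event names, two-dimensional features, and function names here are invented for illustration.

```python
def mean_aggregate(features, edges):
    """One message-passing round: average each node's vector with its neighbors'."""
    # Every node starts with itself in its group (a self-loop).
    groups = {node: [node] for node in features}
    for a, b in edges:  # edges are treated as undirected
        groups[a].append(b)
        groups[b].append(a)
    updated = {}
    for node, group in groups.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[member][i] for member in group) / len(group)
            for i in range(dim)
        ]
    return updated


def link_score(embeddings, a, b):
    """Dot product of two node vectors; a higher score suggests a likelier link."""
    return sum(x * y for x, y in zip(embeddings[a], embeddings[b]))


# Hypothetical features: [health-related, supply-related].
raw = {
    "outbreak report": [1.0, 0.0],
    "hospital surge": [1.0, 0.0],
    "mask shortage": [0.0, 1.0],
    "factory closure": [0.0, 1.0],
}
# Hypothetical known links between events.
edges = [
    ("outbreak report", "hospital surge"),
    ("hospital surge", "mask shortage"),
    ("mask shortage", "factory closure"),
]

embeddings = mean_aggregate(raw, edges)
for candidate in ("mask shortage", "factory closure"):
    print(candidate, round(link_score(embeddings, "outbreak report", candidate), 3))
```

On the raw features, the outbreak report looks unrelated to both supply events. After one round of aggregation, the mask shortage picks up some health-related signal from its neighbor, so it scores higher as a candidate link than the factory closure, which sits one hop further away.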
Another AI approach that Dataminr is taking significant advantage of during COVID-19 is our proprietary online human-AI feedback loop, which reacts to events that don’t follow historical patterns. Dataminr’s highly talented team of Domain Experts, operating in a real-time human-AI feedback loop, constantly labels a set of real-time events as they unfold. Our team members, with domain expertise across a variety of backgrounds, assign labels and look for breakout events. Our AI systems dynamically learn each time a new label is assigned by a Domain Expert, affecting the numerous event detection algorithms that run fully autonomously. Through this process, human knowledge is captured in our platform in ways that include a Knowledge Graph, and our AI models continuously adapt and learn.
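The general pattern of a model that learns each time a new label arrives is known as online learning. Dataminr’s feedback loop is proprietary and not described in detail here, so the following is only a generic sketch using a classic perceptron as a stand-in: the model adjusts its weights immediately whenever an expert label contradicts its current prediction. The class name, tokens, and labels are all hypothetical.

```python
class OnlineEventClassifier:
    """A perceptron over bag-of-words tokens, updated one label at a time."""

    def __init__(self, learning_rate=1.0):
        self.weights = {}
        self.bias = 0.0
        self.learning_rate = learning_rate

    def score(self, tokens):
        return self.bias + sum(self.weights.get(token, 0.0) for token in tokens)

    def predict(self, tokens):
        return self.score(tokens) > 0

    def learn(self, tokens, is_relevant):
        """Apply one expert label; returns True if the model had to change."""
        if self.predict(tokens) == is_relevant:
            return False  # already correct, nothing to update
        step = self.learning_rate if is_relevant else -self.learning_rate
        for token in tokens:
            self.weights[token] = self.weights.get(token, 0.0) + step
        self.bias += step
        return True


# Hypothetical stream of expert-labeled events: (tokens, is_relevant).
labeled_stream = [
    (["store", "shelves", "empty", "sanitizer"], True),
    (["concert", "tickets", "sale"], False),
    (["sanitizer", "sold", "out"], True),
]

clf = OnlineEventClassifier()
for tokens, is_relevant in labeled_stream:
    changed = clf.learn(tokens, is_relevant)
    print(tokens, "-> model updated" if changed else "-> already correct")

# Unlabeled events are then scored with no human involvement.
print(clf.predict(["sanitizer", "shortage"]))
print(clf.predict(["concert", "tonight"]))
```

The design point is that no retraining batch is needed: each label takes effect on the very next prediction, which matters most when events stop following historical patterns.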
Our success in capturing real-time events across the globe during a global pandemic demonstrates the power of combining the latest AI and technical research with a human-AI feedback loop. Today, 97% of all alerts that Dataminr sends to customers are created entirely by Dataminr’s AI platform without any Domain Expert or human involvement. However, the learning loop driven by the alerts where Domain Experts play a critical role is an essential factor that enables Dataminr’s systems to continuously learn and detect events across the globe. The benefit of Dataminr’s dynamic human-AI feedback loop is even more pronounced in situations like the COVID-19 event cascade, where the pace of change has been unparalleled.
Jason Wilcox is Dataminr’s Executive Vice President of Product and Engineering. Before joining Dataminr, Jason held senior leadership roles at Microsoft.