Project description

This interactive visual report shows how combining several types of data can provide a more systematic understanding of the efforts made to save migrants’ lives in the Central Mediterranean Sea. It is the result of a collaboration between UN Global Pulse and the United Nations High Commissioner for Refugees (UNHCR).

The analysis uses Automatic Identification System (AIS) data, broadcast warnings, and social media data to uncover typical rescue patterns in the Mediterranean Sea. The different data are combined to construct narratives of individual rescues, which are in turn used to explore the possibility of automated rescue activity recognition using machine learning and artificial intelligence.

Rescue patterns highlight typical scenarios in which, for example, one vessel conducts multiple rescues over time until it reaches full capacity, or multiple vessels collaborate on a single rescue. The temporal evolution of distress signals shows that migrants tend to request assistance ever closer to Libyan shores, forcing rescue vessels to venture further and further beyond their initial search and rescue zones.

This work shows the magnitude of the humanitarian crisis in the Central Mediterranean Sea from a new angle. Its results can be used to optimize rescue operations, and can serve as a basis for measuring the possible economic impact of legally required rescue interventions by commercial ships navigating in the Mediterranean Sea. The work was initially conducted for UNHCR advocacy purposes, but has also proved useful for illustrating how the study of new data sources can be relevant to global humanitarian efforts.

What makes this project innovative?

This work uses an original combination of three readily available datasets: AIS data, which can reveal patterns of search-and-rescue operations; broadcast warnings, which disclose maritime alerts issued on behalf of migrant vessels in distress; and social media posts and online articles from known rescue organizations that highlight specific rescue events.
The quantitative insights gained from this combination add to the large amount of qualitative, descriptive coverage produced by NGOs and news outlets in two ways. First, they capture the routine, “successful” rescues that occur almost every day and often go unreported. Second, they provide an overview of rescue operations, which is critical for advocacy and coordination: they show the true magnitude of what is happening, which can help stakeholders quantify costs and shortcomings and plan for future needs.

More generally, an important novelty in this work is that it considers not only dead and missing migrants (which is the typical angle assumed by quantitative analyses of the Central Mediterranean migration crisis), but also avoided deaths, highlighting the success of the humanitarian community’s effort to recover people at sea.

What was the impact of your project? How did you measure it?

The work presented in this interactive visual report was published (in written form) in the International Organization for Migration's (IOM) Fatal Journeys Volume 3 report:

The report itself has been used to present the value of big data for humanitarian action to various government officials and United Nations agencies and departments. Most notably, it was presented during the 2018 Digital Diplomacy Camp in The Hague to an audience of Ambassadors and Ministry of Foreign Affairs officials from roughly 30 different countries.

It was also long-listed at the Kantar Information Is Beautiful Awards.

Finally, a video introduction to the work was awarded an honorable mention in the PacificVis storytelling contest.

Source and methodology

The primary building block of the analysis is the quantified rescue, or rescue signature: a concise summary of a ship’s behavior in an identified rescue sequence. Signatures help unify narrative threads from a variety of sources, tying the observable physical traces of a rescue operation to qualitative sources of information on how events unfolded.

Rescue signatures are primarily based on Automatic Identification System (AIS) data. AIS is a communications system used by maritime authorities and navigating ships to locate nearby vessels and avoid collisions. AIS data include both static information (such as identifiers, vessel type, and the flag under which a vessel sails) and dynamic information (such as latitude and longitude, speed, course over ground, and destination). AIS data are available from a range of providers including exactEarth, ORBCOMM, and MarineTraffic.
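The split between static and dynamic AIS information can be illustrated with a minimal Python sketch. The class names, field names, and the `join_reports` helper are assumptions made for this illustration; they are not the schema used in the project or by any particular AIS provider. The MMSI (Maritime Mobile Service Identity) is used here as the vessel identifier.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AISStatic:
    """Static information: identifies the vessel and rarely changes."""
    mmsi: int          # Maritime Mobile Service Identity (vessel identifier)
    name: str
    vessel_type: str
    flag: str          # flag under which the vessel sails


@dataclass(frozen=True)
class AISPosition:
    """Dynamic information: a single position report."""
    mmsi: int
    timestamp: datetime
    lat: float          # degrees, positive north
    lon: float          # degrees, positive east
    speed_knots: float  # speed over ground
    course_deg: float   # course over ground, 0-360
    destination: str


def join_reports(statics, positions):
    """Pair each position report with its vessel's static record.

    Position reports from vessels with no known static record are dropped.
    """
    by_mmsi = {s.mmsi: s for s in statics}
    return [(by_mmsi[p.mmsi], p) for p in positions if p.mmsi in by_mmsi]


# Hypothetical example values, not real vessels.
vessel = AISStatic(mmsi=123456789, name="EXAMPLE", vessel_type="cargo", flag="MT")
report = AISPosition(
    mmsi=123456789,
    timestamp=datetime(2017, 6, 1, 12, 0, tzinfo=timezone.utc),
    lat=33.5, lon=12.5, speed_knots=2.1, course_deg=270.0,
    destination="EXAMPLE PORT",
)
pairs = join_reports([vessel], [report])
```

Keeping the two kinds of record separate mirrors how the information behaves: the static record is stored once per vessel, while position reports accumulate over time.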

The narrative threads are then reconstructed mainly from social media posts, online articles, and broadcast warnings. Broadcast warnings are produced by the World-Wide Navigational Warnings Service (WWNWS) to warn ships of potential safety hazards in a region and of nearby emergencies, to which ships have a responsibility to respond under the 1974 International Convention for the Safety of Life at Sea (SOLAS). The warnings contain a wide variety of information, including alerts on vessels adrift, cable-laying operations, and debris in the water. Broadcast warnings are available at

Manually characterizing the behavior of four rescue ships in the Central Mediterranean over a 100-day period requires labelling over 77,000 AIS data points, and a mere two-week sample of all AIS data in the region contains traces from almost 10,000 unique ships. Manual labelling therefore does not scale, so this work also explores novel methods for scaling the detection of rescue signatures using machine learning and artificial intelligence.

The problem is essentially one of classification: AIS data points can be characterized according to particular features that are likely to indicate a rescue activity. Points are first clustered into similar groups along a ship’s trajectory, and clusters are then categorized as rescue or non-rescue sequences. At this stage, it seems that speed and course over ground alone have some power for predicting whether a cluster of AIS data points is related to a rescue sequence or not.
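The two-step idea (group points along a trajectory, then label each group) can be sketched in plain Python. This is only an illustration under stated assumptions: consecutive runs of slow or fast points stand in for the clustering step, a fixed threshold rule stands in for a trained classifier, and the threshold values are invented for the example rather than taken from the study.

```python
import math
from itertools import groupby

# Assumed thresholds for illustration only; not values from the study.
SLOW_KNOTS = 3.0         # below this, a ship is treated as manoeuvring or drifting
MIN_POINTS = 3           # ignore very short runs
MIN_COURSE_SPREAD = 0.3  # how erratic the heading must be


def course_spread(courses_deg):
    """Circular spread in [0, 1]: 0 for a constant course, near 1 when courses cancel out."""
    n = len(courses_deg)
    sin_sum = sum(math.sin(math.radians(c)) for c in courses_deg)
    cos_sum = sum(math.cos(math.radians(c)) for c in courses_deg)
    return 1.0 - math.hypot(sin_sum, cos_sum) / n


def cluster_by_speed(points):
    """Split a time-ordered list of (speed_knots, course_deg) into runs of slow / not-slow points."""
    return [list(run) for _, run in groupby(points, key=lambda p: p[0] < SLOW_KNOTS)]


def summarize(cluster):
    """Reduce a cluster to the features used for labelling."""
    speeds = [speed for speed, _ in cluster]
    courses = [course for _, course in cluster]
    return {
        "n": len(cluster),
        "mean_speed": sum(speeds) / len(speeds),
        "course_spread": course_spread(courses),
    }


def looks_like_rescue(features):
    """Threshold rule: a sustained, slow run with an erratic course."""
    return (
        features["n"] >= MIN_POINTS
        and features["mean_speed"] < SLOW_KNOTS
        and features["course_spread"] > MIN_COURSE_SPREAD
    )


# Hypothetical track: steady transit, slow manoeuvring, steady transit again.
track = (
    [(12.0, 180.0)] * 4
    + [(1.2, 10.0), (0.8, 120.0), (1.5, 250.0), (0.5, 330.0), (1.0, 80.0)]
    + [(11.0, 0.0)] * 3
)
clusters = cluster_by_speed(track)
labels = [looks_like_rescue(summarize(c)) for c in clusters]
```

Course over ground is an angle, so the sketch measures its variability with a circular statistic rather than an ordinary variance; otherwise courses of 359 and 1 degrees would look far apart.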

Technologies Used

The raw data for the study were compiled and processed in Python using libraries including NumPy, pandas, GeoPandas, Shapely, Tweepy, and BeautifulSoup. AIS data points were manually labeled with "ground truth" values using visual inspection in QGIS.

Automatic classification was conducted using the Python library scikit-learn. In particular, three binary classification algorithms were tested: AdaBoost, Support Vector Machines, and Logistic Regression.
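A comparison of this kind might look like the following scikit-learn sketch. The data are synthetic and generated purely for illustration: the two features (mean speed and a course-spread score per cluster), their distributions, the default hyperparameters, and the use of 5-fold cross-validation are all assumptions, not the project's actual features, settings, or results.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200

# Synthetic clusters (assumption): rescue sequences are slow with an erratic
# course; non-rescue sequences are fast with a steady course.
rescue = np.column_stack([rng.normal(1.5, 0.8, n), rng.normal(0.7, 0.15, n)])
transit = np.column_stack([rng.normal(10.0, 2.0, n), rng.normal(0.1, 0.08, n)])
X = np.vstack([rescue, transit])  # columns: mean speed (knots), course spread
y = np.array([1] * n + [0] * n)   # 1 = rescue, 0 = non-rescue

models = {
    "AdaBoost": AdaBoostClassifier(random_state=0),
    # Scaling matters for the SVM and helps logistic regression converge.
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "LogisticRegression": make_pipeline(StandardScaler(), LogisticRegression()),
}

# Mean 5-fold cross-validated accuracy for each classifier.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Because the synthetic classes are deliberately well separated, the scores say nothing about performance on real AIS data; the sketch only shows how the three classifiers can be evaluated on the same features.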

The data were then cleaned and converted into a JSON format, and the interactive visual report was finally designed using JavaScript libraries including Pixi.js, D3.js, TopoJSON, and sound.js.
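The cleaning and JSON conversion step could be sketched as follows using only the Python standard library. The record layout, the field names, and the validity rule are hypothetical; the actual format consumed by the visual report is not described here.

```python
import json
import math


def is_valid(point):
    """Keep only points whose coordinates are finite numbers within range."""
    lat, lon = point.get("lat"), point.get("lon")
    if not isinstance(lat, (int, float)) or not isinstance(lon, (int, float)):
        return False
    if not (math.isfinite(lat) and math.isfinite(lon)):
        return False
    return -90 <= lat <= 90 and -180 <= lon <= 180


def to_report_json(points):
    """Drop invalid points, round coordinates, and serialize compactly."""
    cleaned = [
        {
            "mmsi": p["mmsi"],
            "lat": round(p["lat"], 5),
            "lon": round(p["lon"], 5),
            "rescue": bool(p["rescue"]),
        }
        for p in points
        if is_valid(p)
    ]
    return json.dumps({"points": cleaned}, separators=(",", ":"))


# Hypothetical labelled points; the second and third are invalid and get dropped.
raw = [
    {"mmsi": 123456789, "lat": 33.512345678, "lon": 12.5, "rescue": 1},
    {"mmsi": 123456789, "lat": None, "lon": 12.6, "rescue": 0},
    {"mmsi": 123456789, "lat": 123.0, "lon": 12.7, "rescue": 0},
]
payload = to_report_json(raw)
```

Rounding coordinates and using compact separators keeps the file small, which matters when tens of thousands of points are loaded in a browser.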

Project members

Katherine Hoffmann, Felicia Vacarelu, Amanda Zerbe, Miguel Luengo-Oroz, James Leon-Dufour, Duncan Breen, and Christopher Earney.


