Although in Cuba there is a widespread notion that there are no inequalities among Cubans, or at least no glaring ones, it is evident that there are at least differences. We wanted to expose this data as part of an investigation we were carrying out, but we did not want to decide ourselves which groups would form when we crossed data on internal migration, skin color, age dependency, average salary and rural population.
Therefore, we chose two different unsupervised learning algorithms that divided all the municipalities of the country into groups according to their proximity to certain centroids. In this way, the groups are determined automatically and displayed on a map. Our team created the GeoJSON of Cuba's municipalities, since no usable map existed at that level of detail.
In addition to the map, other graphs below help to explain the phenomena and offer more information about the characteristics of the identified groups.
To make the narrative more interesting, we decided to create what we call audiotelling. Instead of text, our story is told in the voices of two of our journalists while animations related to the story play. At any time, the user can pause the audio, interact with the graphics and the map, and then resume the story.
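As an illustration of grouping municipalities by proximity to centroids, here is a minimal sketch of k-means, one common centroid-based clustering algorithm. The text does not name the two algorithms we used, so k-means is an assumption made for the example. The municipality names and feature values are invented, and the function name kmeans and its first-k-points initialization are simplifications chosen to keep the example deterministic.

```python
import math

def kmeans(points, k, iterations=100):
    """Plain k-means (Lloyd's algorithm). Returns (centroids, labels).

    Initial centroids are simply the first k points, a simplification
    for reproducibility; real uses typically pick them at random.
    """
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # nothing moved: converged
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical data: (rural population share, scaled average salary),
# both already scaled to the 0..1 range. Values are invented.
municipalities = {
    "Municipality A": (0.10, 0.20),
    "Municipality B": (0.15, 0.25),
    "Municipality C": (0.20, 0.10),
    "Municipality D": (0.80, 0.90),
    "Municipality E": (0.85, 0.80),
    "Municipality F": (0.90, 0.95),
}

centroids, labels = kmeans(list(municipalities.values()), k=2)
groups = dict(zip(municipalities, labels))
print(groups)
```

Each municipality ends up labeled with the index of its nearest centroid, and that label is what could then be used to color the municipality on the map.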
What makes this project innovative?
Attempts to identify differences in Cuba are always tentative, and doing so without bias is difficult. For this reason we decided to use two unsupervised learning algorithms that, based only on the data we provided, could identify groups of municipalities with common characteristics and, in this way, show the differences across the geography of Cuba. The work was conceived as a data app where anyone can discover the different groups based on the information they want to cross. So it is a tool for readers, but also for researchers interested in these topics. The tool is not biased by the criteria of the researchers: it uses clustering algorithms to calculate the groups and uses the centroids to identify them. Crossing any two of the five criteria we chose can produce groups that are very different from, or very similar to, each other, which suggests that these criteria influence people's lives, as does their geographical location.
We did not want to present the work only as a data app; we also wanted to share our insights about the interesting facts we found. So we tried to create a narrative that could illustrate how to use the tool and also tell the story we wanted to tell, without being intrusive. Instead of using a traditional narrative, we did what we call audiotelling: we created a podcast in our own voices and, as our commentary is heard, the map shows what we are describing. While the story is being told, the reader can still interact with the tool. This is an interesting way to tell the story of the data, as it helps readers focus visually on the map.
This is a powerful tool that allows other researchers and journalists to access official data, and also to cross and compare it, in order to identify social phenomena.
In addition, the map can be consulted by decision-makers, which will help to clarify some of the points that minorities and poverty have in common.
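Crossing two of the five criteria can be sketched as follows. This is only an illustration: the identifier names, the helper functions criteria_pairs and feature_vectors, and the sample records are hypothetical and are not taken from the app's code or data.

```python
from itertools import combinations

# The five criteria named in the text, written as hypothetical field names.
CRITERIA = [
    "internal_migration",
    "skin_color",
    "age_dependence",
    "average_salary",
    "rural_population",
]

def criteria_pairs(criteria=CRITERIA):
    """All unordered pairs of criteria a reader could choose to cross."""
    return list(combinations(criteria, 2))

def feature_vectors(records, pair):
    """Keep only the two chosen criteria for each municipality record."""
    return [tuple(record[name] for name in pair) for record in records]

# Invented records, for illustration only.
records = [
    {"internal_migration": 0.3, "skin_color": 0.4, "age_dependence": 0.5,
     "average_salary": 0.6, "rural_population": 0.2},
    {"internal_migration": 0.7, "skin_color": 0.1, "age_dependence": 0.4,
     "average_salary": 0.3, "rural_population": 0.9},
]

pairs = criteria_pairs()
vectors = feature_vectors(records, ("average_salary", "rural_population"))
print(len(pairs), vectors)
```

Each pair yields a two-dimensional point per municipality, which is the kind of input a clustering algorithm would then divide into groups.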
What was the impact of your project? How did you measure it?
This is the first tool that allows differences in Cuba to be analyzed at a local level in terms of both social and economic criteria. In that sense, it goes beyond journalism to become a useful tool for those working on minority issues. The data shows the gap that exists between people in the city and people in the countryside, and between people of different skin colors. Within a few days, the work had received many visits. It was quickly shared on social networks and was recommended by groups that support data journalism, by academics working on hyperlocal issues, and by organizations that investigate issues of race, gender and youth. Many people and institutions were impressed by the interactive use of audio and animations, and by the application of clustering algorithms to detect socioeconomic differences without researcher bias. In just two days, this work outperformed on social networks every article we have published in the history of Postdata.club.
Source and methodology
Saimi Reyes, Yudivián Almeida, Ernesto Guerra