Project description

Following the refugee crisis, in which many people died on the dangerous journey to Europe, politicians promised to create more legal pathways to the bloc. These legal pathways could come in the form of long-term visas. With Germany being one of the prime destinations for asylum-seekers, we evaluated how promising applications from different regions are. We set out to identify whether African applicants were more likely to be rejected for a long-term visa for Germany than applicants from other regions. We were able to show that this "perceived truth" has grounds to it: 22 percent of African applications get rejected, compared to 10 percent of applications from Asia. At the same time, rejection rates vary widely from country to country. Within Africa, they range from 6 percent in South Africa to 44 percent in Cameroon. Authorities have not offered a plausible explanation for this wide variation. Following our analysis for Africa, we repeated it for Asian countries, which revealed a similar pattern.

What makes this project innovative?

Creating data-driven stories with the womanpower of (at the time of producing this story) 2 data journalists for a media company that publishes in 30 different languages, and thus caters to (at least) 30 different audiences, is a challenge that guides our team's work. This story on visa prospects is unique in its transferability: when setting up the analysis for the first story on the chances of African citizens, we designed the code so that it could be reused for different regions, like Asia or the former Soviet Union states. It was not only the code that was designed to easily produce multiple stories; so was the journalistic writing itself. Similar to developing a code template, we worked with a journalist from our Asia desk to develop a "story template" in which the core findings, methodology, visuals and expert assessment were provided by one journalist, and the small regional desks only had to find a protagonist from their region affected by the problem to further personalize the story. This approach made it much more feasible for our small language desks to take on our data-driven reporting. Taken together, this story serves as a blueprint for how we can make the most of relatively few resources, both programmatically and journalistically.

What was the impact of your project? How did you measure it?

We were able to support the "perceived truth" our reporter had identified: a visa application from Africa was in fact on average about twice as likely to be rejected as an application from Asia. The stories were taken on by several DW language services, and this one core analysis turned into stories in 14 different languages (English, German, French, Portuguese, Russian, Chinese, Hindi, Bengali, Urdu, Pashto, Dari, Indonesian, Hausa, Amharic), a record among our data-driven stories.

Source and methodology

You can find a full account of our methodology, as well as the data and code behind the analysis, on our GitHub page (see additional links). The original source for the data is the German Federal Foreign Office, which published the data in PDFs following a request from the opposition party "The Left" in the German national parliament. The Open Knowledge Foundation curates all responses to such requests on this website, where DW reporter Daniel Pelz also found the data we needed for this analysis. In our analysis we focused on applications for long-term visas for Germany (for study, work or family reunion, for example), so the figures do not include visas for short-term stays in the Schengen area. The German Federal Foreign Office collects data, in each of its embassies around the world, on the number of applications filed, accepted, rejected and withdrawn. It calculates the rejection rate based on the number of "processed" applications, which includes granted, rejected and withdrawn requests. For our own analysis, we calculated the rejection rate differently: we introduced a new variable, "decided applications", that equals the sum of all granted and all rejected applications that were filed in a specific local branch. The listed "share negative (%)" is calculated by dividing the number of rejected applications by the number of decided applications and multiplying the result by 100. Please note that the data refers to applications filed in a specific location/country, which does not necessarily correspond to the applicant's nationality. We assume that the two match for the majority of applications; however, not all countries have German embassies where visa applications can be made, so some applicants have to file in another country.
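The difference between the two rejection rates described above can be sketched in a few lines of Python. This is a minimal illustration and not the code from our GitHub page: the class name, the function names and the counts are hypothetical, and the "official" calculation assumes that the Foreign Office divides rejected applications by all processed ones.

```python
from dataclasses import dataclass


@dataclass
class BranchCounts:
    """Application counts for one local branch (hypothetical figures)."""
    granted: int
    rejected: int
    withdrawn: int


def official_rejection_rate(counts):
    """Rejected as a share of processed applications
    (granted + rejected + withdrawn), in percent. Assumed formula."""
    processed = counts.granted + counts.rejected + counts.withdrawn
    return 100 * counts.rejected / processed if processed else None


def share_negative(counts):
    """Rejected as a share of decided applications
    (granted + rejected), in percent."""
    decided = counts.granted + counts.rejected
    return 100 * counts.rejected / decided if decided else None


# Made-up numbers, for illustration only.
example = BranchCounts(granted=600, rejected=300, withdrawn=100)
print(official_rejection_rate(example))
print(round(share_negative(example), 1))
```

Because withdrawn applications are left out of the denominator, the "share negative (%)" is never lower than the rate based on processed applications, and the two are equal only when no application was withdrawn.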

Technologies Used

ABBYY FineReader was used to extract the data from the PDFs provided by the German Foreign Ministry. The programming language Python (with its libraries Pandas, Matplotlib, Seaborn and Statsmodels) was used to analyze and visualize the data; finalized visuals were adapted to our corporate design using Adobe Illustrator. To create the interactive data table, we relied on the JavaScript library jQuery.

Project members

Gianna-Carina Grün, Daniel Pelz, Ayu Purwaningsih, Shitao Li, Rodion Ebbighausen


Additional links

