Project description

The project is a web portal that publishes information about the work of Salvadoran congress members: how every deputy voted over the last two years, how many times they traveled, and how many laws they sponsored. The website offers not just a quantitative perspective but a qualitative one, telling people what kinds of laws their representatives are presenting. The main goal of the initiative was to provide a hub of information about every Salvadoran delegate in congress, so people can decide who deserves to be re-elected. The audience was adults between 18 and 39 years old with a high interest in politics.

What makes this project innovative?

The project is innovative because all of the information was obtained and processed using computer algorithms, since the source data were not in a machine-processable format. At the project's conclusion, all of the information collected was released to the public as open data. It is also innovative because the data allow an analysis of the quantitative and qualitative performance of each lawmaker, especially those seeking re-election.

What was the impact of your project? How did you measure it?

The main impact of the project was to encourage public debate about legislative work and the high economic cost it generates for the few results obtained. During the project, the observatory saw a large increase in both reach and engagement metrics; likewise, on social networks there was ample debate about the need to limit travel expenses and about the relevance of the laws being promoted.

Source and methodology

The web portal of the Legislative Assembly of El Salvador was used as our main source of documents. Specifically, we downloaded many PDF files composed of scanned images containing the information. From these we extracted voting data, approved bills, travel records, and attendance at the different plenary sessions. Quality filters were applied to the processed information, in particular by checking statistically calculated samples to measure the accuracy of the processing algorithms; all tests met the goal of an error margin below 2%.
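A sample-based quality check like the one described can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the use of Cochran's formula, and the parameter defaults (95% confidence, 2% margin) are our assumptions.

```python
import math
import random

def sample_size(population, margin=0.02, z=1.96, p=0.5):
    """Cochran's sample size for a proportion, with finite-population
    correction; margin=0.02 matches the stated 2% error-margin goal."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))

def error_rate(records, sample_n, is_correct, seed=0):
    """Estimate the processing error rate by manually checking a random
    sample of processed records against the source documents."""
    rng = random.Random(seed)
    sample = rng.sample(records, sample_n)
    errors = sum(1 for record in sample if not is_correct(record))
    return errors / sample_n
```

For example, `sample_size(10000)` tells you how many of 10,000 processed records to review by hand; if `error_rate` on that sample stays below 0.02, the batch passes the filter.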

Technologies Used

The Python programming language and multiple Python libraries were used to collect, process, systematize, and analyze the information. First, the Requests library was used to download all PDF documents. Second, Tesseract-OCR converted the scanned images into processable text, which was then stored in categories created by clustering algorithms. Finally, we used the scikit-learn library to analyze the collected data.
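The three steps above can be sketched with the libraries named in the text. This is an illustrative outline under stated assumptions, not the project's code: the URLs and file names are placeholders, the Spanish OCR language setting is our guess, and TF-IDF with KMeans is one plausible way to cluster documents with scikit-learn.

```python
import requests
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def download_pdf(url, path):
    """Step 1: download a PDF document from the Assembly's portal."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    with open(path, "wb") as f:
        f.write(response.content)

def ocr_page(image):
    """Step 2: convert a scanned page image into text with Tesseract
    (requires the Tesseract-OCR binary and its Spanish language data)."""
    import pytesseract
    return pytesseract.image_to_string(image, lang="spa")

def cluster_documents(texts, n_clusters):
    """Step 3: group the OCR'd documents into categories by text
    similarity, using TF-IDF vectors and KMeans clustering."""
    vectors = TfidfVectorizer().fit_transform(texts)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    return model.fit_predict(vectors)
```

Keeping each step as a separate function lets the download, OCR, and clustering stages be re-run independently when a batch of documents fails a quality check.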

Project members

Lilian Angélica Martínez Martínez

