University of São Paulo: Data from Enem can contribute to the construction of future educational policies, according to a study

What can data on the performance of 1.3 million young people in the National Secondary Education Examination (Enem) reveal about Brazilian education? A study considering the performance of these students in each edition of the test, over six consecutive years, shows how data science can be an important ally of the government in the construction of successful future educational policies.

Conducted by researchers from the Institute of Mathematics and Computer Sciences (ICMC) of USP, in São Carlos, the research evaluated the results obtained by students who took the exam between 2012 and 2017, based on information provided by the National Institute of Studies and Educational Research Anísio Teixeira (Inep). Using computational techniques, the scientists performed three types of analysis: the first examined student performance in each region of Brazil; the second compared the performance of students from public and private schools; and the third sought to understand the participation of students with disabilities.

Using algorithms (sequences of instructions that tell a computer how to carry out a task) developed by the researchers themselves, the experts calculated the average scores for each of the four knowledge areas of the test (Languages and Codes, Natural Sciences, Human Sciences and Mathematics) and for writing. The students were then divided into three groups: “low performance”, “medium performance” and “high performance”.
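The grouping step can be illustrated with a minimal Python sketch. It is not the researchers' code: the field names are hypothetical, and it assumes that each student is ranked by the mean of their five scores and that the three groups are equal-sized thirds. The article does not state the cutoffs actually used.

```python
from statistics import mean

# Hypothetical field names; the real Inep microdata uses its own column names.
SCORE_FIELDS = [
    "languages",
    "natural_sciences",
    "human_sciences",
    "mathematics",
    "writing",
]


def average_score(student):
    """Mean of the four knowledge-area scores and the writing score."""
    return mean(student[field] for field in SCORE_FIELDS)


def split_into_groups(students):
    """Label each student as low, medium or high performance.

    Assumption: students are ranked by average score and cut into
    equal-sized thirds. The study may have used different cutoffs.
    """
    ranked = sorted(students, key=average_score)
    n = len(ranked)
    labels = {}
    for position, student in enumerate(ranked):
        if position < n / 3:
            group = "low"
        elif position < 2 * n / 3:
            group = "medium"
        else:
            group = "high"
        labels[student["id"]] = group
    return labels
```

Running the same labeling separately for each year and region would then allow the kind of comparison described in the article, for example tracking the average score of the high-performance group over time.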

The process made it possible to draw comparisons between the performance and evolution of students in each region of Brazil, which is essential, given the significant cultural and economic differences between the locations. Scientists were especially surprised when they investigated the increase in the average score of the high-performing group between 2014 and 2017: in the first years of this period, the Midwest and Southeast regions achieved the best scores; but, year after year, the Northeast improved and, in 2017, it became one of the best performing regions alongside the Midwest.

“In this high-performance group, we noticed that public schools in the Northeast grew over the years until they reached second place, which previously belonged to the Midwest region. This indicates that, probably, those who manage public schools in the Northeast started to invest more in preparing for the Enem, adopting new methods that gave rise to more satisfactory results. It is a fact that indicates the need for further in-depth research to try to understand what has been done in this region and, eventually, to replicate successful practices in other places”, explains ICMC master’s student Afonso Matheus Sousa Lima.

Another topic evaluated by the scientists was the participation rate of students by type of school. Between 2012 and 2017, the low and medium performance groups were mostly made up of students from public schools. On the other hand, for the high-performing group, there was a balance between school types. An example is that of the North region, which presented the highest participation rate of public school students every year.

Analyzing the information related to the two types of schools, another fact intrigued scientists: the decrease in the participation of students with disabilities from public schools. “In general, students with disabilities have left public schools and gone to private schools. This could indicate that the private school is adopting some very good strategy or that the public school is getting worse. The fact is that private schools are becoming more attractive to these students”, comments Robson Leonardo Ferreira Cordeiro, a professor at the ICMC and one of the authors of the research.

The reduced participation in Enem of students with disabilities from public schools was so pronounced in 2012 that the researchers had to disregard the data collected that year. In addition, the number of students with this profile in the following years was also very small, which is worrying, since, in 2017, for example, almost 24% of the Brazilian population had some type of disability, according to the United Nations Educational, Scientific and Cultural Organization (Unesco).

Initially, the ICMC group of researchers had planned to use data from 2009 onwards, because that was the year the Enem was reformulated to unify the entrance exams of Brazilian federal universities. However, the data for some years, such as 2011, were disorganized. In 2018, in turn, most students left the information on their school type blank.

Because of these limitations, the scientists chose to study the period from 2012 to 2017 and calculated, for each of these six years, what they called the “attributes” of the students, ranging from the responses to the Enem socioeconomic questionnaire to the scores in the four areas of knowledge and in writing. As all this information was essential for the investigation, data from students who did not answer, for example, whether they had a disability, or who had scored zero in any of the areas, were discarded. Thus, although more than five million students participated in Enem in each of the years analyzed, the filtering resulted in a sample of approximately 1.3 million young people per year.
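The filtering described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the field names are hypothetical, and the exact list of attributes the researchers required is not given in the article.

```python
# Hypothetical field names; the real Inep microdata uses its own column names.
SCORE_FIELDS = [
    "languages",
    "natural_sciences",
    "human_sciences",
    "mathematics",
    "writing",
]
# Assumed examples of questionnaire answers that must be present.
REQUIRED_ANSWERS = ["school_type", "has_disability"]


def is_complete(record):
    """Return True if the record has every required answer and no zero score."""
    for field in REQUIRED_ANSWERS:
        if record.get(field) is None:
            return False
    for field in SCORE_FIELDS:
        score = record.get(field)
        # A missing score or a score of zero disqualifies the record.
        if score is None or score == 0:
            return False
    return True


def filter_records(records):
    """Keep only the records that pass the completeness check."""
    return [record for record in records if is_complete(record)]
```

Applied to one year of data, a filter of this kind is what reduces the more than five million participants to the roughly 1.3 million students per year reported in the study.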

In general, the specialists agree that the Access to Information Law (LAI) was essential for carrying out the work, but they emphasize that the quality, organization and standardization of the data need to improve, so that the data become more useful both for analyses and for the construction of new public policies aimed at improving education in the most deficient regions. In the future, the ICMC group intends to study student performance between 2018 and 2022, which will make it possible to investigate the impacts of the pandemic and of the most recent change in government on student performance at Enem. According to Inep, in 2021, for example, only four million students signed up for the exam, the lowest number of participants in 13 years.

The study is the result of a graduate course called “Topics in Databases” and generated an article published in the scientific journal Journal of Information and Data Management, authored by researchers Afonso Matheus Sousa Lima, Alexander Ylnner Choquenaira Florez, Alexis Iván Aspauza Lescano, João Victor de Oliveira Novaes and Natalia de Fatima Martins. The work also had the participation and guidance of professors Robson Leonardo Ferreira Cordeiro, Caetano Traina Junior, Elaine Parros Machado de Sousa and José Fernando Rodrigues Junior, all from ICMC. They are part of the Institute’s Database and Image Group (GBdI).