PhD Defense (PDMD): “Food Wide Web: a digital food and media literacy program addressed to adolescents”

Candidate:
Adriana Aguiar Aparício Fogel

Date, Time and Location:
October 20, 2025, 14:30, Sala de Atos da Faculdade de Engenharia da Universidade do Porto

President of the Jury:

António Fernando Vasconcelos Cunha Castro Coelho (PhD), Associate Professor with Habilitation, Faculdade de Engenharia da Universidade do Porto.

Members:

Joana Alves Dias Martins de Sousa Ferreira (PhD), Assistant Professor, Faculdade de Medicina da Universidade de Lisboa;
Ivone Marília Carinhas Ferreira (PhD), Assistant Professor, Department of Communication Sciences, Faculdade de Ciências Sociais e Humanas da Universidade Nova de Lisboa;
Sara de Jesus Gomes Pereira (PhD), Associate Professor, Department of Communication Sciences, Instituto de Ciências Sociais da Universidade do Minho;
Ana Filipa Pereira Oliveira (PhD), Assistant Professor, Faculdade de Comunicação, Arquitetura, Artes e Tecnologias da Informação da Universidade Lusófona;
José Manuel Pereira Azevedo (PhD), Full Professor, Department of Communication and Information Sciences, Faculdade de Letras da Universidade do Porto (Supervisor);
Ricardo José Pinheiro Fernandes Morais (PhD), Assistant Professor, Department of Communication and Information Sciences, Faculdade de Letras da Universidade do Porto.

Abstract:

The current complex and saturated media environment has given rise to an “infodemic” — an excess of information, both accurate and misleading, with potential impacts on the health of populations.
In the field of nutrition, the widespread dissemination of biased or incorrect content can contribute to unhealthy eating behaviors and may help explain the high global prevalence of obesity. Adolescents are particularly susceptible to this phenomenon because their self-regulation processes are not fully developed and because they are more influenced by external stimuli during this phase. This context reinforces the importance of promoting integrated food and media literacy among young people, providing them with tools that allow them to critically interpret, question, and consciously deal with the influences of food marketing and misinformation about nutrition. This study was developed in this context and had three main objectives: (i) to develop and implement a school-based intervention program using an intertwined perspective between media and food literacy issues; (ii) to evaluate the effectiveness of this intervention on the levels of media and food literacy of adolescents; (iii) to contribute to characterizing the media and food literacy levels of teenagers in Portuguese schools. The intervention consisted of ten 45-minute sessions, addressing eight dimensions of the food system — production, processing, distribution, planning and management, selection, preparation and cooking, intake, and disposal — through the lens of core media literacy competencies: access, analysis, evaluation, and creation. The contents included media materials that encouraged reflection and debate on the global food system. The program was implemented between October 2022 and May 2023 in four schools in northern Portugal — two were part of the intervention group and two were part of the control group. The final sample consisted of 202 students between the ages of 13 and 16 (M = 13.6; SD = 0.75). 
Data was collected through a questionnaire covering five main thematic areas: (a) exposure to food advertising, (b) satisfaction with body weight, (c) opinions, attitudes, and knowledge about media and food, (d) dietary practices, and (e) literacy related to food and media content. The questionnaire, constructed from pre-existing instruments, included open-ended and closed-ended questions and was administered to both groups before and after the sessions. In the intervention group, the creative projects developed in the classroom were also analyzed. Quantitative data were statistically evaluated, and qualitative data were subjected to a hybrid thematic analysis (inductive/deductive), followed by content analysis. After the initial qualitative analysis, a scoring system was developed that assigned numerical values to the responses. In line with the project objectives, healthy and sustainable choices, as well as critical evaluations and creations that encouraged participation, were valued. This scoring system included both closed-ended and task-based questions, allowing for a comprehensive and quantifiable assessment of the impact of the intervention on students’ food and media literacy, as well as their associated behaviors. The Likert section, consisting of 15 questions on attitudes, opinions, and knowledge, was scored from 0 to 4 per item, with a maximum possible score of 60 points. The food consumption section was converted to a weekly pattern and included a dietary adequacy index, with positive scores attributed to healthy behaviors (e.g. consumption of fruit and vegetables) and negative scores to unhealthy behaviors (e.g. consumption of fast food), with an initial score between -15 and 38, later transformed into a scale starting at 0, to facilitate interpretation. 
Finally, the section on food media literacy assessed the understanding of food labels (0 to 6 possible points, based on correct answers) and advertising literacy (score up to 14 points), including critical analysis of advertisements (one printed and one video) and an open-ended creative activity. The responses were analyzed based on their complexity, considering the ability to interpret marketing strategies and express ideas critically and creatively. The conversion of qualitative data into numerical scales allowed statistical comparisons between moments (pre vs. post) and groups (control vs. intervention; male vs. female). The results demonstrated that the intervention developed was feasible and effective. Significant improvements were observed in the students’ advertising literacy (1.5 vs. 1.9; p = 0.009) and in their ability to interpret food labels (2.0 vs. 2.2; p = 0.039). Among the girls in the intervention group, a significant improvement was observed in the total scores regarding opinions, attitudes, and knowledge about media and food (36.8 vs. 38.1; p = 0.037). Concerning body satisfaction, significant differences between the girls in the intervention group and those in the control group at the pre-intervention moment became non-significant after the intervention (p = 0.015 vs. p = 0.402). The same occurred with the differences between the girls and boys in the intervention group, which were significant only before the program (p = 0.010 vs. p = 0.412). These data indicate improvements in satisfaction with body image, particularly among the female participants, who reported a more balanced and healthy relationship with their bodies and eating habits after participating in the program. Regarding eating patterns, the male participants also showed improvements, but in specific habits, with an increase in the consumption of cereals and tubers standing out (6.2 vs. 8.2; p = 0.032).
However, a persistent concern related to body weight was identified: 43.5% of the girls expressed a desire to change their weight, although only 28.3% considered themselves to be outside the weight they would consider normal. Among the boys, 76.1% declared themselves to be of normal weight, but 35.8% reported the desire to change their weight, even after taking part in the intervention. In addition, gaps in knowledge about the Mediterranean dietary pattern were found. Considering the entire sample, the students revealed difficulties in responding adequately to questions related to this topic, reporting only moderate levels of adherence to the aforementioned dietary pattern. In this aspect, they obtained a score of 30.6 (SD = 7.4), out of a maximum of 53. This is an important aspect in the characterization of adolescents, as the Mediterranean diet is the basis of Portugal’s national dietary guide, known as the Food Wheel. The adolescents also reported habitual exposure to advertisements for foods rich in sugar, salt and fat, despite the existing regulatory measures. Only 6.7% stated that they had not seen advertising for these products in the 30 days prior to the survey. In conclusion, this thesis proposes an innovative framework that integrates food and media literacy. Supported by empirical evidence, it includes a well-organized lesson plan and detailed assessment tools, constituting a practical resource for educators in general. The support resources used in the sessions are potentially adaptable to different educational and geographical contexts. The results contribute to the growing body of evidence supporting comprehensive educational interventions and reinforce the importance of integrating food and media literacy into school curricula as a strategy to promote critical thinking and informed food choices. 
Finally, the data suggest that a collaborative effort is essential to prepare adolescents to navigate an increasingly complex food environment, promoting healthier and more conscious choices. In this sense, collaboration between political decision-makers, education professionals, and stakeholders from the sectors involved (advertisers, advertising agencies, media outlets, social media platforms) is essential. The actions taken today have a substantial impact on the health and well-being of this and future generations.
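
As a worked illustration of the scoring arithmetic described in the abstract, the sketch below totals the 15 Likert items (0 to 4 each, maximum 60) and moves the raw dietary adequacy score, which ranges from -15 to 38, onto a scale starting at 0. The abstract does not say how that transformation was done; a plain shift by 15 is assumed here, and all function and constant names are hypothetical.

```python
# Minimal sketch of the scoring arithmetic; names and the shift rule are assumptions.
LIKERT_ITEMS = 15          # number of Likert questions (from the abstract)
LIKERT_MAX_PER_ITEM = 4    # each item is scored 0..4 (from the abstract)
DIET_RAW_MIN = -15         # lowest raw dietary adequacy score (from the abstract)
DIET_RAW_MAX = 38          # highest raw dietary adequacy score (from the abstract)


def likert_total(responses):
    """Sum the 15 Likert item scores; the result lies between 0 and 60."""
    if len(responses) != LIKERT_ITEMS:
        raise ValueError(f"expected {LIKERT_ITEMS} responses")
    if any(not 0 <= r <= LIKERT_MAX_PER_ITEM for r in responses):
        raise ValueError("each response must be between 0 and 4")
    return sum(responses)


def rescale_diet_score(raw):
    """Shift a raw dietary score so the scale starts at 0 (assumed: simple offset)."""
    if not DIET_RAW_MIN <= raw <= DIET_RAW_MAX:
        raise ValueError("raw score outside the -15..38 range")
    return raw - DIET_RAW_MIN


print(likert_total([4] * LIKERT_ITEMS))   # highest possible Likert total
print(rescale_diet_score(DIET_RAW_MIN))   # lowest raw score after the shift
print(rescale_diet_score(DIET_RAW_MAX))   # highest raw score after the shift
```

Under this assumption the shifted dietary scale runs from 0 to 53, which keeps the distance between any two students' scores unchanged.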

Keywords: media literacy; food literacy; digital media; school-based intervention; adolescents.

DEI Talks | “Declarative Programming” by Steven Pemberton (ACM Distinguished Speaker)

The talk “Declarative Programming” will be delivered by Steven Pemberton, a renowned researcher in Computer Science and Information Technology and an ACM Distinguished Speaker, on October 23rd at 10:00, in room B033, and will be moderated by Prof. João Ferreira (DEI). Admission is free.

Abstract:

“In the 50s, when the first programming languages were designed, computers cost millions, and relatively, programmers were almost free. Those programming languages therefore reflected that relationship: it didn’t matter if it took a long time to program, as long as the resulting program ran as fast as possible.
Now, that relationship has been reversed, which I call Moore’s Switch: compared to the cost of programmers, computers are almost free.
And yet we are still programming in descendants of the programming languages from the 50s: we are still telling the computers step by step how to solve the problem.
Declarative programming is a new approach to applications: rather than describing exactly how to reach the solution, it describes what the solution should look like, and leaves more of the administrative parts of the program to the computer.
One of the few declarative languages available is XForms, an XML-based language that, despite what its name might suggest, is not only about forms. Large projects, at large companies such as the National Health Service, the BBC and Xerox, have shown that by using XForms, programming time and cost of applications can be reduced to a tenth and sometimes even much more.”
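
The contrast drawn in the abstract between describing how to reach a solution and describing what the solution looks like can be illustrated with a small sketch. It is written in Python rather than XForms, and the function names and the task (squares of the even numbers in a list) are chosen here purely for illustration; they are not taken from the talk.

```python
def even_squares_step_by_step(numbers):
    """Imperative style: spell out every step, including the bookkeeping."""
    result = []
    index = 0
    while index < len(numbers):
        value = numbers[index]
        if value % 2 == 0:
            result.append(value * value)
        index += 1
    return result


def even_squares_described(numbers):
    """More declarative style: state what the result contains and
    leave the iteration bookkeeping to the language."""
    return [n * n for n in numbers if n % 2 == 0]


data = [1, 2, 3, 4, 5, 6]
print(even_squares_step_by_step(data))
print(even_squares_described(data))
```

Both functions return the same list; the second leaves the indexing and accumulation, the "administrative parts", to the language.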

About the Speaker:

Steven Pemberton is a distinguished researcher in the field of computer science and information technology, with a long and rich history of contributions to the development of the internet and the web. He is affiliated with the Dutch national research centre Centrum Wiskunde & Informatica (CWI) in Amsterdam, The Netherlands, where he conducts research on interaction, declarative programming, and web technologies.
At university he was tutored by Dick Grimsdale, who built the world’s first transistorised computer and who was himself a tutee of Alan Turing. After university, Pemberton, coincidentally, worked in Turing’s old department in Manchester, writing software for the 5th computer in the line of computers Turing had worked on.
Pemberton was the first user of the open internet in Europe when the CWI created the first connection in 1988, and has been involved with the web from its inception, co-designing several web standards, including HTML, CSS, XHTML, XForms, and RDFa. He chairs two groups at W3C.
In addition to his work on the web, Pemberton has also made significant contributions to other areas of computer science, such as the design of programming languages, having co-designed the language that Python is based on, and the study of human-computer interaction. His involvement with ACM includes being editor in chief of The SIGCHI Bulletin, and then ACM interactions for a decade; he has chaired the CHI Conference and he co-founded the Netherlands local SIGCHI group, and chaired several local CHI conferences there.
He has received numerous awards and recognitions for his work, including the ACM SIGCHI Lifetime Service Award and the ACM SIGCHI Lifetime Practice Award.
As a speaker, Pemberton is known for his engaging and informative presentations, which draw on his deep knowledge of computer science and his passion for technology, and cover both social and technological aspects of computing. His talks are always thought-provoking and entertaining, and he has been invited to speak at numerous conferences and events around the world. In 2023 he became an ACM Distinguished Speaker. He is bilingual in English and Dutch.
A fuller bio, videos, and a full list of talks are available on his website: https://www.cwi.nl/~steven

António Coelho visits The Arctic University of Norway to promote immersive teaching and pedagogical innovation

Prof. António Coelho, a lecturer in the Department of Informatics Engineering (DEI) at the Faculty of Engineering of the University of Porto (FEUP), was recently in Tromsø, at the Arctic University of Norway (UiT), to strengthen ties with European partners and explore new teaching methodologies within EUGLOH (European University Alliance for Global Health).

Context and objectives of the visit

The visit, funded by the Erasmus+ programme, was motivated by the need to renew teaching practices by promoting more immersive and collaborative learning environments. As part of this university alliance, António Coelho leads the development of courses at the University of Porto that use virtual reality, simulations and educational digital games as central tools in the teaching-learning process. One of the key concepts he has been implementing is the “Living Lab”: workshops, hackathons and courses run by interdisciplinary teams of students and teachers, with a strong component of co-creation, digital innovation and community service.

Main ideas defended

Safe environments in which to fail: in games you are allowed to fail and try again, something that António Coelho sees as essential in education. This type of environment encourages students to explore, experiment and learn from their mistakes, without fear of failure.

Virtual reality and simulations: these make it possible to create a common virtual classroom, regardless of geographical location, where students and teachers can explore scenarios, make decisions and observe consequences in real time.

Interdisciplinary collaboration: for the professor, bringing together students with different backgrounds (e.g. computer science, health, design, music, arts and humanities) boosts creativity, innovation and aesthetic qualities in their work, while strengthening essential skills in the labour market, such as leadership, communication and teamwork.

Virtual internationalisation: in addition to physical mobility, he points out that virtual mobility through immersive environments can significantly increase the presence and quality of international interaction, overcoming the limitations of traditional communication platforms.

Partnerships and concrete projects

During the visit, António Coelho spoke about the collaboration with EUGLOH, which is promoting various courses based on the Living Lab model, with the participation of the University of Porto and other European institutions, including UiT. Some of the courses mentioned:
“Serious Games as a global health education tool”, starting in autumn 2025, at partner universities such as Ludwig-Maximilians-Universität München and the University of Szeged.
“Putting the students first”, a course to be taught both online and face-to-face across multiple institutions, including Paris-Saclay and UiT, with a focus on “learning how to learn” and on students’ well-being and personal development.

The initiative led by Prof. António Coelho reinforces a vision of modern teaching that is open to risk and experimentation, and in which failure is part of the learning process. “The visit to the Arctic University of Norway was more than an institutional exchange: it was a concrete step towards transforming how we teach, how we learn and how we co-operate in international contexts. It is hoped that we will soon be able to see these ideas applied to FEUP projects as well, with direct benefits for students and teachers,” says the Professor in his reflection on the recent mission.

Miguel Abreu, ProDEI alumnus, wins APPIA’s Best Doctoral Thesis in Artificial Intelligence 2024

By Nuno Teixeira, SICC, FEUP

“The Portuguese Association for Artificial Intelligence (APPIA) award for the Best Doctoral Thesis in Artificial Intelligence of 2024 was awarded to Miguel Abreu, a former student of the Doctoral Programme in Informatics Engineering (ProDEI) at the Faculty of Engineering of the University of Porto (FEUP).

The award-winning thesis, entitled “Symmetry, hierarchical structures and shallow neural networks: Advancing reinforcement learning for humanoids” and developed under the supervision of Luís Paulo Reis (FEUP) and Nuno Lau (University of Aveiro), represents a significant advance in the application of reinforcement learning to humanoid robots, exploiting symmetry, hierarchical structures and shallow neural networks to increase the efficiency and robustness of complex robotic systems.

Miguel Abreu’s contribution was fundamental to the successes of the FC Portugal team, four-time world champions in the RoboCup 3D Humanoid Robot Simulation league (2022, 2023, 2024 and 2025), where he actively participated in the development of advanced algorithms for control, planning and co-operation between robots.

Miguel Abreu currently works at the DLR (German Aerospace Centre) in Munich, where he continues his research into intelligent robotics and autonomous systems.

The award ceremony for the Best PhD Thesis in Artificial Intelligence 2024 took place on 2 October, during the dinner of the 24th edition of EPIA – Portuguese Meeting of Artificial Intelligence, held in Faro.

The prize includes an official certificate and a symbolic monetary value of 1000 euros.

Created in 2007, the APPIA Prize for the Best Thesis was initially awarded biennially, but since 2022 it has become annual, with the aim of highlighting the scientific merit of excellent research in the field of Artificial Intelligence in Portugal.”

DEI Talks | “Software process modeling and test automation: Introducing the Reliable Software Architectures Research Group” by Prof. Přemek Brada

The talk “Software process modeling and test automation: Introducing the Reliable Software Architectures Research Group” will be presented on October 9th, at 15:30, in room B031, and will be moderated by Prof. Ana Paiva (DEI).

Abstract:

“In this talk, I will give an overview of research done by the Reliable Software Architectures Research Group at the University of West Bohemia in Pilsen, Czechia. The focus will be on analysing software process data to detect project management (anti-)patterns, where we’ll discuss the challenges in modeling software process elements in a way that is conducive to mapping onto the information gathered in project management tools. We’ll also touch on the topic of analyzing software implementations to perform advanced verification and testing.”

About the Speaker:

Přemek Brada is an Associate Professor in Software Engineering at the Department of Computer Science and Engineering, University of West Bohemia in Pilsen, Czechia. His research has covered the areas of software architecture consistency, interactive methods of architecture visualization, and software development methodologies including analysis of related process data. He teaches bachelor and master level courses on object-oriented design and modeling, advanced software engineering practices, and also knowledge management. Currently he serves as the head of department, and is a member of the Board of Informatics Europe, the association of European informatics faculties and departments.

GNU Tools Cauldron 2025 brought together international experts at FEUP

From 26 to 28 September, the Faculty of Engineering of the University of Porto (FEUP) hosted the 14th edition of the GNU Tools Cauldron, a leading international technical conference dedicated to the GNU Toolchain and associated open source development tools.

This international meeting was held for the first time in Portugal and brought together around 140 participants from more than a dozen countries, including Canada, Germany, the Czech Republic, the United Kingdom, Ireland, Portugal, the Netherlands, France, India, the United States, Belgium, China, South Africa and Brazil.

An event with history and global impact

Created in 2012, the GNU Tools Cauldron has been organised annually, passing through some of the world’s most prestigious universities, such as the University of Cambridge (UK), Charles University (Czech Republic) and the University of Manchester (UK), and now arriving at the University of Porto. Throughout its history, the event has taken place in cities such as Mountain View, Prague, Cambridge, Manchester, Hebden Bridge, Montreal and Porto. The aim of organising the conference in partnership with higher education institutions is to strengthen the link between the international open source development community and academia, promoting the direct involvement of students and researchers.

This technical conference focuses on the GNU Toolchain – which includes fundamental tools such as gcc and gdb, and utilities and libraries such as binutils and glibc – and associated projects (ltrace, poke, systemtap, valgrind, among others). It is a critical ecosystem for most of the reference Linux distributions (AlmaLinux, CentOS Stream, Debian, Fedora, Gentoo, RHEL, Rocky Linux, SUSE, Oracle Linux), playing a central role in the global supply chain for secure open source software.

Collaboration between industry and academia

The 2025 edition was supported by FEUP’s Department of Informatics Engineering (DEI) as co-organiser, bringing the academic community closer to the people who contribute to the GNU toolchain and other open source software. For three days, software developers, researchers, university professors, engineers and students had the opportunity to attend presentations and debates led by international experts in the field of compilers, toolchains and software language standardisation.

Participants included active contributors to international standard-setting bodies such as ISO C, ISO C++, DWARF, OpenMP, POSIX/IEEE and Rust, contributing directly to the evolution of languages and tools used by millions of programmers around the world.
“It’s a pleasure to be hosting this event for the first time in Portugal and, in particular, in Porto. GNU’s contributions have had a profound impact on teaching, research and technological advancement for the common good,” stressed DEI Director Prof. João Paiva Cardoso at the opening session.

Institutional and corporate support

The development of the GNU toolchain is part of the GNU Project and is supported by the FSF and a worldwide community of programmers and corporate sponsors.
The GNU Tools Cauldron 2025 was sponsored and supported by important international companies and institutions: AdaCore, AMD, ARM, BayLibre, Embecosm, NVIDIA, Open Source Security, Synopsys, Pretalx (conference management software), Pretix (ticketing platform) and FEUP, which co-organised and logistically supported the event.

Event website: https://conf.gnu-tools-cauldron.org/opo25/
Videos of all the event sessions: https://www.youtube.com/playlist?list=PL_GiHdX17WtxuKn7QYme8EfbBS-RKSn0w

DEI Talks | “Networks, networks, and more networks: applications in humanities, data science, and machine learning” by Prof. Ana Bazzan

The talk “Networks, networks, and more networks: applications in humanities, data science, and machine learning” will be presented on October 1st, at 14:45, in room B004, and will be moderated by Prof. Rosaldo Rossetti (DEI).

Abstract:

“It is known that networks or graphs can be used in machine learning and data science to represent and analyze data that has complex relationships. Besides these uses, networks are also relevant to the overall AI agenda in at least two aspects. First, it relates to automated data gathering and language models in the semantic web, since the actual data have to be acquired in some manner in order to form the graphs. Second, it can be used to accelerate learning tasks, as in the case of reinforcement learning. In this talk I present examples of how data is acquired and used in applications in the Humanities (history, storytelling) in order to discover patterns and/or to investigate assumptions. Then, I discuss applications on data science and machine learning, as for instance the use of networks in reinforcement learning, with examples from urban mobility and car to infrastructure communication.”

About the Speaker:

Ana Bazzan is a Full Professor of Computer Science at the Institute of Informatics, Universidade Federal do Rio Grande do Sul (UFRGS), in Porto Alegre, Brazil. Her research focuses on multiagent systems, in particular on agent-based modeling and simulation (ABMS), and multiagent learning for the transportation domain. Since 1996, she has collaborated with various researchers in the application of ABMS and game theory to social science domains, such as the emergence of cooperation, the prisoner’s dilemma and public goods games. In recent years, she has contributed to different topics regarding smart cities, focusing on transportation, as well as on the synergies between multiagent systems, machine learning, and complex systems. In 2014, Bazzan was General Co-chair of AAMAS (the premier conference in the area of autonomous agents and multiagent systems).

Free Software Festival 2025

Next week, from October 3rd to 5th, the Faculty of Engineering of the University of Porto (FEUP) will not just host an event, but a practical demonstration of the future of technology. The Free Software Festival 2025, with free admission, goes beyond the concept of a simple conference. It positions itself as an open and essential lesson for students, educators, and entrepreneurs on one of the most important—and often invisible—pillars of the digital world: Free Software.

In an era dominated by expensive licenses and closed ecosystems, the Free Software Festival (FSL, from its Portuguese name, Festa do Software Livre) serves as a powerful reminder that a more democratic, secure, and flexible alternative exists. But what exactly is the importance of free software, and why is an event like this so crucial for the Portuguese educational and business landscape?

A Lesson in Autonomy and Innovation
At the heart of the free software movement lies a simple yet revolutionary idea: the technology we use should serve us, not the other way around. It is based on four fundamental freedoms: the freedom to use, study, share, and, crucially, modify software. This ability to “look under the hood” transforms a student from a mere consumer of technology into an active creator and problem solver.

For the education system, this represents an immense pedagogical opportunity. Schools and universities can equip their labs with cutting-edge operating systems and programming tools, such as Linux or Blender (for 3D modeling), without spending a single cent on licenses. More importantly, it allows students to explore, deconstruct, and understand the code that powers the digital world, fostering critical thinking and innovation from the ground up. The Free Software Festival embodies this idea, with hands-on workshops where participants can learn to code, protect their online privacy, or take their first steps in Artificial Intelligence, using open tools accessible to all.

The Secret Engine of the Digital Economy
For the business sector, the message is equally clear: free software is not a “second-tier” alternative but the engine that drives technological giants. The internet, as we know it, is largely built on open-source technologies. Adopting free software allows Portuguese companies, from startups to SMEs, to drastically reduce operational costs, but the benefits go far beyond savings. It means technological sovereignty: the ability to adapt software to the exact needs of the business without being dependent on a single vendor and their pricing policies. It also means enhanced security, as the code can be audited by a global community that identifies and fixes vulnerabilities transparently and quickly.

The presence at FSL of entities like ESOP (Association of Portuguese Open Source Software Companies) demonstrates that a vibrant business ecosystem is already thriving in Portugal based on this model. The event thus serves as a bridge, showing future engineers the career opportunities in this sector and entrepreneurs the competitive advantages of a strategic investment in open technology.

An Investment in the Future
In short, the Free Software Festival 2025 is much more than a gathering of enthusiasts. It is an investment in the country’s future. It is living proof that, by embracing the principles of collaboration and open knowledge, Portugal can empower its students to become the innovators of tomorrow and strengthen its companies to compete on a global scale. The class is about to begin, and admission is free.

Check the event program and join us!
https://festa2025.softwarelivre.eu

FSL 2025 is supported by the Department of Informatics Engineering (DEI).

PhD Defense in Informatics Engineering (ProDEI): “Generative models for soccer”

Candidate:
Tiago Filipe Mendes Neves

Date, Time and Location:
16 September 2025, 15:30, Sala de Atos, Faculdade de Engenharia da Universidade do Porto

President of the Jury:
Pedro Nuno Ferreira da Rosa da Cruz Diniz (PhD), Full Professor, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto

Members:
Keisuke Fujii (PhD), Associate Professor, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University, Japan;
Jesse Jon Davis (PhD), Full Professor, Department of Computer Science, Faculty of Engineering Science, Katholieke Universiteit Leuven, Belgium;
Luís Paulo Gonçalves dos Reis (PhD), Associate Professor with Habilitation, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto;
João Pedro Carvalho Leal Mendes Moreira (PhD), Associate Professor, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto (Supervisor).

The thesis was co-supervised by Luís Jorge Machado da Cunha Meireles (PhD), Senior Psychologist & Data Scientist, FC Porto.

Abstract:

Self-supervised large models that disrupt domains such as language, vision, and biology are transforming the world. However, these generative models that learn the underlying data distribution do not perform at the same level on all tasks. For example, Large Language Models (LLMs) do not yet have concrete applicability in soccer analytics. The models lack reasoning capabilities to provide concrete and actionable insights that can compete with the wide range of case-specific metrics within soccer analytics. While there have been some studies exploring the applicability of generative models in soccer, no study aimed for the moonshot of building a complete self-supervised learning model for soccer event data. Let’s consider the individual events (each shot, pass, tackle, …) in a soccer match the “words” that describe what is happening. We can consider each possession a “sentence,” each game an “essay,” and event data as a whole a “language.” By working within this framework, we have all the tools to build a self-supervised model in the same image as LLMs. The goal of this thesis is to build a foundation self-supervised model for soccer event data – termed Large Events Model (LEM) – and demonstrate its real-world applicability and generality in solving a wide range of tasks, such as simulation and modeling, that would otherwise require multiple different approaches. We propose three approaches to building LEMs: a chain of classifiers, causal mask modeling, and sequential language modeling with transformers. First, the chain of classifiers provides the first generative model that models all aspects of event data without posing restrictions on event types, reaching a level of performance that allows large-scale simulation of soccer matches. Then, we investigate two alternative approaches to remove some of the constraints of the first approach. 
The causal mask modeling approach using multilayer perceptrons reaches state-of-the-art performance on several of our proposed benchmarks, providing a set of application-ready models to solve a wide range of soccer analytics tasks. We explore a wide range of applications, from automated strategy search with reinforcement learning to risk-reward behaviors of soccer players. More than a dozen use cases for LEMs are presented in this thesis. The implications of our work are far-reaching. LEMs have the potential to become the operating system for event data in soccer analytics. They will transform the way clubs work, with easier access to machine learning models that would otherwise require tremendous modeling effort. With LEMs, the barrier to entry will lower significantly as any club in the world can access a model capable of solving its most relevant problems.

Keywords: generative models; foundation models; sports analytics; deep learning applications; simulation; soccer.

PhD Defense in Informatics Engineering (ProDEI): “Text Information Retrieval in Tetun”

Candidate:
Gabriel de Jesus

Date, Time and Location:
1 September 2025, 14:30, Sala de Atos, Faculdade de Engenharia da Universidade do Porto

President of the Jury:
Rui Filipe Lima Maranhão de Abreu (PhD), Full Professor, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto

Members:
Arjen P. de Vries (PhD), Full Professor at the Institute for Computing and Information Sciences of the Radboud Universiteit, Nijmegen, The Netherlands;
Bruno Emanuel da Graça Martins (PhD), Associate Professor, Department of Electrical and Computer Engineering, Instituto Superior Técnico da Universidade de Lisboa;
Henrique Daniel de Avelar Lopes Cardoso (PhD), Associate Professor, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto;
Sérgio Sobral Nunes (PhD), Associate Professor, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto (Supervisor).

Abstract:

Ensuring access to information in all languages is crucial for bridging disparities in communities’ participation in the digital age and fostering a more inclusive and equitable society, particularly for speakers of low-resource languages. However, enabling such access remains a significant challenge for many of these communities. Tetun, a language that transitioned from a dialect to one of Timor-Leste’s official languages when the country restored its independence in 2002, faces similar challenges. According to the 2015 census, Tetun is spoken by approximately 79% of the country’s population of 1.18 million. Despite its official status, Tetun remains underserved in language technology. Specifically, information retrieval-based solutions for the language do not exist, making it challenging to find relevant information through text-based search in Tetun on the internet and digital platforms.
This work tackles these challenges by investigating retrieval strategies for text-based search that can enable the application of information retrieval techniques to develop search solutions for Tetun, with a specific focus on the ad-hoc text retrieval task. Given that language-specific algorithms, tools, and document collections for Tetun were previously unavailable, this work began by creating these foundational resources, which serve as contributions relevant to information retrieval and natural language processing domains. These resources include a tokenizer, a language identification model, a stemmer, a stopword list, a document collection, a test collection, baselines for the ad-hoc text retrieval task, and a search log dataset. The contributions to information retrieval for low-resource languages include: (1) A data collection pipeline tailored for low-resource languages to streamline the construction of textual data from the web; (2) A human-in-the-loop methodology for annotating, processing, and constructing a dataset well-suited for a variety of information retrieval and natural language processing tasks; (3) A novel network-based approach for stopword detection; (4) Methodologies for developing a stemmer, designed for a language heavily influenced by loanwords, and the construction of a ground truth set for evaluating stemmer performance; (5) A detailed approach for constructing a test collection to evaluate the effectiveness of retrieval systems; (6) A methodology for establishing a robust baseline for the ad-hoc text retrieval task; and (7) Document contextualization and dual-parameter tuning strategies for hybrid text retrieval. The results from this work contribute to the development of technologies associated with the computational processing of Tetun, address gaps in its linguistic resources, and achieve impactful outcomes that elevate Tetun’s status. These advancements open new opportunities for future research and innovation. 
Moreover, this work introduces promising methodologies that can be adapted to other languages facing similar challenges, thereby contributing to the broader advancement of information retrieval for low-resource languages.