ECMLPKDD 2025 – European Conference on Machine Learning and Knowledge Discovery in Databases

The European Conference on Machine Learning and Knowledge Discovery in Databases (ECMLPKDD), the largest European event dedicated to Machine Learning and one of the most important worldwide in the field of Artificial Intelligence, took place in Porto between 15 and 19 September 2025.

The organising team included several students and researchers from the Department of Informatics Engineering (DEI) of the Faculty of Engineering of the University of Porto (FEUP), notably Carlos Soares (General Co-Chair) and João Mendes Moreira (Workshops Co-Chair).

Held at Alfândega do Porto, the conference brought together more than 1,300 participants from 60 countries, including around 450 students. The programme featured seven keynote speeches by some of the world’s leading researchers in the field, including Pedro Domingos (University of Washington), Cynthia Rudin (Duke University), Mirella Lapata (University of Edinburgh), Francisco Herrera (University of Granada), Sašo Džeroski (Jožef Stefan Institute) and Nuria Oliver (ELLIS Alicante – Institute of Humanity-Centric AI).

The scientific programme included around 400 accepted papers, in addition to numerous presentations at the 32 workshops associated with the event, reflecting the vitality and diversity of the European Machine Learning and Data Science community.

ECMLPKDD 2025 was supported by several national and international companies, including BNP Paribas, EDF, Google, ASML, NOS, NEC, Amazon, AstraZeneca and Banco de Portugal, as well as an institutional partnership with Porto Digital.

More information on the ECMLPKDD website.

DEI Talks | “smtgcc: Using an SMT solver to find bugs in GCC” by Krister Walfridsson

The talk “smtgcc: Using an SMT solver to find bugs in GCC” will be presented by Krister Walfridsson on December 4th, at 16:00, online:

Join the meeting 
Meeting ID: 373 912 942 228 7
Passcode: XS9M8dT3

Abstract:

“SMT solvers are increasingly effective for finding compiler bugs and validating optimizations. This talk presents smtgcc, a translation-validation tool for GCC. It is similar to Alive2 for LLVM, but smtgcc’s approach diverges from Alive2 because GCC and LLVM follow different design choices. I will explain how smtgcc works and discuss issues in formalizing the semantics of GIMPLE, GCC’s IR.”
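As a rough illustration of the translation-validation idea mentioned in the abstract (not smtgcc's actual method or code), the sketch below checks whether a rewritten function agrees with its source on every input. A tool of this kind would ask an SMT solver to prove the equivalence symbolically; here plain exhaustive enumeration over 8-bit values stands in for the solver, and every function name is hypothetical, made up for this example.

```python
from typing import Callable, Optional

MASK = 0xFF  # model 8-bit unsigned arithmetic so every input can be enumerated


def src_double(x: int) -> int:
    """Hypothetical 'before' code: multiply by 2 with 8-bit wraparound."""
    return (x * 2) & MASK


def tgt_shift(x: int) -> int:
    """Hypothetical valid rewrite: shift left by 1 with 8-bit wraparound."""
    return (x << 1) & MASK


def src_double_then_halve(x: int) -> int:
    """Hypothetical 'before' code: (x * 2) / 2, where the multiply may wrap."""
    return ((x * 2) & MASK) // 2


def tgt_identity(x: int) -> int:
    """Hypothetical invalid rewrite: assumes (x * 2) / 2 is always x."""
    return x


def find_counterexample(
    src: Callable[[int], int], tgt: Callable[[int], int]
) -> Optional[int]:
    """Return the smallest 8-bit input where src and tgt differ, or None."""
    for x in range(MASK + 1):
        if src(x) != tgt(x):
            return x
    return None


valid = find_counterexample(src_double, tgt_shift)
invalid = find_counterexample(src_double_then_halve, tgt_identity)
```

In this toy setting the first rewrite has no counterexample, while the second fails as soon as the multiplication wraps around. Enumeration only works because the input space is tiny; a solver-based check is what makes the approach practical for realistic bit widths.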

About the Speaker:

Krister Walfridsson became involved with the GCC project while studying at university in the mid-1990s. Since completing his studies, he has worked with both compilers and operating systems in various embedded environments. Most recently, he spent 10 years at Arm as a principal compiler engineer in the Mali GPU team. He is currently taking a few years off to work on personal projects and to dance.

DEI Talks | “Knowledge Graphs + AI: The Evolution of Automated GitHub Issue Resolution” by Prof. He Ye (University College London)

The talk entitled “Knowledge Graphs + AI: The Evolution of Automated GitHub Issue Resolution” will be presented by Prof. He Ye on November 17th, at 14:30, in room B008, moderated by Prof. Alexandra Mendes (DEI).

Abstract:

“AI coding agents are becoming increasingly capable, achieving strong results on benchmarks such as SWE-bench. However, most still struggle with real-world challenges such as issue reproduction, precise context retrieval from large codebases, and the high cost of LLMs. In this talk, I will introduce our recent code agent, Prometheus — a knowledge graph-powered, multi-agent system designed to tackle GitHub issues in practice. Prometheus transforms entire repositories into a unified knowledge graph stored in Neo4j for scalable and structured reasoning. This enables precise, cross-language context retrieval, allowing large language models to generate accurate and efficient fixes. Prometheus delivers robust performance, resolving diverse issues across seven programming languages. I will show how combining LLMs with knowledge graphs can advance automated issue resolution beyond today’s benchmark-driven limits. We have recently transitioned this research into an off-the-shelf product that helps industry resolve software issues automatically.”

About the Speaker:

He Ye is an Assistant Professor at University College London. She previously worked as a Postdoctoral Researcher at Carnegie Mellon University and received her PhD from KTH Royal Institute of Technology. Her research centers on developing the next generation of code agents to automate software engineering tasks, with a focus on codebase context retrieval, automated issue resolution, and code agent memory construction. Beyond academia, she is the co-founder of EuniAI, a startup committed to turning research into real-world solutions that help developers address practical software challenges.

DEI Talks | “Energy-awareness in compute acceleration: The role of FPGAs” by Prof. Shreejith Shanker

The talk entitled “Energy-awareness in compute acceleration: The role of FPGAs” will be presented by Prof. Shreejith Shanker on October 30th, at 11:30, in room B012, and will be moderated by Prof. Tiago Carvalho (DEI).

Abstract:

“The talk will cover a set of projects that my team at TCD is working on, spanning embedded and distributed systems to high-performance media workflows, and how FPGAs are enabling an energy-performance trade-off in these applications.”

About the Speaker:

Dr. Shreejith Shanker is an Assistant Professor of Reconfigurable Computing at Trinity College Dublin, Ireland and leads the research group on reconfigurable architectures, accelerators and workflows. His research interests include reconfigurable and adaptive computing architectures, in-network computing, post-production media workflows, design automation tools and distributed embedded systems, with a focus on performance-energy trade-off and hardware-software codesign approaches.

DEI Talks | “Declarative Programming” by Steven Pemberton (ACM Distinguished Speaker)

The talk “Declarative Programming” will be delivered by Steven Pemberton, a renowned researcher in Computer Science and Information Technology and an ACM Distinguished Speaker, on October 23rd at 10:00, in room B033, and will be moderated by Prof. João Ferreira (DEI). Admission is free.

Abstract:

“In the 50s, when the first programming languages were designed, computers cost millions, and relatively, programmers were almost free. Those programming languages therefore reflected that relationship: it didn’t matter if it took a long time to program, as long as the resulting program ran as fast as possible.
Now, that relationship has been reversed, which I call Moore’s Switch: compared to the cost of programmers, computers are almost free.
And yet we are still programming in descendants of the programming languages from the 50s: we are still telling the computers step by step how to solve the problem.
Declarative programming is a new approach to applications: rather than describing exactly how to reach the solution, it describes what the solution should look like, and leaves more of the administrative parts of the program to the computer.
One of the few declarative languages available is XForms, an XML-based language that, despite what its name might suggest, is not only about forms. Large projects, at large organisations such as the National Health Service, the BBC and Xerox, have shown that by using XForms, programming time and the cost of applications can be reduced to a tenth and sometimes even much more.”

About the Speaker:

Steven Pemberton is a distinguished researcher in the field of computer science and information technology, with a long and rich history of contributions to the development of the internet and the web. He is affiliated with the Dutch national research centre Centrum Wiskunde & Informatica (CWI) in Amsterdam, The Netherlands, where he conducts research on interaction, declarative programming, and web technologies.
At university he was tutored by Dick Grimsdale who built the world’s first transistorised computer, and who was himself a tutee of Alan Turing. After university, Pemberton — coincidentally — worked in Turing’s old department in Manchester, writing software for the 5th computer in the line of computers Turing had worked on.
Pemberton was the first user of the open internet in Europe when the CWI created the first connection in 1988, and has been involved with the web from its inception, co-designing several web standards, including HTML, CSS, XHTML, XForms, and RDFa. He chairs two groups at W3C.
In addition to his work on the web, Pemberton has also made significant contributions to other areas of computer science, such as the design of programming languages, having co-designed the language that Python is based on, and the study of human-computer interaction. His involvement with ACM includes being editor in chief of The SIGCHI Bulletin, and then ACM interactions for a decade; he has chaired the CHI Conference and he co-founded the Netherlands local SIGCHI group, and chaired several local CHI conferences there.
He has received numerous awards and recognitions for his work, including the ACM SIGCHI Lifetime Service Award and the ACM SIGCHI Lifetime Practice Award.
As a speaker, Pemberton is known for his engaging and informative presentations, which draw on his deep knowledge of computer science and his passion for technology, and cover both social and technological aspects of computing. His talks are always thought-provoking and entertaining, and he has been invited to speak at numerous conferences and events around the world. In 2023 he became an ACM Distinguished Speaker. He is bilingual in English and Dutch.
A fuller bio, videos, and a full list of talks are available on his website: https://www.cwi.nl/~steven

DEI Talks | “Software process modeling and test automation: Introducing the Reliable Software Architectures Research Group” by Prof. Přemek Brada

The talk “Software process modeling and test automation: Introducing the Reliable Software Architectures Research Group” will be presented by Prof. Přemek Brada on October 9th, at 15:30, in room B031, and will be moderated by Prof. Ana Paiva (DEI).

Abstract:

“In this talk, I will give an overview of research done by the Reliable Software Architectures Research Group at the University of West Bohemia in Pilsen, Czechia. The focus will be on analysing software process data to detect project management (anti-)patterns, where we’ll discuss the challenges in modeling software process elements in a way that is conducive to mapping onto the information gathered in project management tools. We’ll also touch on the topic of analyzing software implementations to perform advanced verification and testing.”

About the Speaker:

Přemek Brada is an Associate Professor in Software Engineering at the Department of Computer Science and Engineering, University of West Bohemia in Pilsen, Czechia. His research has covered the areas of software architecture consistency, interactive methods of architecture visualization, and software development methodologies including analysis of related process data. He teaches bachelor and master level courses on object-oriented design and modeling, advanced software engineering practices, and also knowledge management. Currently he serves as the head of department, and is a member of the Board of Informatics Europe, the association of European informatics faculties and departments.

GNU Tools Cauldron 2025 brought together international experts at FEUP

From 26 to 28 September, the Faculty of Engineering of the University of Porto (FEUP) hosted the 14th edition of the GNU Tools Cauldron, a leading international technical conference dedicated to the GNU Toolchain and associated open source development tools.

This international meeting was held for the first time in Portugal and brought together around 140 participants from more than a dozen countries, including Canada, Germany, the Czech Republic, the United Kingdom, Ireland, Portugal, the Netherlands, France, India, the United States, Belgium, China, South Africa and Brazil.

An event with history and global impact

Created in 2012, the GNU Tools Cauldron has been organised annually, passing through some of the world’s most prestigious universities, such as the University of Cambridge (UK), Charles University (Czech Republic) and the University of Manchester (UK), and now arriving at the University of Porto. Throughout its history, the event has taken place in cities such as Mountain View, Prague, Cambridge, Manchester, Hebden Bridge, Montreal and Porto. The aim of organising the conference in partnership with higher education institutions is to strengthen the link between the international open source development community and academia, promoting the direct involvement of students and researchers.

This technical conference focuses on the GNU Toolchain – which includes fundamental tools such as gcc and gdb, and utilities and libraries such as binutils and glibc – and associated projects (ltrace, poke, systemtap, valgrind, among others). It is a critical ecosystem for most of the reference Linux distributions (AlmaLinux, CentOS Stream, Debian, Fedora, Gentoo, RHEL, Rocky Linux, SUSE, Oracle Linux), playing a central role in the global supply chain for secure open source software.

Collaboration between industry and academia

The 2025 edition was supported by FEUP’s Department of Informatics Engineering (DEI) as co-organiser, bringing the academic community closer to the people who contribute to the GNU toolchain and other open source software. For three days, software developers, researchers, university professors, engineers and students had the opportunity to attend presentations and debates led by international experts in the fields of compilers, toolchains and programming language standardisation.

Participants included active contributors to international standard-setting bodies such as ISO C, ISO C++, DWARF, OpenMP, POSIX/IEEE and Rust, contributing directly to the evolution of languages and tools used by millions of programmers around the world.
“It’s a pleasure to be hosting this event for the first time in Portugal and, in particular, in Porto. GNU’s contributions have had a profound impact on teaching, research and technological advancement for the common good,” stressed DEI Director Prof João Paiva Cardoso at the opening session.

Institutional and corporate support

The development of the GNU toolchain is part of the GNU Project and is supported by the FSF and a worldwide community of programmers and corporate sponsors.
The GNU Tools Cauldron 2025 was sponsored and supported by important international companies and institutions: AdaCore, AMD, ARM, BayLibre, Embecosm, NVIDIA, Open Source Security, Synopsys, Pretalx (conference management software), Pretix (ticketing platform) and FEUP, which co-organised and logistically supported the event.

Event website: https://conf.gnu-tools-cauldron.org/opo25/
Videos of all the event sessions: https://www.youtube.com/playlist?list=PL_GiHdX17WtxuKn7QYme8EfbBS-RKSn0w

DEI Talks | “Networks, networks, and more networks: applications in humanities, data science, and machine learning” by Prof. Ana Bazzan

The talk “Networks, networks, and more networks: applications in humanities, data science, and machine learning” will be presented by Prof. Ana Bazzan on October 1st, at 14:45, in room B004, moderated by Prof. Rosaldo Rossetti (DEI).

Abstract:

“It is known that networks or graphs can be used in machine learning and data science to represent and analyze data that has complex relationships. Besides these uses, networks are also relevant to the overall AI agenda in at least two respects. First, they relate to automated data gathering and language models in the semantic web, since the actual data have to be acquired in some manner in order to form the graphs. Second, they can be used to accelerate learning tasks, as in the case of reinforcement learning. In this talk I present examples of how data is acquired and used in applications in the Humanities (history, storytelling) in order to discover patterns and/or to investigate assumptions. Then, I discuss applications in data science and machine learning, such as the use of networks in reinforcement learning, with examples from urban mobility and car-to-infrastructure communication.”

About the Speaker:

Ana Bazzan is a Full Professor of Computer Science at the Institute of Informatics, Universidade Federal do Rio Grande do Sul (UFRGS), in Porto Alegre, Brazil. Her research focuses on multiagent systems, in particular on agent-based modeling and simulation (ABMS), and multiagent learning for the transportation domain. Since 1996, she has collaborated with various researchers in the application of ABMS and game theory to social science domains, such as the emergence of cooperation, the prisoner’s dilemma and public goods games. In recent years, she has contributed to different topics regarding smart cities, focusing on transportation, as well as on the synergies between multiagent systems, machine learning, and complex systems. In 2014, Bazzan was General Co-chair of AAMAS (the premier conference in the area of autonomous agents and multiagent systems).

Free Software Festival 2025

Next week, from October 3rd to 5th, the Faculty of Engineering of the University of Porto (FEUP) will host not just an event but a practical demonstration of the future of technology. The Free Software Festival 2025, with free admission, goes beyond the concept of a simple conference. It positions itself as an open and essential lesson for students, educators, and entrepreneurs on one of the most important, and often invisible, pillars of the digital world: Free Software.

In an era dominated by expensive licenses and closed ecosystems, the Free Software Festival (FSL, from its Portuguese name, Festa do Software Livre) serves as a powerful reminder that a more democratic, secure, and flexible alternative exists. But what exactly is the importance of free software, and why is an event like this so crucial for the Portuguese educational and business landscape?

A Lesson in Autonomy and Innovation
At the heart of the free software movement lies a simple yet revolutionary idea: the technology we use should serve us, not the other way around. It is based on four fundamental freedoms: the freedom to use, study, share, and, crucially, modify software. This ability to “look under the hood” transforms a student from a mere consumer of technology into an active creator and problem solver.

For the education system, this represents an immense pedagogical opportunity. Schools and universities can equip their labs with cutting-edge operating systems and programming tools, such as Linux or Blender (for 3D modeling), without spending a single cent on licenses. More importantly, it allows students to explore, deconstruct, and understand the code that powers the digital world, fostering critical thinking and innovation from the ground up. The Free Software Festival embodies this idea, with hands-on workshops where participants can learn to code, protect their online privacy, or take their first steps in Artificial Intelligence, using open tools accessible to all.

The Secret Engine of the Digital Economy
For the business sector, the message is equally clear: free software is not a “second-tier” alternative but the engine that drives technological giants. The internet, as we know it, is largely built on open-source technologies. Adopting free software allows Portuguese companies, from startups to SMEs, to drastically reduce operational costs, but the benefits go far beyond savings. It means technological sovereignty: the ability to adapt software to the exact needs of the business without being dependent on a single vendor and their pricing policies. It also means enhanced security, as the code can be audited by a global community that identifies and fixes vulnerabilities transparently and quickly.

The presence at FSL of entities like ESOP (Association of Portuguese Open Source Software Companies) demonstrates that a vibrant business ecosystem is already thriving in Portugal based on this model. The event thus serves as a bridge, showing future engineers the career opportunities in this sector and entrepreneurs the competitive advantages of a strategic investment in open technology.

An Investment in the Future
In short, the Free Software Festival 2025 is much more than a gathering of enthusiasts. It is an investment in the country’s future. It is living proof that, by embracing the principles of collaboration and open knowledge, Portugal can empower its students to become the innovators of tomorrow and strengthen its companies to compete on a global scale. The class is about to begin, and admission is free.

Check the event program and join us!
https://festa2025.softwarelivre.eu

FSL 2025 is supported by the Department of Informatics Engineering (DEI).

PhD Defense in Informatics Engineering (ProDEI): “Generative models for soccer”

Candidate:
Tiago Filipe Mendes Neves

Date, Time and Location:
16 September 2025, 15:30, Sala de Atos, Faculdade de Engenharia da Universidade do Porto

President of the Jury:
Pedro Nuno Ferreira da Rosa da Cruz Diniz (PhD), Full Professor, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto

Members:
Keisuke Fujii (PhD), Associate Professor, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University, Japan;
Jesse Jon Davis (PhD), Full Professor, Department of Computer Science, Faculty of Engineering Science, Katholieke Universiteit Leuven, Belgium;
Luís Paulo Gonçalves dos Reis (PhD), Associate Professor with Habilitation, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto;
João Pedro Carvalho Leal Mendes Moreira (PhD), Associate Professor, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto (Supervisor).

The thesis was co-supervised by Luís Jorge Machado da Cunha Meireles (PhD), Senior Psychologist & Data Scientist, FC Porto.

Abstract:

Self-supervised large models that disrupt domains such as language, vision, and biology are transforming the world. However, these generative models that learn the underlying data distribution do not perform at the same level on all tasks. For example, Large Language Models (LLMs) do not yet have concrete applicability in soccer analytics. The models lack the reasoning capabilities to provide concrete and actionable insights that can compete with the wide range of case-specific metrics within soccer analytics. While there have been some studies exploring the applicability of generative models in soccer, no study has aimed for the moonshot of building a complete self-supervised learning model for soccer event data. Let’s consider the individual events (each shot, pass, tackle, …) in a soccer match to be the “words” that describe what is happening. We can consider each possession a “sentence,” each game an “essay,” and event data as a whole a “language.” By working within this framework, we have all the tools to build a self-supervised model in the image of LLMs. The goal of this thesis is to build a foundation self-supervised model for soccer event data – termed Large Events Model (LEM) – and demonstrate its real-world applicability and generality in solving a wide range of tasks, such as simulation and modeling, that would otherwise require multiple different approaches. We propose three approaches to building LEMs: a chain of classifiers, causal mask modeling, and sequential language modeling with transformers. First, the chain of classifiers provides the first generative model that models all aspects of event data without posing restrictions on event types, reaching a level of performance that allows large-scale simulation of soccer matches. Then, we investigate two alternative approaches to remove some of the constraints of the first approach. The causal mask modeling approach using multilayer perceptrons reaches state-of-the-art performance on several of our proposed benchmarks, providing a set of application-ready models to solve a wide range of soccer analytics tasks. We explore a wide range of applications, from automated strategy search with reinforcement learning to risk-reward behaviors of soccer players. More than a dozen use cases for LEMs are presented in this thesis. The implications of our work are far-reaching. LEMs have the potential to become the operating system for event data in soccer analytics. They will transform the way clubs work, with easier access to machine learning models that would otherwise require tremendous modeling effort. With LEMs, the barrier to entry will lower significantly as any club in the world can access a model capable of solving its most relevant problems.

Keywords: generative models; foundation models; sports analytics; deep learning applications; simulation; soccer.
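
To make the “events as words, possessions as sentences” framing from the abstract more concrete, here is a minimal sketch of next-event prediction on toy data. It is not one of the thesis's models: a simple bigram count table stands in for the chain of classifiers, masked MLPs and transformers described above, and the event names and possession sequences are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each inner list is one possession (a "sentence"),
# each string is one event (a "word"). Real event data carries many more
# fields (location, player, time), which this sketch ignores.
possessions = [
    ["pass", "pass", "shot"],
    ["pass", "tackle"],
    ["pass", "pass", "pass", "shot"],
]

BOS, EOS = "<start>", "<end>"  # markers for possession start and end


def train_bigram(seqs):
    """Count how often each event follows each other event."""
    counts = defaultdict(Counter)
    for seq in seqs:
        tokens = [BOS, *seq, EOS]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def next_event_probs(counts, prev):
    """Turn the counts after `prev` into a probability distribution."""
    total = sum(counts[prev].values())
    return {event: n / total for event, n in counts[prev].items()}


model = train_bigram(possessions)
probs_after_pass = next_event_probs(model, "pass")
probs_at_start = next_event_probs(model, BOS)
```

The training signal comes entirely from the sequences themselves (each event is the label for the one before it), which is the self-supervised aspect the abstract refers to. Sampling repeatedly from such a distribution gives a very crude form of match simulation.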