PhD Defence in Informatics Engineering: “Intelligent Ticket Management Assistant for Helpdesk Operations” – DEI

Candidate:

Leonardo da Silva Ferreira

Date, Time and Location:

13th of June 2025, 9:30, Sala de Atos, Faculdade de Engenharia da Universidade do Porto

President of the Jury:

Pedro Nuno Ferreira da Rosa da Cruz Diniz, PhD, Full Professor, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto

Members:

Pedro Manuel Henriques da Cunha Abreu, PhD, Associate Professor with habilitation, Department of Informatics Engineering, Faculdade de Ciências e Tecnologia da Universidade de Coimbra;

Paulo Jorge Freitas de Oliveira Novais, PhD, Full Professor, Department of Computer Science, Escola de Engenharia da Universidade do Minho;

Carlos Manuel Milheiro de Oliveira Pinto Soares, PhD, Associate Professor, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto;

Ana Paula Cunha da Rocha, PhD, Associate Professor, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto;

Daniel Augusto Gama de Castro Silva, PhD, Assistant Professor, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto (Supervisor).

The thesis was co-supervised by Professor Mikel Uriarte Itzazelaia, Associate Professor at the Escuela de Ingeniería de Bilbao, Universidad del País Vasco.

Abstract:

With the dynamic evolution of the internet, particularly in domains such as multimedia services, cloud computing, internet of things, virtualization, and artificial intelligence, companies have witnessed significant expansion in their market and services. However, this growth has also exposed numerous vulnerabilities that threaten the confidentiality, integrity, and availability of organizational and personal data. As information technology analysts work to address security system alerts, artificial intelligence has introduced new avenues for breaching security, ranging from simple, low-cost methods to highly sophisticated attacks. Low-cost approaches include phishing and password spraying, which exploit human error and weak password practices. In contrast, more complex threats include advanced persistent attacks and zero-day exploits, which require significant expertise and resources, often disrupting critical systems. Many organizations rely on cybersecurity helpdesk centers, internal or outsourced, to manage incidents. However, these centers often struggle to respond effectively due to data overload and a lack of qualified operators.

This dissertation addresses the shortage of skilled operators and the high volume of incidents in helpdesk operations by developing a ticket management assistant to support human operators in resolving incidents. The framework integrates a context-aware recommender system that identifies the fastest analyst-procedure pair for each incident and continually improves with each treatment carried out. To ensure data privacy, this recommender system is trained using artificial data generated by a custom synthetic data generator. Furthermore, this thesis explores the possibility of enhancing this assistant with automated machine learning functionalities to predict incoming tickets. This feature could help managers anticipate workloads and proactively adjust the composition of the security teams.

The development of this framework is supported by a collaboration with a cybersecurity company, S21sec, which provides anonymized historical incident treatment data structures and taxonomies. However, synthetic data generation techniques are essential because the shared dataset lacks granular information on incident resolution and related parameters, and because the data must remain private. The implemented generator builds artificial datasets that mimic the distributions observed in the real dataset while emulating real-world behaviours, including ticket prioritization, scheduling, and treatment.

The artificial data generator is evaluated for its effectiveness in replicating real-world datasets using similarity measures such as Hellinger distance and Kullback-Leibler divergence. Furthermore, several ticket scheduling scenarios are explored, varying the number of operators and their distribution across three work shifts. The results demonstrate that this framework can replicate ticket distributions and treatment durations observed in real datasets. Additionally, it allows for the simulation of real-world helpdesk operations, providing a solid foundation for exploring diverse operational contexts without compromising privacy. The analysis of the ticket scheduling consistently shows that scenarios characterized by a high shift imbalance and fewer operators lead to longer wait times and more tickets scheduled for later treatment.
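As an illustration of the two similarity measures named above, the following is a minimal sketch and not the thesis implementation. It compares two discrete distributions, here hypothetical ticket counts per category for a real and a synthetic dataset. The category counts, variable names, and the smoothing constant `eps` are assumptions made for the example.

```python
import math

def normalize(counts):
    """Turn raw counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) in nats.

    Terms with p_i = 0 contribute nothing; q_i is floored at eps (an assumption
    of this sketch) so that an empty bin in q does not cause a division by zero.
    """
    return sum(a * math.log(a / max(b, eps)) for a, b in zip(p, q) if a > 0)

# Hypothetical ticket counts per category (made-up numbers, for illustration only).
real_counts = [120, 45, 30, 5]
synthetic_counts = [115, 50, 28, 7]

p = normalize(real_counts)
q = normalize(synthetic_counts)

print(f"Hellinger distance: {hellinger(p, q):.4f}")
print(f"KL divergence:      {kl_divergence(p, q):.4f}")
```

Both values approach zero as the synthetic distribution gets closer to the real one. The Hellinger distance is symmetric and bounded by 1, whereas the Kullback-Leibler divergence is asymmetric and unbounded, which is one reason the two are often reported together.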

The recommender system is assessed from two perspectives: scalability and impact on ticket treatment. Scalability is measured on test datasets of different sizes and with different numbers of operators, using metrics such as the average recommendation time and memory usage. The impact on ticket treatment is examined through improvements in the time tickets wait before being allocated to an operator and in the response time required for their resolution, under different degrees of recommendation acceptance. The results indicate that the number of operators the recommender system uses has a slightly larger impact on its scalability than the number of test tickets: both show a similar linear growth pattern in these metrics, but the number of operators has a larger slope. Integrating this recommender system into the ticket treatment reduced the average response time by 37.9% to 45.1% and the average wait time by 62.2% to 63.2%, assuming operators always accept the recommendations. With varying recommendation acceptance rates, the average wait time remains constant, while the response time improvement ranges from 0.4% to 11.7%.

The potential application of automated machine learning for predictive analysis is explored through a case study, comparing the system’s recommended team dimensionality decisions with expected outcomes. The case study evaluates the system based on prediction accuracy and its ability to suggest team size adjustments. Among the tested dataset distributions, models trained on three years of data outperformed those trained on four years, showing a better mean absolute error using real data on ticket frequency throughout the year. Regarding team dimensionality recommendations, including hiring or dismissing operators, the tool based on automated machine learning frequently proposed decisions closely aligned with those that could have been proposed in the same period.

Collectively, these results show that the proposed framework can optimize ticket treatment workflows in real-world applications, leading to more efficient use of resources and reduced operational delays. Furthermore, its ability to simulate real-world operations without compromising privacy allows security operations centers to test several scenarios and refine their strategies.

Keywords: Helpdesk; Ticket; Cybersecurity; Synthetic Data; Recommendation Systems.