Entity-Relationship Retrieval over the Web

Lecture DEI Series

(together with the Language Processing and Information Extraction course from ProDEI)

Date: December 19th

Time: 17:00

Room: I-105

Speaker: Pedro Saleiro

Affiliation: University of Chicago, USA


Entity-Relationship (E-R) Search is a complex case of Entity Search where the goal is to search for multiple unknown entities and the relationships connecting them. We assume that an E-R query can be decomposed into a sequence of sub-queries, each containing keywords related to a specific entity or relationship. We adopt a probabilistic formulation of the E-R search problem. By creating specific representations for entities (e.g. context terms) and for pairs of entities (i.e. relationships), it is possible to create a graph of probabilistic dependencies between sub-queries and entity and relationship representations. To the best of our knowledge, this represents the first probabilistic model of E-R search. We propose and develop a novel supervised Early Fusion-based model for E-R search, the Entity-Relationship Dependence Model (ERDM). It uses a Markov Random Field to model term dependencies of E-R sub-queries and entity/relationship documents. We performed experiments with more than 800M entity and relationship extractions from ClueWeb-09-B with FACC1 entity linking. We obtained promising results using 3 different query collections comprising 469 E-R queries, with results showing that it is possible to perform E-R search without using fixed and pre-defined entity and relationship types, enabling a wide range of queries to be addressed.

Short Bio

Pedro Saleiro is currently working on methods to detect, audit and reduce bias and discrimination in machine learning models, especially in public policy problems. He is working with Rayid Ghani and his fantastic team as a postdoc at the University of Chicago. His research interests lie at the intersection of Data Science and Machine Learning, Information Retrieval and NLP + Knowledge Graphs, and other related AI subdomains.


