Neuro-Symbolic Representations for Information Retrieval
July 23, full-day tutorial, on-site
Speakers:
Laura Dietz (University of New Hampshire) / Hannah Bast (University of Freiburg) / Shubham Chatterjee (University of Glasgow) / Jeffrey Dalton (University of Glasgow) / Jian-Yun Nie (University of Montreal) / Rodrigo Nogueira (State University of Campinas)
Abstract:
This tutorial will provide an overview of recent advances in neuro-symbolic approaches to information retrieval. A decade ago, knowledge graph and semantic annotation technology led to active research on how best to leverage symbolic knowledge. At the same time, neural methods have proven to be versatile and highly effective. From a neural network perspective, the same representation approach can serve document ranking or knowledge graph reasoning, and end-to-end training makes it possible to optimize complex methods for downstream tasks. We are at the point where both the symbolic and the neural lines of research are coalescing into neuro-symbolic approaches. The underlying research questions are how to best combine symbolic and neural approaches, which kinds of symbolic and neural approaches are most suitable for which use cases, and how to integrate both ideas to advance the state of the art in information retrieval.
Proactive Conversational Agents in the Post-ChatGPT World
July 23, full-day tutorial, on-site
Speakers:
Lizi Liao (Singapore Management University) / Grace Hui Yang (Georgetown University) / Chirag Shah (University of Washington)
Abstract:
ChatGPT and similar large language model (LLM) based conversational agents have sent shock waves through the research world. Although astonishing in their human-like performance, they share a significant weakness with many other existing conversational agents: they all take a passive approach to responding to user queries. This limits their capacity to better understand the user and the task, and to offer recommendations based on a broader context than the given conversation. These agents still lack proactiveness, including the ability to initiate a conversation, shift topics, or make recommendations that take a more extensive context into account. To address this limitation, this tutorial reviews methods for equipping conversational agents with proactive interaction abilities.
Neural Methods for Cross-Language Information Retrieval
July 23, half-day tutorial, on-site
Speakers:
Eugene Yang (Human Language Technology Center of Excellence, Johns Hopkins University) / Dawn Lawrie (Johns Hopkins University) / James Mayfield (Human Language Technology Center of Excellence, Johns Hopkins University) / Suraj Nair (University of Maryland) / Douglas W. Oard (University of Maryland)
Abstract:
This half-day tutorial will introduce participants to the basic concepts underlying neural Cross-Language Information Retrieval (CLIR). It will discuss the most common algorithmic approaches to CLIR, focusing on modern neural methods; the history of CLIR; where to find and how to use CLIR training collections, test collections, and baseline systems; how CLIR training and test collections are constructed; and open research questions in CLIR.
Causal Recommendation: Progresses and Future Directions
July 23, half-day tutorial, on-site
Speakers:
Wenjie Wang (National University of Singapore) / Yang Zhang (University of Science and Technology of China) / Haoxuan Li (Peking University) / Peng Wu (Beijing Technology and Business University) / Fuli Feng (University of Science and Technology of China) / Xiangnan He (University of Science and Technology of China)
Abstract:
Data-driven recommender systems have demonstrated great success in various Web applications, owing to the extraordinary ability of machine learning models to recognize patterns (i.e., correlations) in massive historical user behavior. However, these models still suffer from issues such as bias and unfairness caused by spurious correlations. Modeling the causal mechanisms behind the data can avoid the influence of spurious correlations arising from non-causal relations, making causal recommender modeling an exciting and promising direction that is drawing increasing attention in the recommendation community. In this tutorial, we aim to introduce the key concepts in causality and provide a systematic review of existing work on causal recommendation. We will introduce existing methods from two causal frameworks: the potential outcome (PO) framework and the structural causal model (SCM) framework. We will give examples and discuss how to use different causal tools under these two frameworks to model and solve problems in recommendation. Moreover, we will summarize the paradigms of PO-based and SCM-based recommendation and compare the two lines of work to clarify their differences and connections. Finally, we identify open challenges and potential future directions for this area. We hope this tutorial stimulates more ideas on the topic and facilitates the development of causality-aware recommender systems.
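A concrete flavor of the PO framework, sketched here as a hedged illustration rather than material from the tutorial: inverse-propensity-scored (IPS) evaluation of a recommender under missing-not-at-random feedback, where users rate items they like more often. All ratings and propensities below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic ground truth: ratings 1..5 for 100 users x 50 items.
true_ratings = rng.integers(1, 6, size=(100, 50)).astype(float)

# Missing-not-at-random exposure: liked items (rating >= 4) are
# observed four times more often than the rest.
exposure_prob = np.where(true_ratings >= 4, 0.20, 0.05)
observed = rng.random(true_ratings.shape) < exposure_prob

# Evaluate a trivial recommender that always predicts 3.0.
sq_err = (3.0 - true_ratings) ** 2
true_mse = sq_err.mean()

# Naive evaluation averages error over observed entries only,
# so it is biased toward the over-exposed, highly rated items.
naive_mse = sq_err[observed].mean()

# PO-style IPS estimator: weight each observed entry by the inverse
# of its (here, known) exposure propensity to recover an unbiased
# estimate of the error over ALL user-item pairs.
ips_mse = (sq_err[observed] / exposure_prob[observed]).sum() / true_ratings.size

print(f"true={true_mse:.2f} naive={naive_mse:.2f} ips={ips_mse:.2f}")
```

In practice the propensities are unknown and must themselves be estimated, which is one of the modeling problems the PO-based line of work addresses.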
Recent Advances in the Foundations and Applications of Unbiased Learning to Rank
July 23, half-day tutorial, on-site
Speakers:
Shashank Gupta (University of Amsterdam) / Philipp Hager (University of Amsterdam) / Jin Huang (University of Amsterdam) / Ali Vardasbi (University of Amsterdam) / Harrie Oosterhuis (Radboud University)
Abstract:
Previous tutorials on Unbiased Learning to Rank (ULTR) focused on introducing the fundamentals of the area to beginners and practitioners. While relevant at the time, they predate the significant maturation of the field and the fundamental advances made since then. Learning to Rank (LTR) has also seen significant growth on the application side, including fair LTR, whereas the previous tutorials emphasized ULTR fundamentals with limited attention to practical applications. To bridge this gap between theory and practice, this tutorial discusses recent applications of ULTR. We will cover the latest interaction biases from the ULTR literature and the latest research in bias-correction techniques, present novel applications of ULTR such as fair LTR, and conclude with open questions and future work.
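A core idea behind many of the bias-correction techniques in the ULTR literature is inverse propensity scoring (IPS): clicks are reweighted by the estimated probability that the user examined the result at that rank. A minimal sketch with made-up clicks and a hypothetical 1/rank examination model (illustration only, not code from the tutorial):

```python
import numpy as np

# Made-up click log for one query: binary clicks at ranks 1..5.
clicks = np.array([1, 0, 1, 0, 0], dtype=float)

# Hypothetical position-based examination model: P(examined) = 1/rank.
propensities = 1.0 / np.arange(1, 6)

# Treating raw clicks as relevance labels conflates relevance with
# position bias: items shown lower simply receive fewer clicks.
naive_labels = clicks

# IPS divides each click by its examination propensity, yielding
# (in expectation) unbiased relevance signals for training a ranker,
# at the cost of higher variance where propensities are small.
ips_labels = clicks / propensities

print(naive_labels)  # [1. 0. 1. 0. 0.]
print(ips_labels)    # [1. 0. 3. 0. 0.]
```

The click at rank 3 is upweighted by a factor of 3, compensating for the fact that users examine that position only a third as often as rank 1 under the assumed model.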
Complex Item Set Recommendation
July 23, half-day tutorial, on-site
Speakers:
Mozhdeh Ariannezhad (University of Amsterdam) / Ming Li (University of Amsterdam) / Sami Jullien (University of Amsterdam) / Maarten de Rijke (University of Amsterdam)
Abstract:
In this tutorial, we aim to shed light on the task of recommending a set of multiple items at once. In this scenario, historical interaction data between users and items may also take the form of a sequence of interactions with sets of items. Complex sets of items are recommended together in diverse domains, such as grocery shopping with so-called baskets, and fashion recommendation with a focus on outfits rather than individual clothing items. We will describe the current research landscape and expose participants to real-world examples of item set recommendation. We will further provide hands-on experience via a notebook session. Finally, we will describe open challenges and call for further research in the area, which we hope will inspire both early-stage and more experienced researchers.
Explainable Information Retrieval
July 23, half-day tutorial, on-site
Speakers:
Avishek Anand (Delft University of Technology) / Procheta Sen (University of Liverpool) / Sourav Saha (Indian Statistical Institute) / Manisha Verma (Amazon) / Mandar Mitra (Indian Statistical Institute)
Abstract:
This tutorial presents explainable information retrieval (ExIR), an emerging area focused on fostering responsible and trustworthy deployment of machine learning systems in the context of information retrieval. As the field has rapidly evolved in the past 4-5 years, numerous approaches have been proposed that focus on different access modes, stakeholders, and model development stages. This tutorial aims to introduce IR-centric notions, classification, and evaluation styles in ExIR, while focusing on IR-specific tasks such as ranking, text classification, and learning-to-rank systems. We will delve into method families and their adaptations to IR, extensively covering post-hoc methods, axiomatic and probing approaches, and recent advances in interpretability-by-design approaches. We will also discuss ExIR applications for different stakeholders, such as researchers, practitioners, and end-users, in contexts like web search, patent and legal search, and high-stakes decision-making tasks. To facilitate practical understanding, we will provide a hands-on session on applying ExIR methods, reducing the entry barrier for students, researchers, and practitioners alike.
Uncertainty Quantification for Text Classification
July 23, full-day tutorial, online only
Speakers:
Dell Zhang (Thomson Reuters Labs) / Murat Sensoy (Amazon Alexa AI) / Masoud Makrehchi (Thomson Reuters Labs) / Bilyana Taneva-Popova (Thomson Reuters Labs) / Lin Gui (King’s College London) / Yulan He (King’s College London)
Abstract:
This full-day tutorial introduces modern techniques for practical uncertainty quantification specifically in the context of multi-class and multi-label text classification. First, we explain the usefulness of estimating aleatoric uncertainty and epistemic uncertainty for text classification models. Then, we describe several state-of-the-art approaches to uncertainty quantification and analyze their scalability to big text data: Virtual Ensemble in GBDT, Bayesian Deep Learning (including Deep Ensemble, Monte-Carlo Dropout, Bayes by Backprop, and their generalization Epistemic Neural Networks), Evidential Deep Learning (including Prior Networks and Posterior Networks), as well as Distance Awareness (including Spectral-normalized Neural Gaussian Process and Deep Deterministic Uncertainty). Next, we discuss the latest advances in uncertainty quantification for pre-trained language models (including asking language models to express their uncertainty, interpreting uncertainties of text classifiers built on large-scale language models, uncertainty estimation in text generation, calibration of language models, and calibration for in-context learning). After that, we discuss typical application scenarios of uncertainty quantification in text classification (including in-domain calibration, cross-domain robustness, and novel class detection). Finally, we list popular performance metrics for the evaluation of uncertainty quantification effectiveness in text classification. Practical hands-on examples and exercises allow attendees to experiment with different uncertainty quantification methods on real-world text classification datasets such as CLINC150.
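Of the Bayesian Deep Learning methods listed, Monte-Carlo Dropout is the easiest to sketch: dropout stays active at inference time and class probabilities are averaged over multiple stochastic forward passes, with the spread across passes serving as an uncertainty signal. Below is a minimal NumPy illustration using a hypothetical linear bag-of-words classifier with random weights (an assumption for illustration, not material from the tutorial):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical "text classifier": a random linear layer over an
# 8-dimensional bag-of-words vector, predicting 3 classes.
W = rng.normal(size=(3, 8))
x = rng.random(8)  # feature vector for one document

def stochastic_forward(x, p_drop=0.5):
    # Dropout remains ON at test time: the key trick of MC Dropout.
    mask = rng.random(x.shape) >= p_drop
    return softmax(W @ (x * mask) / (1.0 - p_drop))

# Average class probabilities over many stochastic passes.
samples = np.stack([stochastic_forward(x) for _ in range(200)])
mean_probs = samples.mean(axis=0)                 # predictive distribution
epistemic = samples.var(axis=0).sum()             # disagreement across passes
entropy = -(mean_probs * np.log(mean_probs)).sum()  # total predictive uncertainty

print(mean_probs.round(3), f"epistemic={epistemic:.3f}", f"entropy={entropy:.3f}")
```

A confident prediction shows both low predictive entropy and low variance across passes; high variance with moderate entropy points to epistemic (model) uncertainty that more training data could, in principle, reduce.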