Tutorials

Thursday, July 7

11:00 - 14:30, Online (Half Day)

Monday, July 11

9:00 - 17:00, On site (Full Day)

9:00 - 12:30, On site (Half Day)

13:30 - 17:00, On site (Half Day)

15:30 - 19:00, Online (Half Day)


Full-day tutorials


Examining User Behaviour in Information Retrieval

Organizers: George Buchanan and Dana Mckay

Abstract:

Conducting studies involving actual users is a recurring challenge in information retrieval. In this tutorial we will address the main strategic and tactical choices for engaging with, designing and executing user studies, considering both evaluation and formative investigation. The tension between reproducibility and ensuring natural user behaviour will be a recurring focus, seeking to help individual researchers make an intentional and well-argued choice for their research. The presenters have over fifty years of combined experience working in interactive information retrieval, and information interaction in general.

Date: Monday, July 11, 9:00 - 17:00, On site (Full Day)

Room: «Sala de Columnas», Floor: 4


Conversational Information Seeking: Theory and Application

Organizers: Jeff Dalton, Sophie Fischer, Paul Owoicho, Filip Radlinski, Federico Rossetto, Johanne R. Trippas and Hamed Zamani

Abstract:

Conversational information seeking (CIS) involves interaction sequences between one or more users and an information system. Interactions in CIS are primarily based on natural language dialogue, although they may include other types of interaction, such as click, touch, and body gestures. CIS has recently attracted significant attention, and advancements continue to be made. This tutorial follows the content of the recent Conversational Information Seeking book authored by several of the tutorial presenters. It aims to introduce CIS to newcomers and to cover recent advanced topics and state-of-the-art approaches for students and researchers with moderate knowledge of the topic. A significant part of the tutorial is dedicated to hands-on experience with toolkits developed by the presenters for conversational passage retrieval and multi-modal task-oriented dialogues. The outcomes of this tutorial include theoretical and practical knowledge, as well as a forum to meet researchers interested in CIS.

Date: Monday, July 11, 9:00 - 17:00, On site (Full Day)

Room: «Sala Gómez de la Serna», Floor: 5

Half-day tutorials


Improving Efficiency and Robustness of Transformer-based Information Retrieval Systems

Organizers: Edmon Begoli, Sudarshan Srinivasan and Maria Mahbub

Abstract:

This tutorial focuses on both theoretical and practical aspects of improving the efficiency and robustness of transformer-based approaches, so that these can be effectively used in practical, high-scale, and high-volume information retrieval (IR) scenarios. The tutorial is inspired and informed by our work and experience while working with massive narrative datasets (8.5 billion medical notes), and by our basic research and academic experience with transformer-based IR tasks.

Additionally, the tutorial focuses on techniques for making transformer-based IR robust against adversarial (AI) exploitation. This is a recent concern in the IR domain that we needed to take into account, and we want to share some of the lessons learned and applicable principles with our audience.

Finally, an important, if not critical, element of this tutorial is its focus on didacticism: delivering tutorial content in a clear, intuitive, plain-speak fashion. Transformers are a challenging subject, and through our teaching experience we have observed great value in, and a great need for, explaining all relevant aspects of this architecture and related principles in the most straightforward, precise, and intuitive manner. That is the defining style of our proposed tutorial.

Website: https://github.com/ebegoli/SIGIR2020-Efficient-Transfomers

Date: Monday, July 11, 9:00 - 12:30, On site (Half Day)

Room: «Cine», Floor: -1


Gender Fairness in Information Retrieval Systems

Organizers: Amin Bigdeli, Negar Arabzadeh, Shirin Seyedsalehi, Morteza Zihayat and Ebrahim Bagheri

Abstract:

Recent studies have shown that stereotypical gender biases can find their way into the representational and algorithmic aspects of retrieval methods and hence exhibit themselves in retrieval outcomes. In this tutorial, we inform the audience of various studies that have systematically reported the presence of stereotypical gender biases in Information Retrieval (IR) systems. We further classify existing work on gender biases in IR systems as being related to (1) relevance judgement datasets, (2) structure of retrieval methods, and (3) representations learnt for queries and documents. We present how each of these components can be impacted by or cause intensified biases during retrieval. Based on these identified issues, we then present a collection of approaches from the literature that have discussed how such biases can be measured, controlled, or mitigated. Additionally, we introduce publicly available datasets that are often used for investigating gender biases in IR systems, as well as the evaluation methodology adopted for determining the utility of gender bias mitigation strategies.

Website: https://ls3.rnet.ryerson.ca/sigir_2022_tutorial/

Date: Monday, July 11, 9:00 - 12:30, On site (Half Day)

Room: «Sala María Zambrano», Floor: 5


Towards Reproducible Machine Learning Research in Information Retrieval

Organizers: Ana Lucic, Maurits Bleeker, Maarten de Rijke, Koustuv Sinha, Sami Jullien and Robert Stojnic

Abstract:

While recent progress in the fields of machine learning (ML) and information retrieval (IR) has been significant, the reproducibility of these cutting-edge results is often lacking, with many submissions failing to provide the information necessary to ensure subsequent reproducibility. Despite the introduction of self-check mechanisms before submission (such as the Reproducibility Checklist), criteria for evaluating reproducibility during reviewing at several major conferences, artifact review and badging frameworks, and dedicated reproducibility tracks and challenges at major IR conferences, the motivation for executing reproducible research is lacking in the broader information community. We propose this tutorial as a gentle introduction to help ensure reproducible research in IR, with a specific emphasis on ML aspects of IR research.

Date: Monday, July 11, 13:30 - 17:00, On site (Half Day)

Room: «Cine», Floor: -1


Recent Advances in Retrieval-Augmented Text Generation

Organizers: Deng Cai, Yan Wang, Lemao Liu and Shuming Shi

Abstract:

Recently, retrieval-augmented text generation has achieved state-of-the-art performance in many NLP tasks and has attracted increasing attention from the NLP and IR communities. This tutorial therefore aims to present recent advances in retrieval-augmented text generation comprehensively and comparatively. It first highlights the generic paradigm of retrieval-augmented text generation, then reviews notable works for different text generation tasks, including dialogue generation, machine translation, and other generation tasks, and finally points out some limitations and shortcomings to facilitate future research.

Website: https://jcyk.github.io/RetGenTutorial/

Date: Thursday, July 7, 11:00 - 14:30, Online (Half Day)


Sequential/Session-based Recommendations: Challenges, Approaches, Applications and Opportunities

Organizers: Shoujin Wang, Qi Zhang, Liang Hu, Xiuzhen Zhang, Yan Wang and Charu Aggarwal

Abstract:

In recent years, sequential recommender systems (SRSs) and session-based recommender systems (SBRSs) have emerged as a new paradigm of RSs that captures users’ short-term but dynamic preferences to enable more timely and accurate recommendations. Although SRSs and SBRSs have been extensively studied, there are many inconsistencies in this area caused by diverse descriptions, settings, assumptions and application domains. No existing work provides a unified framework and problem statement that removes the various inconsistencies common in the area of SR/SBR. There is also a lack of work providing a comprehensive and systematic demonstration of the data characteristics, key challenges, most representative and state-of-the-art approaches, typical real-world applications and important future research directions in the area. This work aims to fill these gaps so as to facilitate further research in this exciting and vibrant area.

Website: https://neurec22.github.io/SRS&SBRS/

Date: Thursday, July 7, 11:00 - 14:30, Online (Half Day)


Continual Learning Dialogue Systems: Learning during Conversation

Organizers: Sahisnu Mazumder and Bing Liu

Abstract:

Dialogue systems, commonly known as chatbots, have gained increasing popularity in recent years due to their widespread applications in carrying out chit-chat conversations with users and accomplishing various tasks as personal assistants. However, they still have some major weaknesses. One key weakness is that they are typically trained from pre-collected and manually labeled data and/or written with handcrafted rules. Their knowledge bases (KBs) are also fixed and pre-compiled by human experts. Due to the huge amount of manual effort involved, they are difficult to scale and also tend to produce many errors owing to their limited ability to understand natural language and the limited knowledge in their KBs. Thus, when these systems are deployed, the level of user satisfaction is often low.

In this tutorial, we introduce and discuss methods to give chatbots the ability to continuously and interactively learn new knowledge during conversation, i.e. "on the job", by themselves, so that as the systems chat more and more with users, they become more and more knowledgeable and improve their performance over time. The first half of the tutorial introduces the paradigm of lifelong and continual learning and discusses various related problems and challenges in conversational AI applications. In the second half, we present recent advancements on the topic, with a focus on continuous lexical and factual knowledge learning in dialogues, open-domain dialogue learning after deployment, and learning of new language expressions via user interactions for language grounding applications (e.g. natural language interfaces). Finally, we conclude with a discussion of the scope for continual conversational skill learning and present some open challenges for future research.

Website: https://www.cs.uic.edu/~liub/SIGIR-2022-CL-Chatbot.html

Date: Monday, July 11, 15:30 - 19:00, Online (Half Day)


Self-Supervised Learning for Recommender System

Organizers: Chao Huang, Xiang Wang, Xiangnan He and Dawei Yin

Abstract:

Recommender systems have become key components of a wide spectrum of web applications (e.g., e-commerce sites, video sharing platforms, lifestyle applications, etc.), alleviating information overload and suggesting items to users. However, most existing recommendation models follow a supervised learning paradigm, which notably limits their representation ability given the ubiquitous sparse and noisy data in practical applications. Recently, self-supervised learning (SSL) has become a promising learning paradigm for distilling informative knowledge from unlabeled data, without heavy reliance on sufficient supervision signals. Inspired by the effectiveness of self-supervised learning, recent efforts bring SSL’s strengths into various recommendation representation learning scenarios with augmented auxiliary learning tasks. In this tutorial, we aim to provide a systematic review of existing self-supervised learning frameworks and analyze the corresponding challenges for various recommendation scenarios, such as the general collaborative filtering paradigm, social recommendation, sequential recommendation, and multi-behavior recommendation. We then raise discussions and future directions for this area. By introducing this emerging and promising topic, we expect the audience to gain a deep understanding of this domain. We also seek to promote more ideas and discussions that facilitate the development of self-supervised learning recommendation techniques.

Date: Thursday, July 7, 11:00 - 14:30, Online (Half Day)


Beyond Opinion Mining: Summarizing Opinions of Customer Reviews

Organizers: Reinald Kim Amplayo, Arthur Bražinskas, Yoshihiko Suhara, Xiaolan Wang and Bing Liu

Abstract:

Customer reviews are vital for making purchasing decisions in the Information Age. Such reviews can be automatically summarized to provide the user with an overview of opinions. In this tutorial, we present various aspects of opinion summarization that are useful for researchers and practitioners. First, we will introduce the task and major challenges. Then, we will present existing opinion summarization solutions, both pre-neural and neural. We will discuss how summarizers can be trained in the unsupervised, few-shot, and supervised regimes. Each regime has roots in different machine learning methods, such as auto-encoding, controllable text generation, and variational inference. Finally, we will discuss resources and evaluation methods and conclude with future directions. This three-hour tutorial will provide a comprehensive overview of major advances in opinion summarization. The listeners will be well equipped with knowledge that is useful for both research and practical applications.

Date: Thursday, July 7, 11:00 - 14:30, Online (Half Day)


Deep Knowledge Graph Representation Learning for Completion, Alignment, and Question Answering

Organizers: Soumen Chakrabarti

Abstract:

A knowledge graph (KG) has nodes and edges representing entities and relations. KGs are central to search and question answering (QA), yet research on deep/neural representation of KGs, as well as deep QA, has moved largely to the AI, ML and NLP communities. The goal of this tutorial is to give IR researchers a thorough update on the best practices of neural KG representation and inference from the AI, ML and NLP communities, and then explore how KG representation research in the IR community can be better driven by the needs of search, passage retrieval, and QA. In this tutorial, we will study the most widely used public KGs; important properties of their relations, types and entities; best-practice deep representations of KG elements and how they support or cannot support such properties; loss formulations and learning methods for KG completion and inference; the representation of time in temporal KGs; alignment across multiple KGs, possibly in different languages; and the use and benefits of deep KG representations in QA applications.

Website: https://sites.google.com/view/knowledge-graph-tutorial/home

Date: Thursday, July 7, 11:00 - 14:30, Online (Half Day)


Retrieval and Recommendation Systems at the Crossroads of Artificial Intelligence, Ethics, and Regulation

Organizers: Markus Schedl, Emilia Gomez and Elisabeth Lex

Abstract:

This tutorial aims to give its audience an interdisciplinary overview of the topics of fairness and non-discrimination, diversity, and transparency of AI systems, tailored to the research fields of information retrieval and recommender systems. Through this tutorial, we would like to equip the mostly technical audience of SIGIR with the necessary understanding of the ethical implications of their research and development on the one hand, and of recent political and legal regulations that address the aforementioned challenges on the other.

Date: Monday, July 11, 13:30 - 17:00, On site (Half Day)

Room: «Sala María Zambrano», Floor: 5