Tutorials
Tutorials shown below in strike-through font have been cancelled due to insufficient enrollment. The other tutorials are still open for registration. Anyone who registered for a cancelled tutorial can receive a full refund or register for another tutorial.
Full Day Sunday, August 12:
- 8:45-17:00
- Beyond Bag-of-Words: Machine Learning for Query-Document Matching in Web Search
- Presenters:
- Hang Li (Microsoft Research), Jun Xu (Microsoft Research)
- Dealing with mismatch between query and document is one of the most critical research problems in web search. Researchers have recently devoted significant effort to this grand challenge. The major approach is to deepen query and document understanding and to perform matching between enriched query and document representations. With the availability of large amounts of log data and advanced machine learning techniques, this has become more feasible, and significant progress has been made recently. In this tutorial, we will give a systematic and detailed survey of newly developed machine learning technologies for query-document matching in web search. We will focus on describing the fundamental problems as well as the novel solutions. Matching between query and document is not limited to search: similar problems arise in online advertising, recommender systems, and other applications, as matching between objects from two spaces.
- More Information
Morning Sunday, August 12:
- 8:45-12:15
- Crowdsourcing for Search Evaluation and Social-Algorithmic Search
- Presenters:
- Matthew Lease (University of Texas at Austin), Omar Alonso (Microsoft (Bing))
- Internet-based access to 24/7 online human crowds is driving a renaissance of research in Human Computation and the advent of Crowdsourcing. On one hand, labeled data for system training and evaluation can be collected faster, cheaper, and easier than ever before. Moreover, because human capabilities still exceed purely automated approaches for AI-hard tasks such as interpreting text or images, we can now design hybrid systems which integrate human computation with automated algorithms. Such hybrid systems let us explore a richer design space for navigating traditional tradeoffs between processing time, cost, effort, or accuracy.
- More Information
- (Big) Usage Data in Web Search
- Presenters:
- Ricardo Baeza-Yates (Yahoo! Labs), Yoelle Maarek (Yahoo! Labs)
- This half-day tutorial reviews and explains how usage data such as query logs or click data have revolutionized Web search at all levels. We will demonstrate the many benefits of usage data through multiple examples but also consider its limitations. More specifically, we will discuss three factors that often pull in opposite directions when dealing with usage data: the size of the data, personalization needs, and privacy concerns. We will conclude by offering some possible ways to circumvent these limitations.
- More Information
- A New Look at Old Tricks: The Fertile Roots of Current Research
- Presenter:
- Paul Kantor (Rutgers University)
- Many of the ideas that surface as “new” in today’s super-heated research environment have very firm roots in earlier developments in fields as diverse as citation analysis and pattern recognition. The purpose of this tutorial is to survey those roots, and their relation to the contemporary fruits on the tree of information retrieval, and to separate, as much as is possible in an era of increasing secrecy about methods, the problems to be solved, the algorithms for solving them, and the heuristics that are the bread and butter of a working operation.
- More Information
- Aspect-based Opinion Mining from Product Reviews
- Presenters:
- Samaneh Moghaddam (Simon Fraser University), Martin Ester (Simon Fraser University)
- "What other people think" has always been an important piece of information for most of us during the decision-making process. Today people tend to make their opinions available to other people via the Internet. As a result, the Web has become an excellent source of consumer opinions. However, it is very difficult for a customer to read all of the reviews and make an informed decision on whether to purchase a product. It is also difficult for the manufacturer of the product to keep track of and manage customer opinions. Aspect-based opinion mining is a new research direction that addresses this need. In this tutorial we will cover opinion mining in online product reviews with a focus on aspect-based opinion mining. The tutorial will cover not only general opinion mining and retrieval tasks, but also the state-of-the-art methods, challenges, applications, and future research directions of aspect-based opinion mining.
- More Information
- Experimental Methods for Information Retrieval
- Presenters:
- Donald Metzler (Google), Oren Kurland (Technion Israel Institute of Technology)
- Experimental evaluation plays a critical role in driving progress in information retrieval (IR) today. Careful evaluation is necessary for advancing the state of the art; yet many published papers present work that was poorly evaluated. Furthermore, many papers are not accepted for publication due to unsatisfactory evaluation. Therefore, there is a strong need to educate students, researchers, and practitioners about the proper way to carry out IR experiments. This tutorial focuses on the methodologies that should be used to perform empirical evaluation of the highest scientific standard, specifically for ad hoc retrieval.
- More Information
- Methods for Mining and Summarizing Text Conversations
- IR Models: Foundations and Relationships
- Patent Information Retrieval
Afternoon Sunday, August 12:
- 13:30-17:00
- Large-Scale Graph Mining and Learning for Information Retrieval
- Presenters:
- Bin Gao (Microsoft Research), Taifeng Wang (Microsoft Research), Tie-Yan Liu (Microsoft Research)
- For many IR applications, one needs to deal with large-scale graphs such as the Web graph, user-page bipartite graphs, and social network graphs. All these graphs are very large and contain rich information. As a result, it is non-trivial to perform efficient and effective mining and learning on them. On the one hand, we need to design scalable algorithms. On the other hand, we need to develop powerful computational infrastructure to support these algorithms. This tutorial aims to give a timely introduction to the related work and to provide the audience with a comprehensive view of the literature.
- More Information
- Query Performance Prediction for IR
- Presenters:
- David Carmel (Information Retrieval Group, IBM), Oren Kurland (Technion)
- The goal of this tutorial is to expose participants to current research on query performance prediction (also known as query difficulty estimation). Participants will become familiar with state-of-the-art performance prediction methods, with common evaluation methodologies for prediction quality, and with potential applications that can utilize such performance predictors. In addition, some open issues and challenges in the field will be discussed.
- More Information
- Advances on the Development of Evaluation Measures
- Presenters:
- Emine Yilmaz (Microsoft Research), Evangelos Kanoulas (Google), Ben Carterette (University of Delaware)
- Measuring the utility of a search engine to an end user sits at the core of research and development in the field of information retrieval. The goal of this tutorial is to provide attendees with a comprehensive overview of the latest advances in the development of information retrieval evaluation measures and discuss the current challenges in the area. A number of topics will be covered, including models of user interaction, evaluation measures based on user models, nugget-based evaluation, measures for novelty and diversity, and session-based measures.
- More Information
- Collaborative Information Seeking — Art and Science of Achieving 1+1>2 in IR
- Medical Information Retrieval
- Visual Information Retrieval using Java and LIRE
- Information Retrieval for E-Discovery