All SIGIR workshops occur on Thursday, July 23, and will run the full day. They will be held at various locations on the campus of Northeastern University, approximately 0.9 miles (1.4 kilometers) from the conference hotel. You should allot approximately 20 minutes to reach the workshop registration site from the Sheraton, by foot or train. (For those of you staying in Northeastern University Housing, you can reach the workshop registration site by foot in 3 minutes or less.) In addition to transit time, you should allot 15 to 20 minutes to register, obtain your workshop materials, and proceed to your workshop location.
Workshop registration will take place outside the Raytheon Amphitheater, at the west end of Egan Hall, adjacent to Forsyth Street. Egan Hall is Building 60 on the Northeastern University main campus maps. Upon entering the building, check in at the registration desk to obtain your workshop materials and your exact workshop location. Signage and volunteers will help guide you to your workshop location, no more than a 5-minute walk from the registration site.
Starting and ending times vary across workshops and are listed after the workshop title. Note the starting time in particular, as some workshops start as early as 8:30am.
The workshops are listed below, along with a brief description of their content. Follow the provided links for details about the individual workshops, including calls for papers, agendas, and so on. Questions about a workshop should be addressed to the organizers.
W1, The Future of IR Evaluation (9:00-5:00)
Evaluation is at the core of information retrieval: virtually all progress owes directly or indirectly to test collections built within the so-called Cranfield paradigm. However, in recent years, IR researchers have routinely pursued tasks outside the traditional paradigm by taking a broader view of tasks, users, and context. Content is evolving quickly from traditional static text to diverse forms of dynamic, collaborative, and multilingual information sources. Industry is also embracing "operational" evaluation based on the analysis of endless streams of queries and clicks. The workshop brings together i) those with novel evaluation needs, such as a PhD candidate pursuing a new IR-related problem, and ii) senior IR evaluation experts. Desired outcomes are insight into how to make IR evaluation more "realistic," and at least one concrete idea for a retrieval track or task (at CLEF, INEX, NTCIR, TREC) that would not have happened otherwise. [Workshop web page]
Shlomo Geva, INEX & QUT, Australia
Jaap Kamps, INEX & University of Amsterdam, The Netherlands
Carol Peters, CLEF & ISTI-CNR, Italy
Tetsuya Sakai, NTCIR & Microsoft Research Asia, China
Andrew Trotman, INEX & University of Otago, New Zealand
Ellen Voorhees, TREC/TAC & NIST, USA
W2, Information Access in a Multilingual World: Transitioning from Research to Real-World Applications (9:00-5:00)
This workshop is intended to collate experiences and plans for the real-world application of multilingual technology to information access. The application of multilingual search, summarisation, filtering, monitoring, and other technologies is only now starting to become real. The workshop will explore realistic success predictors for such systems, including what sort of outcome variables can be studied, what sort of contextual factors should be taken into account, and what use cases and usage scenarios are envisioned. Multilinguality can mean different things. We invite participants from research projects to discuss their use cases and envisioned application scenarios, and from practical industrial projects to discuss their experiences in deploying technology; we expect to hear reports of discussions with user and stakeholder organisations and to see designs for habitable interaction with data in multiple languages. [Workshop web page]
Fredric Gey, UC Berkeley, USA
Noriko Kando, NII, Japan
Jussi Karlgren, SICS, Sweden
W3, Information Retrieval and Advertising (9:00-5:30)
While computational advertising is still a relatively young research field, its significance is enormous as it provides the primary business model behind most of today's Web experience. Online advertising systems employ many IR techniques alongside approaches developed in statistical modeling and machine learning, large-scale data processing, optimization, microeconomics, and human-computer interaction. The purpose of this workshop is to bring together researchers from the different areas relevant to online advertising, strengthen collaborations between industry and academia, and provide a forum for discussion and presentation of late-breaking research. [Workshop web page]
Misha Bilenko, Microsoft Research, USA
Evgeniy Gabrilovich, Yahoo! Research, USA
Matthew Richardson, Microsoft Research, USA
Yi Zhang, University of California at Santa Cruz, USA
W4, Large-Scale Distributed Systems for Information Retrieval, LSDS-IR 2009 (9:00-5:30)
The Web is continuously growing. Currently, there are more than 20 billion pages (some sources suggest 100 billion), compared to less than 1 billion documents in 1998. Traditionally, Web-scale search engines employ large and highly replicated systems, operating on computer clusters in one or a few data centers. Coping with the increasing number of user requests and indexable pages requires adding more resources. However, data centers cannot grow indefinitely. Scalability problems in information retrieval have to be addressed in the near future, and new distributed applications are likely to drive the way in which people use the Web. Distributed IR is the point at which these two directions converge. This workshop will provide space for researchers to discuss these problems and to define new directions for the work on distributed information retrieval. [Workshop web page]
Claudio Lucchese, ISTI-CNR, Italy
Gleb Skobeltsyn, Ecole Polytechnique Federale de Lausanne (EPFL), Switzerland
Wai Gen Yee, Illinois Institute of Technology, USA
W5, Learning to Rank for Information Retrieval (8:30-5:30 -- note early starting time)
As an interdisciplinary field between information retrieval and machine learning, learning to rank is concerned with automatically constructing a ranking model using training data. Learning to rank technologies have been successfully applied to many tasks in information retrieval such as search and summarization, and have been attracting more and more attention recently in the information retrieval and machine learning communities. At SIGIR 2007 and SIGIR 2008, we successfully organized two workshops on learning to rank for information retrieval with very good attendance. We are therefore organizing a workshop on the theme again, in conjunction with SIGIR 2009, dedicated to presentations and discussions about learning to rank for IR. [Workshop web page]
Hang Li, Microsoft Research Asia, China
Tie-Yan Liu, Microsoft Research Asia, China
ChengXiang Zhai, University of Illinois at Urbana-Champaign, USA
W6, Redundancy, Diversity, and Interdependent Document Relevance (8:45-5:00 -- note early starting time)
This workshop will explore how ranking, performance assessment, and learning to rank can move beyond the assumption that the relevance of documents in a ranking is independent. The focus will be on three key themes: the effect of redundancy on information retrieval utility (e.g. how to measure redundancy, avoid it in evaluation and when training ranking algorithms); the role of diversity (e.g. for mitigating the risk of misinterpreting ambiguous queries); and approaches to set-level evaluation and optimization (e.g. how to create reusable set-level judgments that trade off between evaluating relevance relative to the needs of a specific user versus the needs of a distribution of users). We seek original papers addressing one or more of these key themes. [Workshop web page]
Paul N. Bennett, Microsoft Research, USA
Ben Carterette, University of Delaware, USA
Thorsten Joachims, Cornell University, USA
Filip Radlinski, Microsoft Research, USA
W7, Search in Social Media, SSM 2009 (8:30-5:00 -- note early starting time)
Social applications are the fastest growing segment of the web. While there has been progress on searching particular kinds of social media, such as blogs, search in others (Facebook, MySpace, Flickr) is not as well understood. The purpose of this workshop is to focus the attention of the research community on this emerging topic, and to bring together information retrieval and social media researchers to consider the following questions: How should we search in social media? What are the needs of users, and models of those needs, specific to social media search? What models make the most sense? How does search interact with existing uses of social media? What works and what doesn't? [Workshop web page]
Eugene Agichtein, Emory University, USA
Marti Hearst, UC Berkeley, USA
Ian Soboroff, NIST, USA
W8, Understanding the user - Logging and interpreting user interactions in information search and retrieval (9:00-5:15)
Modern information search systems can benefit greatly from using additional information about the user and the user's behavior. Feedback data based on direct interaction (e.g., clicks and scrolling) as well as on general user profiles/preferences has proven valuable for personalizing the retrieval process. New technology has made it inexpensive and easy to collect more feedback data and more different types of data (e.g., gaze, emotional, or biometric data). The workshop focuses on discussing and identifying the most promising research directions with respect to logging, interpreting, integrating, and using feedback data. Ultimately, it aims to arrange a commonly shared collection of user interaction logging tools for various purposes, based on a variety of feedback data sources. The workshop brings together researchers from IR as well as from human-computer interaction. [Workshop web page]
Nicholas J. Belkin, Rutgers University, USA
Ralf Bierig, Rutgers University, USA
Georg Buscher, Deutsches Forschungszentrum für Künstliche Intelligenz, Germany
Ludger van Elst, Deutsches Forschungszentrum für Künstliche Intelligenz, Germany
Jacek Gwizdka, Rutgers University, USA
Joemon Jose, Glasgow University, Scotland
Jaime Teevan, Microsoft Research, USA