Part II – Advanced Topics
Donald Metzler (Yahoo!)
Victor Lavrenko (University of Edinburgh)
Abstract:
This half-day tutorial will cover advanced topics in probabilistic models for information retrieval, starting from the dependence assumptions made in the classical probabilistic model and in the language modeling framework for information retrieval. Various term dependence models will be covered in detail. Other topics include Zhai's risk minimization framework, Lavrenko's generative relevance model, Turtle's inference network retrieval model, Metzler's Markov random field model, Amati's divergence from randomness model, and Gey's logistic regression model. The theory behind each model will be described, and the applicability of each will be discussed in the context of various real-world applications.
One underlying goal of the tutorial is to show explicitly the connections that exist between the various models. This will not only reinforce attendees' understanding of the individual models, but also provide the understanding necessary to develop more robust probabilistic retrieval models for emerging application domains.
Attendees of this tutorial should have a working understanding of basic probabilistic information retrieval concepts and models, including the probability ranking principle, the classical probabilistic model, and the language modeling framework for information retrieval. In addition to slides, hands-on exercises and examples of real-world applications of the models will be used throughout the tutorial.
Bio:
Donald Metzler
Donald Metzler is a Research Scientist in the Search and Computational Advertising group at Yahoo! Research. He obtained his Ph.D. from the University of Massachusetts. His research interests include formal information retrieval models, web search, advertising, and machine learning. He has published research papers at major information retrieval venues, including SIGIR, CIKM, and WWW, and is a co-author of the book Search Engines: Information Retrieval in Practice. He is currently serving as co-chair of the SIGIR 2009 poster track.
More information can be found at http://research.yahoo.com/Don_Metzler.
Victor Lavrenko
Victor Lavrenko is a Lecturer in Informatics at the University of Edinburgh. He received his Ph.D. in Computer Science from the University of Massachusetts Amherst in 2004, and worked as a language technology consultant for the Credit Suisse Group prior to his appointment at Edinburgh. He served as a co-chair of an HLT/NAACL 2003 student workshop and gave a tutorial on language modeling techniques at the SIGIR 2003 conference. Victor has published research papers at, and reviewed for, the SIGIR, CIKM, NAACL/HLT, KDD, and NIPS conferences. His research interests include formal models for searching text in multiple languages, annotating and retrieving images, and detecting and tracking novel events in the news.
Additional information can be found at http://homepages.inf.ed.ac.uk/vlavrenk/.