SIGIR 2014 Workshops

SIGIR'14 will host seven attractive workshops covering novel ideas and emerging areas in IR.

Please see the individual websites for the calls and deadlines, and join the discussion on the SIGIR'14 workshop day, Friday 11 July 2014, in the beautiful scenery of Gold Coast, Queensland, Australia.




ERD'14: Entity Recognition and Disambiguation Challenge

The Entity Recognition and Disambiguation Workshop will be organized as a challenge, where participants submit working systems that identify the entities mentioned in text. The challenge will have two tracks, focusing on long and short texts. All submissions will be evaluated on shared datasets; part of the data will be withheld, to be used for the final evaluation of all submitted systems to determine the winners. Each participating team will be offered a spot at the workshop to present their system.


David Carmel is a Principal Research Scientist at the Yahoo! lab in Haifa. David's research is focused on search and content quality analysis in community question answering sites, query performance prediction, vertical search, and text mining. David has published more than 80 papers in IR and Web journals and conferences, and serves on the editorial board of the IR journal and as a senior PC member or an Area Chair of many conferences (SIGIR, WWW, WSDM, CIKM). He has organized a number of workshops and taught several tutorials at SIGIR and WWW. David earned his PhD in Computer Science from the Technion, Israel Institute of Technology in 1997.

Ming-Wei Chang is a researcher at Microsoft Research. His research interests are in machine learning and natural language understanding. He currently focuses on using large-scale structured and unstructured data for semantic understanding. Specifically, he is interested in developing algorithms for entity linking that are effective for short and noisy text.

Evgeniy Gabrilovich is a senior staff research scientist at Google, where he works on knowledge discovery from the web. Prior to joining Google in 2012, he was a director of research and head of the natural language processing and information retrieval group at Yahoo! Research. Evgeniy is an ACM Distinguished Scientist (2012), and is a recipient of the 2010 Karen Sparck Jones Award for his contributions to natural language processing and information retrieval.

Bo-June (Paul) Hsu is a researcher in the Internet Services Research Center at Microsoft Research. Since graduating from MIT in 2009, he has been collaborating with various researchers and engineers across Microsoft to apply research methodologies to solve real-world product challenges. In particular, he has contributed various data structures and algorithms to Bing's query auto-completion system, enabling it to compute the top completions with spelling correction or input method conversion from among billions of queries.

Kuansan Wang is a Principal Researcher and manager of the Internet Services Research Center (ISRC) at Microsoft Research (MSR), Redmond. He joined the MSR Speech Technology Group in 1998, conducting research in the areas of speech recognition, spoken language understanding and multimodal dialog. From 2004 to 2007, he was a software architect in the speech product and business incubation groups, helping create and commercialize a wide range of award-winning speech products for Microsoft. Since 2007, he has been with MSR ISRC conducting research on web search and machine learning. Dr. Wang is an active member of both academic and industrial communities. He has published more than 50 peer-reviewed articles and 140 patents. He is also the author of 6 ISO and 3 W3C standards in the area of speech processing and voice communications.




GEAR'14: Gathering Efficient Assessments of Relevance

Evaluation is a fundamental part of Information Retrieval, and in the conventional Cranfield evaluation paradigm, sets of relevance assessments are a fundamental part of test collections. In this workshop, we wish to revisit how relevance assessments can be efficiently created. Potential themes include methods for generating assessments, the process of assessment, the effort involved in assessing different materials, and exploration of the concept of relevance. A discussion and exploration of this issue will be facilitated through the presentation of results-based papers and position papers on the topic, as well as a group design activity.


Martin Halvey BSc (UCD, Ireland) PhD (UCD, Ireland) is a Lecturer (Assistant Professor) in the School of Engineering and Built Environment at Glasgow Caledonian University. His research focuses on user-centric issues in IR and novel interaction. He has a number of publications at high-quality venues including ACM CHI, ACM SIGIR, ECIR, ACM Multimedia and IPM. Three of these papers have received best paper nominations. He has previously organized workshops at ACM Multimedia and for the Scottish Informatics and Computer Science Alliance (SICSA). He is the local organizing chair for ACM ICMR 2014 and one of the Lab Chairs for CLEF 2014. He is the current holder of a grant from the UK Arts and Humanities Research Council (AHRC) entitled "Understanding the annotation process: annotation for big data" (AH/L010364/1).

Robert Villa BSc (Strathclyde), MSc (Heriot-Watt), PhD (Glasgow) is a Lecturer (Assistant Professor) in the Information School (formerly the Department of Information Studies) in the Faculty of Social Sciences, University of Sheffield, UK. He is the Local Organiser for CLEF 2014. His research interests are primarily in the development and evaluation of search interfaces for text, image and video retrieval. Most of his past research has involved the creation and evaluation of novel search interfaces, including supporting exploratory search, collaborative search, and supporting artists and animators using content-based image and video retrieval. He is the current holder of a grant from the UK Arts and Humanities Research Council (AHRC) entitled "Understanding the annotation process: annotation for big data" (AH/L010364/1).

Paul D. Clough, BEng (York) PhD (Sheffield) PGCertHE (Sheffield) is a Senior Lecturer in the Information School (formerly the Department of Information Studies) in the Faculty of Social Sciences, University of Sheffield, UK. He is head of the Information Retrieval (IR) Group. He is currently Scientific Director of an EU-funded project called PATHS (Personalised Access To cultural Heritage Spaces), is running an AHRC-funded project on recommender systems for WorldCat.org with OCLC Inc., and is co-organiser of the TREC 2013 Session Track. He has also recently been awarded a Google Faculty Research Award for a project about developing a taxonomy of search sessions. His research interests include information storage and retrieval, particularly multilingual searching of texts and images; evaluation of retrieval systems; and natural language processing, text reuse and plagiarism detection.



MedIR'14: Medical Information Retrieval

Medical information is accessible from diverse sources including the general web, social media, journal articles, and hospital records; users include patients and their families, researchers, practitioners and clinicians. Challenges in medical information retrieval include: diversity of users and user ability; variations in the format, reliability, and quality of biomedical and medical information; the multimedia nature of data; and the need for accuracy and reliability. The aim of the workshop is to bring together researchers interested in medical information search with the goal of identifying specific challenges that need to be addressed to advance the state of the art.


Lorraine Goeuriot is a post-doctoral researcher in medical information processing and retrieval at Dublin City University. She obtained her Master's degree in computer science and her PhD in computational linguistics on medical data at the University of Nantes, France. She also worked as a post-doctoral researcher at Nanyang Technological University, Singapore, on medical opinion mining. She is co-chair of the CLEF eHealth 2014 evaluation lab, and co-led its information retrieval task in 2013 and 2014. She was publication co-chair for SIGIR 2013 and COLING 2014. She has been involved in two national French research projects and is currently involved in the EU project Khresmoi on medical information access. She reviews papers for several medical informatics workshops and journals.

Gareth Jones is a Faculty member and Principal Investigator in the CNGL Centre for Global Intelligent Content, School of Computing, Dublin City University. He holds BEng and PhD degrees from the University of Bristol. He previously held posts at the University of Cambridge and University of Exeter, and was a Toshiba Fellow with the Toshiba Corporation in Japan. He has research interests in diverse topic areas of information retrieval, natural language and multimedia technologies, and human-computer interaction. He has been PI on a number of funded projects including the Khresmoi project on medical information management. He was General co-Chair of ACM SIGIR 2013, Programme co-Chair of ECIR 2011 and IR Track chair of ACM CIKM 2010. He has published more than 300 papers describing his research, several of which have received Best Paper Awards.

Liadh Kelly is a post-doctoral researcher in medical information retrieval at Dublin City University, Ireland. She completed a PhD in the lifelog retrieval space, also at Dublin City University, and holds MSc (Research) and BSc degrees in Computer Science from University College Dublin, Ireland. She is co-chair of the CLEF 2014 eHealth evaluation lab on medical information visualisation, extraction and retrieval, and co-lead of the retrieval task. She was publication co-chair for SIGIR 2013 and COLING 2014, has co-chaired several international symposiums and workshops, and regularly reviews for top international journals and conferences.

Henning Muller studied medical informatics at the University of Heidelberg, Germany, and then worked at Daimler-Benz research in Portland, OR, USA. From 1998 to 2002 he worked on his PhD degree at the University of Geneva, Switzerland, with a research stay at Monash University, Melbourne, Australia. Since 2002 Henning has been working for the medical informatics service at the University Hospitals of Geneva. Since 2007 he has been full professor at the HES-SO Valais, and since 2011 he has been responsible for the eHealth unit. Henning is coordinator of the Khresmoi project and initiator of the ImageCLEF benchmark, has authored over 400 scientific papers, and is on the editorial board of several journals.

Justin Zobel is Head of the University of Melbourne's Department of Computing & Information Systems. He received his PhD from the University of Melbourne and for many years was based at RMIT University, where he led the Search Engine group. His current research areas include search, bioinformatics, and compression. He is an author of three texts on postgraduate study and research methods. He is an associate editor of the Information Retrieval Journal, ACM Transactions on Information Systems, and Information Processing & Management.



PIR'14: Privacy-Preserving IR: When Information Retrieval Meets Privacy and Security

Information retrieval and information privacy/security are two fast-growing computer science disciplines. There are many synergies and connections between these two disciplines, yet there have been very limited efforts to connect them. Moreover, due to the lack of mature techniques in privacy-preserving IR, concerns about privacy and security have become serious obstacles that prevent valuable user data from being used in IR research, such as studies of query logs, social media, tweets, sessions, and medical record retrieval. This privacy-preserving IR workshop aims to spur research that brings together the fields of IR and privacy/security, and to mitigate privacy threats in information retrieval by exploring novel algorithms and tools.


Luo Si is an Associate Professor at Purdue University. His research interests include information retrieval, machine learning techniques and applications, and information privacy/security, and he has more than 100 publications. In particular, he has recently worked on applications of detection and anonymization of sensitive text information, privacy-preserving information search, and malicious software detection with information retrieval and machine learning techniques. Si is an associate editor of ACM Transactions on Information Systems (TOIS) and ACM Transactions on Interactive Intelligent Systems (TIIS), and an editorial board member of Information Processing and Management (IPM). He has been an Area Chair for ACM SIGIR, WWW, WSDM and CIKM. Si was a workshop co-chair for SIGIR 2011.

Grace Hui Yang is an Assistant Professor in the Department of Computer Science at Georgetown University. Grace's current research interests include privacy-preserving information retrieval, particularly online information exposure detection that predicts the vulnerability for common Web activities and warns users about the sensitivity of their activities before they make innocent moves. Her research interests also include session search, evaluation, information organization and ontology construction. Prior to this, Grace worked on question answering, near-duplicate detection, multimedia information retrieval, and opinion and sentiment detection. The results of her research have been published in SIGIR, CIKM, ACL, TREC, ECIR, and WWW since 2002. Grace co-chaired the SIGIR 2013 and SIGIR 2014 Doctoral Consortium and serves as a program committee member for SIGIR, ACL, EMNLP, CIKM, WSDM, and KDD.



SMIR'14: Semantic Matching in Information Retrieval

Recently, significant progress has been made in research on what we call semantic matching (SM), in Web search, question answering, online advertisement, cross-language information retrieval, multimedia retrieval, and other tasks. Let us take Web search as an example of the problem. When comparing the textual content of query and documents, the simple term-based approaches can fail when searcher and author use different terms. A more realistic approach beyond bag-of-words, referred to as semantic matching (SM), is to conduct deeper query and document analysis to encode text with richer representations and then perform query-document matching with such representations. The main purpose of the workshop is to bring together IR and NLP researchers working on or interested in semantic matching, to share the latest research results, express opinions on the related issues, and discuss future directions.


Julio Gonzalo is head of nlp.uned.es, the UNED research group in Natural Language Processing and Information Retrieval. His research interests include Evaluation Methodologies and Metrics in Information Retrieval and Natural Language Processing, Cross-Language and Interactive Information Retrieval, Search Results Organization, Entity-Oriented and Semantic Search.

Hang Li is chief scientist of the Noah's Ark Lab at Huawei. His research areas include information retrieval, natural language processing, statistical machine learning, and data mining.

Alessandro Moschitti is a Senior Research Scientist at the Qatar Computing Research Institute (QCRI) and a tenured professor at the CS Department of the University of Trento, Italy. He has significant expertise in both theoretical and applied machine learning for NLP, IR and Data Mining.

Jun Xu is a researcher at Noah's Ark Lab, Huawei, Hong Kong. His research focuses on machine learning for web search and text mining.



SoMeRA'14: International Workshop on Social Media Retrieval and Analysis

The SoMeRA 2014 workshop will present and discuss cutting-edge research on all topics of retrieval, recommendation, and browsing in social media, as well as on the analysis of users' multifaceted traces in social media. In particular, novel methods and ideas that address challenges such as the large quantity and noisiness of user-generated multimedia data, user biases, the cold-start problem, or integrating contextual aspects into retrieval and recommendation techniques are highly welcome. The workshop will further foster the exchange of ideas between different communities; in particular, it aims at better connecting the multimedia and recommender systems communities with the information retrieval community. The workshop will feature both oral presentations (full papers) and poster/demo presentations (short papers).


Markus Schedl is an assistant professor at the Department of Computational Perception at the Johannes Kepler University (JKU) Linz. He graduated in Computer Science from the Vienna University of Technology and earned his Ph.D. from the JKU. He further holds a Master's degree in International Business Administration from the Vienna University of Economics and Business Administration. Markus has (co-)authored more than 80 refereed conference papers and journal articles (published, for instance, in ACM SIGIR, ECIR, Transactions on Information Systems, and Information Retrieval). Furthermore, he serves on various program committees and has reviewed submissions for several conferences and journals (among others, ACM Multimedia, ECIR, IJCAI, ICASSP, IEEE Visualization; IEEE Transactions on Multimedia, Data & Knowledge Engineering, ACM Transactions on Intelligent Systems and Technology, Multimedia Systems, Information Retrieval). His main research interests include social media mining, multimedia information retrieval, and intelligent/personalized user interfaces.

Peter Knees is an assistant professor at the Department of Computational Perception at the Johannes Kepler University Linz. He holds a Master's degree in Computer Science from the Vienna University of Technology and a Ph.D. degree from the Johannes Kepler University Linz. Since 2004, he has co-authored over 50 peer-reviewed conference and journal publications, and he serves as program committee member and reviewer for several conferences and journals relevant to the fields of music, multimedia, and text IR. Since 2009, he has organized the International Workshop on Advances in Music Information Research series. In 2010, Peter served as program chair of the 8th International Workshop on Adaptive Multimedia Retrieval. In addition to music and Web information retrieval, his research interests include multimedia, user interfaces, recommender systems, and digital media arts.

Jialie Shen is an assistant professor in Information Systems, School of Information Systems, Singapore Management University, Singapore. Dr. Shen's main research interests include information retrieval, economic-aware media analysis, and statistical machine learning. His recent work has been published or is forthcoming in leading journals and international conferences including ACM SIGIR, ACM Multimedia, ACM SIGMOD, CVPR, ICDE, WWW, IEEE Transactions on Circuits and Systems for Video Technology (IEEE TCSVT), IEEE Transactions on Multimedia (IEEE TMM), ACM Multimedia Systems Journal, ACM Transactions on Internet Technology (ACM TOIT) and ACM Transactions on Information Systems (ACM TOIS).



TAIA'14: Temporal, Social and Spatially Aware Information Access

Users provide an unprecedented volume of detailed and continuously updated information about where they are, what they are doing, who they are with, and what they are thinking and feeling about their activities. The provision of this stream creates an informal contract between the user and the information access application in which the user will provide the information, but the application must provide results that are contextually relevant. In this workshop we explore spatial and temporal context in dynamic geotagged collections, such as Wikipedia, and traditional news sources, as well as social media sites such as Twitter, Foursquare, Facebook and Flickr. To ground the workshop, and provide a locus for discussion of the two aspects of user context, we focus on event detection and recommendation. Events are a natural theme around which to center discussions of spatial and temporal context because events are defined by their time and place.


Fernando Diaz is a researcher at Microsoft Research New York. His primary research interest is formal information retrieval models. Fernando's research experience includes distributed information retrieval approaches to web search, interaction logging and modeling, interactive and faceted retrieval, mining of temporal patterns from news and query logs, cross-lingual information retrieval, graph-based retrieval methods, and synthesizing information from multiple corpora. Fernando received his PhD from the University of Massachusetts Amherst in 2008. His work on federation won best paper awards at the WSDM 2009, SIGIR 2009, and ECIR 2011 conferences. His work on crisis informatics has received awards at SIGIR 2011 and ISCRAM 2013. He is a co-organizer of the Temporal Summarization track and Web track at TREC 2013.

Claudia Hauff is an Assistant Professor in the Web Information Systems group at Delft University of Technology, the Netherlands. Her research focuses on retrieval in social media settings, in particular the exploitation of spatial knowledge. Other areas of interest are the effects of retrieval on corpora covering long periods of time and the prediction of query effectiveness. She was part of the organizing committee of SIGIR and WSDM and is currently the editor of the SIGIR mailing list. She received her PhD from the University of Twente, the Netherlands.

Vanessa Murdock is a Principal Applied Researcher for Bing, based in the Seattle area. Before joining Microsoft, she was a Senior Research Scientist at Yahoo! Research. She received her PhD from the University of Massachusetts. Her primary areas of interest are local search, personalization, and geographic modeling of social media, to understand the way that users describe and interact with places. Other areas of interest include retrieval models for short text such as ads, social media, and tagged images. She has served on the organizing committees for ECIR, SIGIR, and WSDM.

Maarten de Rijke is Professor of Computer Science at the University of Amsterdam. He has been director of the Intelligent Systems Lab Amsterdam since 2008, with around 75 researchers working on text and image search and analysis and machine learning. Prior to joining the University of Amsterdam he worked at Warwick University and at CWI Amsterdam. Maarten has published close to 600 papers on knowledge representation and information retrieval. His current research focuses on online learning to rank, semantic search, social media analysis, and predictive analytics. He is general chair for ECIR 2014 and co-chair for CIKM 2015.

Milad Shokouhi is a Senior Applied Researcher working for Bing at Microsoft Research Cambridge. He is also an honorary lecturer in the School of Computing Science at the University of Glasgow. Before joining Microsoft, he completed his PhD on federated search at RMIT University in 2007. Milad has been working on analyzing and classifying time-sensitive queries for Bing for the past 5-6 years. His other research interests include auto-completion, personalization, federated search and query reformulation. He has served on the program committee of most major IR conferences and journals.