SIRIP 2014 Program

Updated: 02 Jul 2014

The goal of SIRIP is to bring together IR researchers, practitioners, analysts, and consumers and to achieve knowledge transfer across the boundaries. Ideally, everyone goes away with new understanding and at least one new idea to think about or work on. SIRIP is being held on a separate day from the main conference and industry people are able to register just for it.

SIRIP 2014 is a one-day event starting at 0900 on Monday, 07 Jul. The information below is organized in program order and includes start times for individual talks. Speakers, please check your presentation during the break before your session, and make yourself known to the session chair.

[09.00] Welcome and Introduction to SIRIP 2014

Isabelle Moulinier & David Hawking

Chair Session 1: Mark Sanderson, RMIT University

[09.15] “OK Glass…Google…Why Do I Need Your Search?”

The rise of wearable technology signifies yet another shift in the content consumption and generation habits of consumers and professionals alike. Such devices and solutions have limited user interaction touch points as a result of smaller form factors and/or have relatively few gestures available given the limited surface area. Even solutions like Google Glass utilize less than perfect voice functionality to stretch that functionality into capabilities such as navigation and search. In fact, one could argue whether user directed search is even worth creating as a primary use case for these types of devices. Instead, the explosion in wearable technology is the perfect inflection point for the rise of anticipatory computing and zero query search - where in the near future, our preferences, search history, and expectations (taken from our behaviors on PCs, laptops, mobile phones, tablets, etc.) will be largely known ahead of time and pushed to us on these wearable devices without our need to ask, hence rendering user initiated search completely irrelevant.

Bob Schukai is the Head of Advanced Product Innovation for Thomson Reuters. In addition to overseeing the development and execution of the mobile growth strategy across the organization, his remit also includes the development of new capabilities and experiences around data visualization and predictive analytics for desktop, mobile, wearable devices, and other products. He serves as an ambassador to New York City and the east coast of the United States for the London Tech City initiative, driven by the British government and the United Kingdom Trade and Investment group. This ‘extraordinary’ second job, as he calls it, allows him to share the story of innovation in Britain and why he advocates that ambitious global companies should operate here to benefit from the right environment for success. In the Queen’s New Year Honours list in 2014, Schukai was awarded an MBE for outstanding community service. He leads Thomson Reuters’ efforts in their headline sponsorship of the Apps for Good programme, designed to transform the way technology is taught in schools and to grow future leaders in mobile technology development.

[10.00] Chinese Search Engine - Baidu's Practice

Over the past decade, Baidu has grown from a startup into the world’s biggest Chinese search engine, serving over 600 million users. Meanwhile, Baidu has witnessed a dramatic transformation in how people surf the internet and interact with search engines. The essence of Baidu’s success is its know-how about Chinese users’ behavior. A typical Chinese internet user spends about the same amount of time online per week as a typical US internet user does. But when it comes to search engines, Chinese internet users have some unique characteristics that differ from those of their global counterparts. To provide the best way for people to find information, Baidu is trying to better understand Chinese users, Chinese queries and Chinese web pages. In this talk, I will share our findings and discuss challenges in Chinese search. I will also describe Baidu's practice in developing a Chinese search engine.

Dr WANG Haifeng is a vice president of Baidu, the chair of the Department of Language Information Engineering of Peking University, and a visiting professor of Harbin Institute of Technology. In 1999, he received his PhD in Computer Science from Harbin Institute of Technology. Soon after, he worked as an associate researcher at Microsoft Research China from 1999 to 2000, a research scientist at (Hong Kong) from 2000 to 2002, and the chief research scientist and deputy director at Toshiba (China) R&D Center till Jan. 2010. Dr. Wang is also the immediate past president of the Association for Computational Linguistics (ACL). He has served as program chair, workshop chair, tutorial chair, area chair, industry chair, and sponsorship chair for several top conferences including SIGIR, ACL, IJCAI, KDD, COLING, IJCNLP etc., as well as associate editor, guest editor and reviewer for some academic journals.

[10.45] Coffee

Chair Session 2: Vanessa Murdoch, Bing USA

[11.15] “Computer says no.”

We live in the information age where individuals have instant access to large volumes of information. This increased accessibility is leading to a significant transformation that is affecting how the public sector operates and in particular, information overload is becoming increasingly common. As a result, decision-support technology is being employed to decipher value from amassed information and to better support decision-makers. Administrative law provides a framework for ensuring that government decisions are lawful, free from bias, rational, effective, efficient, open and fair, and that decision-makers are held accountable. Administrative law must keep up with changing technology but equally, fit-for-purpose technology also needs to be developed for the administrative environment. In particular, the fairness of decisions needs to be considered when building increasingly complex technology for use by administrative decision-makers. For example, a recommendation made by a machine needs to be balanced with the decision-maker’s own interpretation of evidence. A more flexible approach is needed in this dynamic environment and academics and technologists need to better understand the constraints of an administrative environment in order to assist government in maintaining effective decision-making under the rule of law. What is required is improved partnerships between government and the information research and development community to make sure that government information is exploited for the benefit of the public, and that systems are developed in a way that preserves the integrity and fairness of decision-making.

Dr Maria Milosavljevic is the Chief Information Officer at the Australian Crime Commission. Her experience includes senior roles across government, industry and academia with responsibility for delivering innovative solutions. As the Program Manager for the Crime Commission’s Fusion Centre, Dr Milosavljevic delivered new capabilities across the organisation that resulted in significantly improved business practices. This included establishing advanced information exploitation systems and teams including a new analytics unit. Dr Milosavljevic has more than 20 years of experience, has published widely and has created several world-firsts. She completed a PhD in Language Technology with a scholarship from Microsoft Research Institute, and is currently completing the ANZSOG Executive Masters of Public Administration at ANU.

[11.45] Describe, Discover and Deliver — Challenges in making content available in the digital age.

The State Library of New South Wales has recently embarked on an ambitious programme to digitise 52 of its most significant collections and make them available to all. This expansion of collection content available online brings with it many challenges around how we can manage to describe these collections, make them discoverable and to deliver them in ways that make sense to our users. This paper is about the development of a discovery platform for one part of our collections and the lessons we learnt in that process.

Kate Curr is currently Manager, Online Information Services at the State Library of NSW where she is responsible for managing the Library's websites and for the Library Management Systems. She had a career in special, academic, and Parliamentary libraries before joining the State Library in 2008. She has been involved in Library systems, database design and development and online services since the late 1980s. She has a particular interest in search and discovery for Library collections and is currently involved with the State Library’s Digital Excellence Programme, a programme of work that will create over 20 million digital objects from the Library’s vast collections over the next 10 years. This programme of work is also replacing the technology infrastructure used to store these collections and make them accessible to the world.

[12.15] This ain't your father's search engine

In just a few short years, search has quickly evolved from being a small text box in the nether regions of a website to being front and center in our lives. Increasingly, however, search engine technology is also being used for practical, real time recommendations, events processing, complex spatial functionality and time series analysis capable of not only matching users' queries in text, but also driving real time decision making and analytics. In fact, open source Apache Lucene can do all of this and more by taking advantage of new data structures and algorithms that complement more traditional IR approaches. In this demo-driven talk, Lucene committer Grant Ingersoll will take a look at some of the new and exciting ways users are leveraging Lucene and related technology to drive deeper insight into information needs that go beyond keywords in a text box.

Grant Ingersoll is the CTO and co-founder of LucidWorks as well as an active member of the Lucene community – a Lucene and Solr committer, co-founder of the Apache Mahout machine learning project and a long standing member of the Apache Software Foundation. Grant’s prior experience includes work at the Center for Natural Language Processing at Syracuse University in natural language processing and information retrieval. Grant earned his B.S. from Amherst College in Math and Computer Science and his M.S. in Computer Science from Syracuse University. Grant is also the co-author of “Taming Text” from Manning Publications.

[12.45] Lunch

Chair Session 3: David Harper, Google Europe

[14.00] The Evolution of WTF: Follower Recommendation Services at Twitter

WTF (Who to Follow) is Twitter's user recommendation service, which is responsible for creating millions of connections daily between users based on shared interests, common connections, and other related factors. In this talk I will discuss the evolution of the WTF service: the first generation architecture depended on a system called Cassovary, an open-source in-memory graph processing engine built from scratch by Twitter specifically for WTF. This approach gave way to a Hadoop-based machine learning framework, which has recently been supplemented by a custom architecture for generating real-time recommendations. I will discuss the tradeoffs between different architectures, provide a general overview of algorithms, and share lessons learned in running a large-scale production service.

Jimmy Lin is an Associate Professor in the College of Information Studies (The iSchool) at the University of Maryland, with a joint appointment in the Institute for Advanced Computer Studies (UMIACS) and an affiliate appointment in the Department of Computer Science. He graduated with a Ph.D. in Electrical Engineering and Computer Science from MIT in 2004. Lin's research lies at the intersection of information retrieval and natural language processing; his current work focuses on large-scale distributed algorithms and infrastructure for data analytics. From 2010 to 2012, Lin spent an extended sabbatical at Twitter, where he worked on services designed to surface relevant content to users and analytics infrastructure to support data science. He continues to engage with Twitter on various aspects of big data and data science.

[14.30] Refereed Papers

This year we have solicited scientific papers relating to industry applications of IR and will run a session of short papers. This potentially allows for communication of impactful work to an industry audience, dissemination of late-breaking news, interesting work on closed data sets, and scientific evaluation of theories in practice. We hope to have the accepted papers and invited talk abstracts included in the ACM Digital Library. In the interim, you can find them here.

Accepted Papers
[14.30] Jing Bai, Jan Pedersen and Mao Yang: Web-Scale Semantic Ranking (Bing, Microsoft)
[14.45] Ramik Sadana, Bongwon Suh, Eunyee Koh, and Yekyung Kim: A Visual Analytics Approach to Summarizing Tweets (Georgia Tech, Seoul National Uni, Adobe)
[15.00] Yangjie Yao and Aixin Sun: Product Name Recognition and Normalization in Internet Forums (Nanyang Tech Uni, Singapore)
[15.15] Ahmed Tawfik and Ahmed Kamel: On the Interaction between Query Language and Query Domain in Cross-Lingual Web Search (Microsoft Egypt)

[15.30] Coffee

Chair Session 4: Chengxiang Zhai, U. Illinois

[16.00] Bing Dialog — Toward richer interactions with Web search

As the SIGIR community celebrates the 21st birthday of web search, the traditional gateway to adulthood, we are witnessing dramatic changes in how people interact with search engines. A multi-year initiative at Microsoft (called Bing Dialog) aims to support much richer forms of interaction. It aims to match a user’s search intents to the knowledge harvested from the web at the semantic level. Aside from reactively retrieving information and answering questions, the Bing Dialog model includes additional dialog acts, such as confirmation, disambiguation, refinement and digression, that the search engine can execute proactively to expedite the process of getting users the knowledge they need. Essentially, the search engine becomes a collaborative dialog agent, such as those explored in the AI community but with the scale extended to the entire web. In this talk, we will share our findings based on the deployment data collected at Bing EN-US for over 14 months, and discuss the web-scale engineering challenges and technically unsolved problems in knowledge acquisition, user intent inference, behavioral modeling, interaction management and metric development.

Kuansan Wang is a Principal Researcher and manager of the Internet Service Research Center (ISRC) at Microsoft Research (MSR), Redmond. He joined MSR Speech Technology Group in 1998, conducting research in the areas of speech recognition, spoken language understanding and multimodal dialog. From 2004 to 2007, he was a software architect at speech product and business incubation groups, helping create and commercialize a wide range of award winning speech products for Microsoft. Since 2007, he has been with MSR ISRC conducting research on web search and machine learning. Dr. Wang is an active member in both academic and industrial communities. He has published more than 50 peer-reviewed articles and 140 patents. He is also the author of 6 ISO and 3 W3C standards in the area of speech processing and voice communications.

[16.30] Panel: Billionaire or Bust? Commercializing IR Research

Many academic researchers have been involved in efforts to commercialize their research through start-ups, spin-offs, or licensing. We are lining up a sample of them to discuss topics such as: What prompted them to make this move? (Have you thought of doing it yourself?) How did they choose between investment and organic growth? Were they able to secure funding without risking their house? What are the hurdles, traps and opportunities? What lessons did they learn? Are they rich now?

Stuart Beil will moderate the panel. He is Senior Policy Advisor to the Hon. Ian Walker MP, Qld Minister for Science, Information Technology, Innovation and the Arts. Stuart is an experienced company director and senior executive, having worked in both industry and government. He was previously Chairman and Executive Director of Funnelback Pty Ltd, a web and enterprise search engine company he founded and sold. Stuart was also General Manager, Commercialisation at Australia's premier science agency CSIRO where he was responsible for commercialising CSIRO intellectual property. He has worked in the financial markets, including at the Sydney Futures Exchange.

Confirmed Panelists
Michael Cameron, Australia
Michael Cameron is the co-founder of Rome2rio, based in Melbourne, Australia. Rome2rio is organising the world's transport information and offers a multi-modal, door-to-door travel search engine. It returns itineraries for air, train, coach, ferry, mass transit and driving options to and from any location. Michael has a PhD from RMIT University and worked for three years as a senior engineer on Microsoft's Bing search engine.
Arjen de Vries, Netherlands
Spinque - Search by
Arjen P. de Vries leads the Information Access research group at the Centrum Wiskunde & Informatica (CWI) in Amsterdam. He also holds a part-time full professor position at Delft University of Technology. In November 2009, he co-founded CWI spin-off company Spinque to pursue his interest in the integration of information retrieval and databases. Spinque develops novel search solutions based on "Search by Strategy", an iterative 2-stage search process that separates search strategy definition (the how) from actual searching and browsing the collection (the what). This way, information specialists can reclaim their expertise in a time dominated by a "do-it-yourself" attitude to search. The technology builds on research in information retrieval (probabilistic relational algebra) and database architecture (column-stores), to turn the engineering of tailored search engines into a simple, flexible and efficient process.
David Lewis, USA
David D. Lewis Consulting
Dave Lewis, Ph.D. is a consulting computer scientist and expert witness working in the areas of information retrieval, data mining, natural language processing, and the evaluation of complex information systems. He formerly held research positions at AT&T Labs, Bell Labs, and the University of Chicago. He was the co-founder of Ornarose, Inc., a data mining software company, and has served as a consultant or advisor to a number of start-up companies. He is a Fellow of the American Association for the Advancement of Science.
Tetsuya Sakai, Japan
Tetsuya Sakai
Researcher at the Toshiba R&D Center (2000-2001 Postdoc visiting researcher at the University of Cambridge)
Director, Natural Language Lab at NewsWatch, Inc. (a Japan-US joint venture later acquired by Yahoo! Japan)
Lead researcher, Microsoft Research Asia
Associate Professor, Department of Computer Science and Engineering, Waseda University

[Approx. 17.30] Introduction to Next Year's Chairs & Feedback on SIRIP 2014

Jaime Teevan and Hang Li are the chairs for next year. This is their opportunity to introduce themselves and to hear any thoughts you have on the format of this year's track. For example, did you like:

  • Holding it on a separate day to the main conference and encouraging participation by local industry
  • Including presentations from IR consumers / practitioners rather than just researchers and technologists
  • Including a refereed papers section
  • Including a panel

Of course you're very welcome to mail comments and suggestions to the email alias below and we'll pass them on.

SIRIP 2014 Co-Chairs

Isabelle Moulinier, Thomson Reuters

David Hawking, Microsoft (Bing) and Australian National University

