Industry Days

Note: This year we are expanding SIRIP from one day to two to accommodate even more work by the fantastic researchers in industry.

Industry Days Schedule

All talks will be held in the Rackham Auditorium (Location on Campus Map).

Monday, July 9
10:30 AM – 11:20 AM	Keynote 1	Charu Aggarwal	IBM Research	Extracting Real-Time Insights from Graphs and Social Streams
11:20 AM – 11:50 AM	Invited Talk 1	Suju Rajan	Criteo	Search on Retail Sites
11:50 AM – 12:10 PM	Talk 1	Ajeet Grewal and Jimmy Lin	Twitter	The Evolution of Content Analysis for Personalized Recommendations at Twitter
12:10 PM – 1:30 PM	Lunch (not provided)
1:30 PM – 2:20 PM	Keynote 2	Xiansheng Hua	Alibaba	The City Brain – Towards Real-Time Search for the Real-World
2:20 PM – 2:40 PM	Talk 2	Konstantine Arkoudas and Mohamed Yahya	Bloomberg	Auto-completion for Question Answering Systems at Bloomberg
2:40 PM – 3:00 PM	Talk 3	Puneet Agrawal and Manoj Kumar Chinnakotla	Microsoft	Lessons from Building a Large-scale Commercial IR-based Chatbot for an Emerging Market
3:00 PM – 3:30 PM	Coffee Break (League Concourse, Floor 2)
3:30 PM – 3:50 PM	Talk 4	David Carmel, Liane Lewin-Eytan and Yoelle Maarek	Amazon	Product Question Answering Using Customer Generated Content – Research Challenges
3:50 PM – 4:10 PM	Talk 5	Ted Yuan and Zezhong Zhang	eBay Inc.	Merchandise Recommendation For Retail Events With Word Embedding Weighted Tf-idf And Dynamic Query Expansion
4:10 PM – 4:30 PM	Talk 6	Ganesh Venkataraman, James Wong and Brendan Collins	Airbnb Inc.	Large Scale Search Engine Marketing (SEM) at Airbnb
4:30 PM – 5:00 PM	Invited Talk 2	Rosie Jones	Spotify	Searching for Hipster Barbecue Music

Tuesday, July 10
10:30 AM – 11:20 AM	Keynote 3	Jieping Ye	Didi	AI for Transportation
11:20 AM – 11:50 AM	Invited Talk 3	Emre Kiciman	Microsoft	Causal Inference over Longitudinal Data to Support Expectation Exploration
11:50 AM – 12:10 PM	Talk 7	Inho Kang	Naver Corporation	Clova: Services and Devices powered by AI
12:10 PM – 1:30 PM	Lunch (not provided)
1:30 PM – 2:20 PM	Keynote 4	Rajeev Rastogi	Amazon	Machine Learning @ Amazon
2:20 PM – 2:40 PM	Talk 8	Perry Samson and Charles Bassam	Echo360 Inc.	LessonWare: Mining Student Notes to Provide Personalized Feedback
2:40 PM – 3:00 PM	Talk 9	Sahin Geyik, Qi Guo, Bo Hu, Cagri Ozcaglar, Ketan Thakkar, Ryan Wu and Krishnaram Kenthapadi	LinkedIn	Talent Search and Recommendation Systems at LinkedIn: Practical Challenges and Lessons Learned
3:00 PM – 3:30 PM	Coffee Break (League Concourse, Floor 2)
3:30 PM – 4:30 PM	Panel

Industry Keynote 1

Dr. Charu C. Aggarwal
Extracting Real-Time Insights from Graphs and Social Streams

Many streaming applications in social networks, communication networks, and information networks are built on top of large graphs. Such large networks contain continuously occurring processes, which lead to streams of edge interactions and posts. For example, the messages sent by participants on Facebook to one another can be viewed as content-rich interactions along edges. Such edge-centric streams are referred to as graph streams or social streams. The aggregate volume of these interactions can scale up super-linearly with the number of nodes in the network, which makes the problem more pressing for rapidly growing networks. These continuous streams may be mined for useful insights. In these cases, real-time analysis is crucial because of the time-sensitive nature of the interactions. However, generalizing conventional mining applications to such graphs turns out to be a challenge because of the expensive nature of graph mining algorithms. We discuss recent advances in several graph mining applications like clustering, classification, link prediction, event detection, and anomaly detection in real-time graph streams.

Charu C. Aggarwal is a Distinguished Research Staff Member (DRSM) at the IBM T. J. Watson Research Center in Yorktown Heights, New York. He completed his undergraduate degree in Computer Science from the Indian Institute of Technology at Kanpur in 1993 and his Ph.D. from the Massachusetts Institute of Technology in 1996. He has worked extensively in the field of data mining. He has published more than 350 papers in refereed conferences and journals and authored over 80 patents. He is the author or editor of 18 books, including textbooks on data mining, recommender systems, machine learning (for text), and outlier analysis. Because of the commercial value of his patents, he has thrice been designated a Master Inventor at IBM. He is a recipient of an IBM Corporate Award (2003) for his work on bio-terrorist threat detection in data streams, a recipient of the IBM Outstanding Innovation Award (2008) for his scientific contributions to privacy technology, and a recipient of two IBM Outstanding Technical Achievement Awards (2009, 2015) for his work on data streams/high-dimensional data. He received the EDBT 2014 Test of Time Award for his work on condensation-based privacy-preserving data mining. He is also a recipient of the IEEE ICDM Research Contributions Award (2015), which is one of the two highest awards for influential research contributions in the field of data mining. He has served as the general co-chair of the IEEE Big Data Conference (2014) and as the program co-chair of the ACM CIKM Conference (2015), the IEEE ICDM Conference (2015), and the ACM KDD Conference (2016). He serves as the editor-in-chief of the ACM Transactions on Knowledge Discovery from Data as well as the ACM SIGKDD Explorations. He is a fellow of the SIAM, ACM, and the IEEE, for “contributions to knowledge discovery and data mining algorithms”.

Industry Keynote 2

Dr. Xiansheng Hua
The City Brain – Towards Real-Time Search for the Real-World

A city is an aggregate of a huge amount of heterogeneous data. However, extracting meaningful values from that data remains challenging. City Brain is an end-to-end system whose goal is to glean irreplaceable values from big city data, specifically from videos, with the assistance of rapidly evolving AI technologies and fast-growing computing capacity. From cognition to optimization to decision-making, from search to prediction and ultimately to intervention, City Brain improves the way we manage the city, as well as the way we live in it. In this talk, we will first introduce current practices of the City Brain platform in a few cities in China, including what we can do to achieve the goal and make it a reality. Then we will focus on visual search technologies and applications that we can apply to the city data. Last, a few video demos will be shown, followed by highlighting a few future directions of city computing.

Dr. Xian-Sheng Hua is now a Distinguished Engineer/VP of Alibaba Group, leading the visual computing team in DAMO Academy, working on large-scale visual intelligence on the cloud, including the City Brain project. Dr. Hua is an IEEE Fellow and ACM Distinguished Scientist. He received his B.S. degree in 1996, and the Ph.D. degree in applied mathematics in 2001, both from Peking University, Beijing, China. He joined Microsoft Research Asia, Beijing, China, in 2001, as a Researcher. He was a Principal Research and Development Lead in Multimedia Search for the Microsoft search engine, Bing, Redmond, WA, USA, from 2011 to 2013. He was a Senior Researcher with Microsoft Research Redmond, Redmond, WA, USA, from 2013 to 2015.

He has authored or coauthored more than 200 research papers and has filed more than 90 patents. His research interests include big multimedia data search, advertising, understanding, and mining, as well as pattern recognition and machine learning. Dr. Hua served or is now serving as an Associate Editor for the IEEE Trans. on Multimedia and ACM Transactions on Intelligent Systems and Technology. He served as a Program Co-Chair for IEEE ICME 2013, ACM Multimedia 2012, and IEEE ICME 2012. He was one of the recipients of the 2008 MIT Technology Review TR35 Young Innovator Award for his outstanding contributions to video search. He was the recipient of the Best Paper Awards at ACM Multimedia 2007, and Best Paper Award of the IEEE Trans. on CSVT in 2014. Dr. Hua will be serving as general co-chair of ACM Multimedia 2020.

Industry Keynote 3

Dr. Jieping Ye
AI for Transportation

Didi Chuxing is the largest ride-sharing platform in China, providing transportation services for over 400 million users. Every day, Didi Chuxing’s platform generates over 70 TB worth of data, processes more than 40 billion routing requests, and produces over 15 billion location points. In this talk, I will explain how Didi Chuxing applies big data and AI technologies to analyze big transportation data and improve the travel experience for millions of users.

Dr. Jieping Ye is head of Didi AI Labs, a VP of Didi Chuxing, and a Didi Fellow. He is also an associate professor at the University of Michigan, Ann Arbor. His research interests include big data, machine learning, and data mining with applications in transportation and biomedicine. He has served as a Senior Program Committee/Area Chair/Program Committee Vice Chair of many conferences including NIPS, ICML, KDD, IJCAI, ICDM, and SDM. He serves as an Associate Editor of Data Mining and Knowledge Discovery, IEEE Transactions on Knowledge and Data Engineering, and IEEE Transactions on Pattern Analysis and Machine Intelligence. He won the NSF CAREER Award in 2010. His papers have been selected for the outstanding student paper at ICML in 2004, the KDD best research paper runner up in 2013, and the KDD best student paper award in 2014.

Industry Keynote 4

Dr. Rajeev Rastogi
Machine Learning @ Amazon

In this talk, I will first provide an overview of key problem areas where we are applying Machine Learning (ML) techniques within Amazon such as product demand forecasting, product search, and information extraction from reviews, and associated technical challenges. I will then talk about three specific applications where we use a variety of methods to learn semantically rich representations of data: question answering where we use deep learning techniques, product size recommendations where we use probabilistic models, and fake reviews detection where we use tensor factorization algorithms.

Rajeev Rastogi is a Director of Machine Learning at Amazon, where he is developing ML platforms and applications for the e-commerce domain. Previously, he was Vice President of Yahoo! Labs Bangalore and the founding Director of the Bell Labs Research Center in Bangalore, India. Rajeev is an ACM Fellow and a Bell Labs Fellow. He is active in the fields of databases, data mining, and networking, and has served on the program committees of several conferences in these areas. He currently serves on the editorial board of the CACM, and has been an Associate Editor for IEEE Transactions on Knowledge and Data Engineering in the past. He has published over 125 papers and holds over 50 patents. Rajeev received his B. Tech degree from IIT Bombay, and a PhD degree in Computer Science from the University of Texas, Austin.

Industry Invited Talk 1

Dr. Suju Rajan
Search on Retail Sites

Criteo powers the sponsored product module for a number of retailers. In this talk, we will highlight some of the interesting aspects of search queries on retail sites and how certain aspects of the query become important. We will also give an initial preview of the work done at Criteo Research to improve the performance of retrieval for sponsored products.

Suju Rajan is the VP, Head of Research at Criteo. At Criteo, her team works on all aspects of performance-driven computational advertising, including real-time bidding, large-scale recommendation systems, auction theory, reinforcement learning, online experimentation, metrics, and scalable optimization methods. Prior to Criteo, she was the Director of the Personalization Sciences at Yahoo Research, where her team worked on personalized recommendations for several Yahoo products.

Industry Invited Talk 2

Dr. Rosie Jones
Searching for Hipster Barbecue Music

Spotify is an audio streaming platform, connecting music creators and podcasters to a huge international audience streaming audio over large numbers of genres, contexts and moods.

Spotify recommends audio content to our users, and our users request audio with voice and text queries. Sometimes a band or song name is hard to pronounce, or hard to remember, and sometimes users have a need which is simultaneously both vague and specific, such as “hipster barbecue music”.

In this talk I will go over the broad space of research problems at Spotify in audio understanding and recommendation, as well as drilling down into particular language, speech and document understanding problems, which highlight interesting new research areas.

Dr. Rosie Jones is Director of Research at Spotify, where she leads a team of research scientists working on search and language technologies for music and podcasts. Before that, she worked at Microsoft, in the New England Research and Development Center (NERD), on conversational understanding. She has also led a team conducting research on large-scale machine learned models for display advertising, at Akamai and MediaMath. Dr. Jones also spent 8 years as a scientist at Yahoo! Labs, working on search and sponsored search. Her research interests include online user behavior, conversational understanding systems, web search and natural language processing. Dr. Jones has a PhD in Language and Information Technologies from Carnegie Mellon University and is a Senior Member of the ACM.

Industry Invited Talk 3

Dr. Emre Kiciman
Causal Inference over Longitudinal Data to Support Expectation Exploration

Many people use web search engines for expectation exploration: exploring what might happen if they take some action, or how they should expect some situation to evolve. While search engines have databases to provide structured answers to many questions, there is no database about the outcomes of actions or the evolution of situations. The information we need to answer such questions, however, is already being recorded. On social media, for example, hundreds of millions of people are publicly reporting about the actions they take and the situations they are in, and an increasing range of events and activities experienced in their lives over time. In this presentation, we show how causal inference methods can be applied to such individual-level, longitudinal records to generate answers for expectation exploration queries.

Emre Kıcıman is a principal researcher at Microsoft Research AI, where he works at the intersection of social computing, machine learning, and information retrieval. His research focus is on causal analysis of large-scale datasets, as well as the broader implications of AI on people and society. Emre’s past research includes entity linking methods, deployed in the Bing search engine; and foundational work on applying machine learning to fault management in large-scale internet services.