Methods for Mining and Summarizing Text Conversations

Bio:

Dr. Giuseppe Carenini is an associate professor in computer science at the University of British Columbia, UBC (BC, Canada), with broad interdisciplinary interests. His work on combining natural language processing and information visualization to support decision making has been published in over 70 peer-reviewed papers. Dr. Carenini was the area chair for “Sentiment Analysis, Opinion Mining, and Text Classification” at ACL 2009 and the area chair for “Summarization and Generation” at NAACL 2012. He recently co-edited an ACM-TIST Special Issue on “Intelligent Visual Interfaces for Text Analysis”. In July 2011, he published a co-authored book, “Methods for Mining and Summarizing Text Conversations”. Dr. Carenini has also collaborated extensively with industrial partners, including Microsoft and IBM.

Dr. Gabriel Murray recently joined the Department of Computer Science at the University of the Fraser Valley, UFV (BC, Canada), as an assistant professor. His background is in natural language processing as well as theoretical linguistics. Dr. Murray has an established research record in the area of automatic summarization, with particular attention to summarization of noisy genres such as speech and web data, and to the comparison of abstractive and extractive techniques. He did his graduate studies at the University of Edinburgh under Dr. Steve Renals, and was a member of the EU-funded AMI project on studying multimodal interaction. Before joining UFV, Dr. Murray was a researcher at UBC with the NSERC Business Intelligence Network on intelligent data management and decision making. While at UBC, he also gained substantial teaching experience. In July 2011, Dr. Murray published a co-authored book, “Methods for Mining and Summarizing Text Conversations”.

Summary:

People attending this tutorial will learn about a set of computational methods for extracting information from conversational data and for providing natural language summaries of the data. The tutorial will start with an overview of basic concepts by examining a simple written conversation. We will clarify fundamental distinctions, for instance between topic segmentation and topic labelling, and between extractive and abstractive summaries.

After this, students will learn about metrics for evaluating the effectiveness of summarization and various extraction tasks. They will also become familiar with some of the benchmark corpora used in the literature.

In the second part of the tutorial, students will learn about extraction and mining methods for performing subjectivity and sentiment detection, topic segmentation and modeling, and the extraction of conversational structure. Our focus will be on clarifying how methods developed for generic text can be extended to work on conversational data, such as meeting transcripts (which exemplify synchronous conversations) and emails (which exemplify asynchronous conversations). Very recent approaches for dealing with blogs, discussion forums and microblogs (e.g., Twitter) will also be discussed by interactively exploring several examples.

In the third part of the tutorial, students will learn about natural language summarization of conversational data. We will first provide a critical overview of several extractive and abstractive summarizers developed for emails, meetings, blogs and forums. Then, we will describe our own recent attempts at building multi-modal summarizers.

At the end of the tutorial, we will engage students in a discussion on the future of research on mining and summarizing conversations, with a special focus on how this research can be informed by, and be beneficial to, research and applications in information retrieval.

This tutorial should be suitable for researchers who have a background in Computer Science, Information Science or Linguistics, but only minimal exposure to Natural Language Processing. We assume the audience is at least somewhat familiar with basic probability and basic machine learning.