A wide variety of information access and data mining tasks can be viewed as text classification problems. This perspective allows machine learning techniques to be used to reduce manual effort. Attendees of this tutorial will learn what machine learning can and can't do, how to choose learning techniques and software, and processes and techniques for improving their effectiveness. Examples will be drawn from areas such as knowledge management, customer service, web directories, alerting and news services, filtering, bioinformatics, information security, and survey research. I will end by discussing areas where research progress could greatly aid text classification in operational settings.

David D. Lewis is a consultant on information retrieval, data mining, and natural language processing. He works with organizations of all sizes on the design, implementation, acquisition, and operation of systems for manipulating and mining text data. Lewis has published more than forty scientific papers and holds six patents on information retrieval and text mining technology. He helped organize several U.S. government evaluations of language processing technologies, and created several widely used test collections. Prior to setting up a consulting practice, he held positions at AT&T Labs, Bell Labs, and the University of Chicago.