Natural language processing for information retrieval

David D. Lewis, Karen Spärck Jones

July 1993, 22 pages

Abstract

The paper summarizes the essential properties of document retrieval and reviews both conventional practice and research findings, the latter suggesting that simple statistical techniques can be effective. It then considers the new opportunities and challenges presented by the ability to search full text directly (rather than, for example, titles and abstracts), and suggests appropriate approaches to doing so, with a focus on the role of natural language processing. The paper also comments on possible connections with data and knowledge retrieval, and concludes by emphasizing the importance of rigorous performance testing.

Full text

PS (0.1 MB)

BibTeX record

@TechReport{UCAM-CL-TR-307,
  author =      {Lewis, David D. and Sp{\"a}rck Jones, Karen},
  title =       {{Natural language processing for information retrieval}},
  year =        1993,
  month =       jul,
  url =         {http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-307.ps.gz},
  institution = {University of Cambridge, Computer Laboratory},
  number =      {UCAM-CL-TR-307}
}