Cloze tests, or fill-in-the-blank multiple-choice exercises, use sentences that are either extracted directly from a text or constructed from its content, with the tested words replaced by a gap. The reader is typically given a list of alternatives to choose from, consisting of one correct answer and a number of distractors. For example, a cloze test assessing command of prepositions may include the following question (from Lee & Seneff, 2007):
If you don't have anything planned for this evening, let's go __ a movie.
with the following options to choose from:
(a) to
(b) of
(c) on
(d) null
Cloze tests have been shown to be an effective means of assessing reading comprehension. In the past, such tests were mainly used to assess the language proficiency of non-native readers, but nowadays they are also used to assess machine comprehension (Hermann et al., 2015).
In the example above, the distractors have been generated automatically, each using a different approach to distractor selection. Option (a) is the correct preposition in this case; option (b) is chosen because its frequency is comparable to that of the correct preposition, so the two might be confused by non-native readers; distractor (c) is generated using collocation similarity between the two prepositions; finally, distractor (d) is generated from confusion probabilities estimated on non-native data. This project will look into automated cloze test generation for a wide variety of lexical items, and in particular for open class words (verbs, nouns, adjectives and adverbs). As a starting point, the following methods may be applied to this task:
A successful cloze test generation system should be able to identify the fragments of text to be tested and to generate distractors that are plausible enough to be picked by readers, yet yield an incorrect or ungrammatical sentence when inserted into the gap. The project will address these challenges.
A dataset of cloze tests used in language exams will be provided.
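As a rough illustration of the frequency-based distractor selection described above, the sketch below blanks out a target word and picks the candidates whose corpus frequency is closest to that of the target. It is a minimal sketch under stated assumptions, not the method of Lee & Seneff (2007): the function names (`tokenize`, `make_cloze_item`), the toy corpus, and the candidate list of prepositions are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def make_cloze_item(sentence, target, corpus_tokens, candidates, n_distractors=3):
    """Blank out `target` in `sentence` and select distractors from `candidates`
    whose corpus frequency is closest to the frequency of `target`."""
    freq = Counter(corpus_tokens)
    # Rank candidates by absolute frequency difference; break ties alphabetically.
    ranked = sorted(
        (w for w in candidates if w != target),
        key=lambda w: (abs(freq[w] - freq[target]), w),
    )
    distractors = ranked[:n_distractors]
    # Replace only the first whole-word occurrence of the target with a gap.
    stem = re.sub(rf"\b{re.escape(target)}\b", "__", sentence, count=1)
    return {
        "stem": stem,
        "answer": target,
        "options": sorted([target] + distractors),
    }


# Hypothetical toy corpus and candidate set (assumptions for illustration only).
TOY_CORPUS = (
    "we went to the park on sunday . the cover of the book is on the table . "
    "she gave the keys to the owner of the house in the morning . "
    "he waited at the station for an hour with a friend ."
)
PREPOSITIONS = ["to", "of", "on", "in", "at", "for", "with"]

item = make_cloze_item(
    "If you don't have anything planned for this evening, let's go to a movie.",
    target="to",
    corpus_tokens=tokenize(TOY_CORPUS),
    candidates=PREPOSITIONS,
)
print(item["stem"])
print(item["options"])
```

Frequency alone says nothing about whether a distractor actually makes the sentence incorrect, so a real system would still need to check each candidate in context, for example with collocation statistics or a language model.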
Hermann et al. (2015). Teaching Machines to Read and Comprehend
Lee & Seneff (2007). Automatic Generation of Cloze Items for Prepositions
Zweig & Burges (2011). The Microsoft Research Sentence Completion Challenge
A comprehensive reading list summarising past research on cloze test and distractor generation can be found here.