Readability metrics have a long history, including the Gunning fog index (1952), SMOG (1969), Flesch-Kincaid (1975) and the Coleman–Liau index (1975), as well as modern alternatives such as the Lexile Text Measure or ATOS, and newer machine-learning and NLP-based approaches. Such metrics can form the basis of a readability model that classifies text into CEFR levels. François & Miltsakaki (2012) describe this kind of experiment for French and include a review of previous related work for English.
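As one concrete point of reference, the sketch below computes the Flesch-Kincaid grade level for a piece of English text. The syllable counter is a crude vowel-group heuristic rather than a dictionary lookup, so the result is an approximation only.

```python
import re

def count_syllables(word):
    # Crude heuristic: count groups of consecutive vowels; a real system
    # would use a pronunciation dictionary (e.g. CMUdict).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    # Simple regex-based sentence and word splitting; a production system
    # would use a proper tokeniser.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Standard Flesch-Kincaid grade-level formula.
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)

print(flesch_kincaid_grade("The cat sat on the mat. It was warm and happy."))
```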
Possible choices for training data include: 1) existing texts from textbooks, and 2) successful written productions by learners.
A successful metric could be useful for assessing the suitability of texts for exams and textbooks, warning readers about the difficulty of text on webpages, assessing newly written scripts (e.g., for self-assessment), and so on.
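To make the classification idea concrete, the sketch below trains a standard classifier on a handful of invented texts with made-up CEFR labels, using a few simple surface features; a real model would need far more data and richer features (readability scores, lexical frequency bands, syntactic measures).

```python
import re
from sklearn.linear_model import LogisticRegression

def surface_features(text):
    # Toy features: mean sentence length, mean word length, type/token ratio.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text.lower())
    return [
        len(words) / max(1, len(sentences)),
        sum(len(w) for w in words) / max(1, len(words)),
        len(set(words)) / max(1, len(words)),
    ]

# Hypothetical training data: texts labelled with CEFR levels
# (e.g. taken from graded textbooks).
texts = ["I like my dog. It is big.",
         "The committee postponed its decision pending further analysis."]
levels = ["A1", "C1"]

model = LogisticRegression()
model.fit([surface_features(t) for t in texts], levels)
print(model.predict([surface_features("She reads books every day.")]))
```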
The aim is to infer patterns from error-annotated corpora that enable reliable detection and correction of various errors in written text, including errors not previously seen, generalising beyond simple word n-grams.
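For comparison, the kind of word-level baseline the project aims to generalise beyond might simply count substitution patterns in aligned original/corrected sentence pairs. The sketch below does this over invented data with a naive positional alignment; real error-annotated corpora provide explicit edit spans and error types.

```python
from collections import Counter

# Toy error-annotated data: (original, corrected) sentence pairs.
pairs = [
    ("he go to school", "he goes to school"),
    ("she go to work", "she goes to work"),
]

def word_edits(orig, corr):
    # Naive one-to-one alignment by position; real systems align edits properly.
    for o, c in zip(orig.split(), corr.split()):
        if o != c:
            yield (o, c)

counts = Counter(e for o, c in pairs for e in word_edits(o, c))
print(counts.most_common())   # [(('go', 'goes'), 2)]
```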
One way of discovering latent patterns would be to train a tree substitution grammar (TSG) over syntactic trees or grammatical relations, as described in Swanson (2013); more details on training TSGs can be found in Cohn & Blunsom (2010).
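To illustrate the substitution operation itself (not the induction procedure from the cited papers), the sketch below uses NLTK's Tree class with hand-written fragments: an elementary tree with a frontier nonterminal NP is expanded by substituting another fragment at that site.

```python
from nltk import Tree

# Elementary trees: fragments whose frontier nonterminals (bare leaves such
# as "NP" below) are substitution sites.
fragment = Tree.fromstring("(S (NP (PRP he)) (VP (VBZ goes) (PP (TO to) NP)))")
np_fragment = Tree.fromstring("(NP (NN school))")

def substitute(tree, site_label, replacement):
    # Replace the first frontier nonterminal matching site_label with the
    # replacement fragment (depth-first).
    for i, child in enumerate(tree):
        if isinstance(child, str):
            if child == site_label:
                tree[i] = replacement
                return True
        elif substitute(child, site_label, replacement):
            return True
    return False

substitute(fragment, "NP", np_fragment)
print(fragment)  # (S (NP (PRP he)) (VP (VBZ goes) (PP (TO to) (NP (NN school)))))
```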
Possible extensions include the use of native corpora (containing unannotated correct text) to reinforce or complement the knowledge extracted from the error-corrected data, as well as the use of graph kernels.
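As one illustration of how a native corpus might be used, a language model trained on correct text can rank candidate corrections; the sketch below stands in for such a model with raw bigram counts over a toy corpus.

```python
from collections import Counter

# Toy "native" corpus of correct sentences; in practice this would be a large
# unannotated corpus of well-formed text.
native = ["she goes to school", "he goes to work", "they go to school"]

bigrams = Counter(b for s in native
                  for b in zip(["<s>"] + s.split(), s.split() + ["</s>"]))

def score(sentence):
    # Sum of raw bigram counts: a crude stand-in for a smoothed language model.
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(bigrams[b] for b in zip(tokens, tokens[1:]))

# Prefer the candidate correction that the native corpus supports best.
print(max(["he go to school", "he goes to school"], key=score))
```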