ACS Project Suggestions

Email me to meet and discuss before submitting your preferences!

(1) Modeling Language Acquisition and Language Change

Inducing grammars from ancient texts

Proposer: Weiwei Sun

Supervisors: Weiwei Sun and Theresa Biberauer

Description: To analyze syntactic structures, linguists rely on native speakers’ grammaticality judgements (intuitive judgements of the well-formedness of utterances), which reflect speakers’ linguistic competence. Clearly, this approach is not fully applicable to ancient languages: no matter how expert a linguist is, they cannot be a native speaker of a dead language. We propose to employ data-driven computational approaches, which, in our opinion, have strengths complementary to those of theoretical linguistic approaches. In this project, the student will explore computational models that process large amounts of historical data to derive quantitative evidence for linguistic analysis. We hope the work can further help characterise language variation in a precise way.

(2) Algorithms in Surface Realisation / Meaning-to-Text Generation

Neural graph-to-string parsing

Proposer/Supervisor: Weiwei Sun

Description: A fundamental problem in modeling language generation is parsing meaning representations, i.e. computing all possible analyses of a given meaning representation (MR) according to a competence grammar. In the framework of graph-based meaning representations, the problem has been partially solved: all possible analyses of a semantic graph can be enumerated in reasonable time. However, it is still unknown how to equip such a parsing algorithm with a neural module to build a high-performance natural language generation system. In this project, the student will explore recursive graph neural network models that select a preferred analysis from the results of symbolic graph parsing, on which a natural language generation system can then be built.

Surface realisation from ill-formed meaning representations

... you can get away with incomplete semantics when you are doing parsing, but when you're doing generation, you have to specify everything in semantics. And we don't know how to do that. At least we don't know how to do that completely or properly.  —  Mark Steedman

Proposer/Supervisor: Weiwei Sun

Description: Recent research shows that mapping from compositional meaning representations, such as English Resource Semantics, to strings (aka meaning-to-text generation) can be quite accurate. These promising results, however, rely on gold-standard meaning representations, which NLP application systems cannot provide. It is still unknown how current algorithms perform on ill-formed meaning representations. If they cannot handle such input well, how can we make them better? In this project, the student will empirically study the computational problems in mapping ill-formed meaning representations to surface strings.

(3) Educational Dialogue System

Supporting language learners in understanding complex sentences in English texts

Proposer: Weiwei Sun

Supervisors: Weiwei Sun, Andrew Caines and Paula Buttery

Description: Long multi-agent, multi-clause sentences can prove challenging for learners of English to understand. Many such sentences occur in professional literature, teaching textbooks and language exams, e.g. IELTS, SAT and GRE. Modern parsers trained on large-scale data can produce high-quality syntactic and semantic analyses of English sentences, even those with very complicated structures. However, how to automatically interpret linguistic representations for pedagogical purposes is still understudied. In this project, the student will work on models that transduce syntactic constituency parses of long sentences into pedagogical analyses that can be presented to learners to support sentence interpretation.

Resources: corpus of 1000 English textbook sentences annotated with templates for pedagogical explanation of syntactic structures (already built).
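
To give a feel for the input side of this project, here is a minimal sketch of reading a constituency parse and listing its clauses with their nesting depth, which is one very simple kind of "analysis" a learner could be shown. It is only an illustration under stated assumptions, not the method the project prescribes: the Penn-Treebank-style bracketed input format, the choice of S and SBAR as clause labels, and the function names parse_bracketed, words and clauses are all hypothetical choices made for this example.

```python
from typing import List, Tuple

# Assumption: clause-level constituents carry Penn-Treebank-style labels.
CLAUSE_LABELS = {"S", "SBAR"}


def parse_bracketed(text: str):
    """Parse a bracketed parse string into nested (label, children) tuples.

    Leaves (word tokens) are stored as plain strings in the children list.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing ')'
        return (label, children)

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing material after the root constituent")
    return tree


def words(node) -> List[str]:
    """Return the word tokens covered by a node, in sentence order."""
    if isinstance(node, str):
        return [node]
    collected = []
    for child in node[1]:
        collected.extend(words(child))
    return collected


def clauses(node, depth: int = 0) -> List[Tuple[int, str, str]]:
    """List (clause depth, label, text) for every clause node, outermost first."""
    if isinstance(node, str):
        return []
    label, children = node
    is_clause = label in CLAUSE_LABELS
    found = []
    if is_clause:
        found.append((depth, label, " ".join(words(node))))
    for child in children:
        # Depth counts enclosing clauses only, not every tree level.
        found.extend(clauses(child, depth + 1 if is_clause else depth))
    return found


example = ("(S (NP (DT The) (NN student)) (VP (VBD said) "
           "(SBAR (IN that) (S (NP (PRP she)) (VP (VBD left))))))")
for depth, label, text in clauses(parse_bracketed(example)):
    print("  " * depth + label + ": " + text)
```

For the example sentence this lists three clauses: the whole sentence at depth 0, the subordinate clause introduced by "that" at depth 1, and the embedded sentence inside it at depth 2. A real system would need to map such structures onto the explanation templates in the annotated corpus, which this sketch does not attempt.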