MSc Thesis - VU - Contextualized Lexical Simplification for Accessibility of Dutch Texts - Eliza Hobo

MSc Thesis by Eliza Hobo

Many Europeans experience difficulties with reading. This can prevent them from taking in important information that is meant for them. For governments and institutions, it is thus important to write their texts in a way that makes them as accessible as possible. The words used in these texts play an essential role in their perceived complexity. Automatically replacing complex words with simpler alternatives is called lexical simplification and can help increase the accessibility of texts.

Previous work on lexical simplification has relied mostly on lexical resources. The main limitations of such approaches are 1) that they cannot generate simplifications for all complex words, and 2) that the simplifications are based solely on the complex word, not on the sentence around it. This can result in unsuitable simplifications. Therefore, a model for contextualized lexical simplification, LSBert, was introduced by Qiang et al. (2020). It does not rely on lexical resources and generates simplifications based on the context, using the contextual language model BERT to generate substitutions for the complex word. This approach achieves state-of-the-art results on common benchmarking datasets.
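The candidate generation step described above can be illustrated with a minimal sketch. It is not the thesis code: the function names, the `MaskedPredictor` interface and the toy scores are all hypothetical, and pairing the original sentence with its masked copy is an assumption about how the input is built. In a real system the predictor would wrap a masked language model such as BERT; here a stand-in with made-up scores keeps the example runnable offline.

```python
from typing import Callable, Dict, List

# Hypothetical interface: maps a text containing one [MASK] token to
# {candidate word: probability}. A real system would wrap a masked language model.
MaskedPredictor = Callable[[str], Dict[str, float]]

MASK = "[MASK]"


def generate_candidates(sentence: str, complex_word: str,
                        predict_masked: MaskedPredictor,
                        top_k: int = 5) -> List[str]:
    """Return up to top_k substitution candidates for complex_word."""
    words = sentence.split()
    if complex_word not in words:
        raise ValueError(f"{complex_word!r} does not occur in the sentence")
    # Mask the first occurrence of the complex word.
    idx = words.index(complex_word)
    masked = " ".join(words[:idx] + [MASK] + words[idx + 1:])
    # Assumption: the original sentence is paired with the masked one so the
    # predictor still has access to the meaning of the complex word.
    scores = predict_masked(f"{sentence} [SEP] {masked}")
    # Rank candidates by score, highest first.
    ranked = sorted(scores, key=scores.get, reverse=True)
    # The complex word itself is not a useful substitution, so drop it.
    return [w for w in ranked if w.lower() != complex_word.lower()][:top_k]


def toy_predictor(text: str) -> Dict[str, float]:
    """Stand-in with made-up scores, used only so the sketch runs offline."""
    return {"purchase": 0.40, "buy": 0.35, "get": 0.15, "sell": 0.10}


if __name__ == "__main__":
    print(generate_candidates("The city will purchase new buses",
                              "purchase", toy_predictor, top_k=2))
```

Because the predictor is passed in as a plain callable, the same function could be reused with an English or a Dutch model without changes. Ranking candidates by simplicity would be a separate step after this one.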

In this work, experiments are performed to improve this method by fine-tuning the model toward simple language generation, so that it produces simplifications rather than mere substitutions. Moreover, the possibility of fine-tuning the model to adapt it to the domain of Dutch municipal texts is explored. To do so, a Dutch variant of LSBert, called LSBertje, is first developed. The model is then fine-tuned to produce domain-specific simplifications. The findings of this work underline the adaptability of these contextual language models: exposure to simple language is seen to yield simpler simplification candidates. Domain-specific simplifications were not achieved, but the findings suggest that fine-tuning for domain-specific simplifications is a feasible direction for future work.

This research was conducted by Eliza Hobo in collaboration with the AI Team, Urban Innovation and R&D, City of Amsterdam.

Involved civil servants: Iva Gornishka & Cláudia Pinhão.

Supervisors: Iva Gornishka & Lisa Beinborn
