Paper by Eliza Hobo, Charlotte Pouw, Lisa Beinborn

An inclusive society needs to facilitate access to information for all of its members, including citizens with low literacy and with non-native language skills. We present an approach to assess Dutch text complexity at the sentence level and conduct an interpretability analysis to explore the link between neural models and linguistic complexity features. Building on these findings, we develop the first contextual lexical simplification model for Dutch and publish a pilot dataset for evaluation. We go beyond previous work, which primarily targeted lexical substitution, and propose strategies for adjusting the model’s linguistic register to generate simpler candidates. Our results indicate that continual pre-training and multi-task learning with conceptually related tasks are promising directions for ensuring the simplicity of the generated substitutions.

Eliza Hobo, Charlotte Pouw, and Lisa Beinborn. 2023. “Geen makkie”: Interpretable Classification and Simplification of Dutch Text Complexity. In Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023), pages 503–517, Toronto, Canada. Association for Computational Linguistics.
