Hola! I'm Cristina and I'm a young research engineer at the University of Poitiers and a PhD student at the University of Corsica, under the supervision of Stella Retali-Medori (University of Corsica) and co-supervision of Marianne Vergez-Couret (University of Poitiers).
My thesis focuses on low-resource languages and linguistic variation in Natural Language Processing, in particular for Corsican and Poitevin-Saintongeais, two regional languages of France. It is part of the ANR DIVITAL project, which aims to provide linguistic resources and language technology support for several regional languages of France, including Alsatian and Occitan.
Earlier, I worked at Lattice (Paris), where I focused on syntactic parsing and lexicons for Old French within the Profiterole project. I joined the project during a master's internship at the ATILF laboratory (Nancy) under the supervision of Mathieu Constant (University of Lorraine) and Alexey Lavrentiev (ENS Lyon). During that time, my work focused on the challenges of automatic lemmatization for a historical, non-standardized language, as described in Holgado et al. (TALN 2021).
My research interests center on low-resource and nonstandard languages, from both historical and contemporary perspectives. I am also interested in other areas, such as automatic text simplification (the main subject of my master's thesis), misinformation, and political polarization.
- Empowering Low-Resource Regional Languages with Lexicons: A Comparative Study of NLP Tools for Morphosyntactic Analysis (Holgado et al., 2024)
- More than just data: Dialectal variation and NLP resources for Corsican and Poitevin-Saintongeais (Holgado, 2023)
- Can LLMs be used to understand clinical notes better? (Sinha et al., 2023)
- Are Machine Learning Algorithms better for Author Profiling? (Holgado et al., IberLEF 2022)
- IAI @ SocialDisNER: Catch me if you can! Capturing complex disease mentions in tweets (Sinha et al., SMM4H 2022)
- Évaluation de méthodes et d’outils pour la lemmatisation automatique du français médiéval (Evaluation of methods and tools for automatic lemmatization in Old French) (Holgado et al., TALN 2021)
2023/24 - Corpus linguistics, Pragmatics, Tools and data for linguists
2022/23 - Lexicology, Contact linguistics, Tools and data for linguists