Probing structural constraints of negation in Pretrained Language Models

Bibliographic Details
Main Authors: Kletz, David, Candito, Marie, Amsili, Pascal
Other Authors: Lattice - Langues, Textes, Traitements informatiques, Cognition - UMR 8094 (Lattice), Université Sorbonne Nouvelle - Paris 3-Université Sorbonne Paris Cité (USPC)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Sciences et Lettres (PSL)-Département Littératures et langage - ENS Paris (LILA), École normale supérieure - Paris (ENS-PSL), Université Paris Sciences et Lettres (PSL)-Université Paris Sciences et Lettres (PSL)-École normale supérieure - Paris (ENS-PSL), Université Paris Sciences et Lettres (PSL), Université Sorbonne Nouvelle - Paris 3, Laboratoire de Linguistique Formelle (LLF - UMR7110), Centre National de la Recherche Scientifique (CNRS)-Université Paris Cité (UPCité), Université Paris Cité (UPCité), Northern European Association for Language Technology (NEALT), Tanel Alumäe, Mark Fishel, ANR-10-LABX-0083,EFL,Empirical Foundations of Linguistics : data, methods, models(2010)
Format: Conference Object
Language: English
Published: HAL CCSD 2023
Subjects:
NLP
Online Access: https://hal.science/hal-04641828
https://hal.science/hal-04641828/document
https://hal.science/hal-04641828/file/CR_Probing_structural_constraints_of_negation_in_PLMs_NoDaLiDA_long.pdf
Description
Summary: Contradictory results about the encoding of the semantic impact of negation in pretrained language models (PLMs) have been reported recently (e.g. Kassner and Schütze (2020); Gubelmann and Handschuh (2022)). In this paper we focus rather on the way PLMs encode negation and its formal impact, through the phenomenon of Negative Polarity Item (NPI) licensing in English. More precisely, we use probes to identify which contextual representations best encode 1) the presence of negation in a sentence, and 2) the polarity of a neighboring masked polarity item. We find that contextual representations of tokens inside the negation scope do allow for (i) a better prediction of the presence of "not" compared to those outside the scope and (ii) a better prediction of the right polarity of a masked polarity item licensed by "not", although the magnitude of the difference varies from PLM to PLM. Importantly, in both cases the trend holds even when controlling for distance to "not". This tends to indicate that the embeddings of these models do reflect the notion of negation scope, and do encode the impact of negation on NPI licensing. Yet further control experiments reveal that the presence of other lexical items is also better captured when using the contextual representation of a token within the same syntactic clause than that of a token outside it, suggesting that PLMs simply capture the more general notion of syntactic clause.