Commit

Update details for recently updated papers (#247)
* Update 2023-09-19-2309.10954.md

Venue updated

* Update 2022-11-29-2211.16031.md

Venue updated

* Rename 2022-11-29-2211.16031.md to 2023-05-22-2211.16031.md

change year

* Update and rename 2022-12-18-retriever-lm-reasoning.md to 2023-05-07-retriever-lm-reasoning.md

Venue updated and year changed

* Update 2023-02-02-2302.00871.md

Venue updated

* Update 2023-10-18-MAGNIFICo.md

Thumbnail path added

* Add Magnifico image
arkilpatel authored Oct 19, 2023
1 parent 17cf9ca commit a218b1d
Showing 6 changed files with 178 additions and 10 deletions.
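The edits in this commit follow one pattern: swap the `venue:` field in a post's YAML front matter, mirror the change in the `tags:` list, and rename the post file to its new publication date. A minimal sketch of that workflow is below; the helper names are hypothetical (not part of this repository), and the assumption that the site repeats the venue as a tag is inferred from the diffs that follow.

```python
import re

def update_venue(text: str, new_venue: str) -> str:
    # Replace the `venue:` field in the YAML front matter and mirror
    # the change into the `tags:` list, which repeats the venue.
    match = re.search(r"^venue:\s*(.+)$", text, flags=re.M)
    if match is None:
        return text  # no venue field; leave the post untouched
    old_venue = match.group(1).strip()
    text = re.sub(r"^venue:\s*.+$", f"venue: {new_venue}", text,
                  count=1, flags=re.M)
    return text.replace(f"- {old_venue}\n", f"- {new_venue}\n")

def redated_filename(name: str, new_date: str) -> str:
    # Posts are named YYYY-MM-DD-<slug>.md; swap in the new date.
    slug = re.sub(r"^\d{4}-\d{2}-\d{2}-", "", name)
    return f"{new_date}-{slug}"
```

For example, `redated_filename("2022-11-29-2211.16031.md", "2023-05-22")` reproduces the rename in this commit, and `update_venue` applied to that post's text reproduces its venue edit.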
4 changes: 2 additions & 2 deletions _posts/papers/2023-02-02-2302.00871.md
@@ -1,10 +1,10 @@
---
title: Using In-Context Learning to Improve Dialogue Safety
-venue: ArXiv
+venue: EMNLP Findings
names: Nicholas Meade, Spandana Gella, Devamanyu Hazarika, Prakhar Gupta, Di Jin,
Siva Reddy, Yang Liu, Dilek Z. Hakkani-Tür
tags:
-- ArXiv
+- EMNLP Findings
link: https://arxiv.org/abs/2302.00871
author: Nicholas Meade
categories: Publications
@@ -3,10 +3,10 @@ title: Can Retriever-Augmented Language Models Reason? The Blame Game Between the
  Retriever and the Language Model
author: Parishad BehnamGhader
names: Parishad BehnamGhader, Santiago Miret, Siva Reddy
-venue: ArXiv
+venue: EMNLP Findings
link: https://arxiv.org/abs/2212.09146
tags:
-- ArXiv
+- EMNLP Findings
code: https://github.com/McGill-NLP/retriever-lm-reasoning
thumbnail: /assets/images/papers/retriever-lm-reasoning.jpg
categories: Publications
@@ -21,4 +21,4 @@ categories: Publications

## Abstract

The emergence of large pretrained models has enabled language models to achieve superior performance in common NLP tasks, including language modeling and question answering, compared to previous static word representation methods. Augmenting these models with a retriever that retrieves related text and documents as supporting information has shown promise for solving NLP problems in a more interpretable way, since the additional knowledge is injected explicitly rather than captured in the models' parameters. In spite of this recent progress, our analysis of retriever-augmented language models shows that this class of language models still lacks reasoning over the retrieved documents. In this paper, we study the strengths and weaknesses of different retriever-augmented language models, such as REALM, kNN-LM, FiD, ATLAS, and Flan-T5, in reasoning over the selected documents in different tasks. In particular, we analyze the reasoning failures of each of these models and study how these failures are rooted in the retriever module as well as the language model.
@@ -1,9 +1,9 @@
---
title: Syntactic Substitutability as Unsupervised Dependency Syntax
-venue: ArXiv
+venue: EMNLP
names: Jasper Jian, Siva Reddy
tags:
-- ArXiv
+- EMNLP
link: https://arxiv.org/abs/2211.16031
author: Jasper Jian
categories: Publications
4 changes: 2 additions & 2 deletions _posts/papers/2023-09-19-2309.10954.md
@@ -1,6 +1,6 @@
---
title: In-Context Learning for Text Classification with Many Labels
-venue: arXiv.org
+venue: GenBench workshop @ EMNLP
names: Aristides Milios, Siva Reddy, Dzmitry Bahdanau
tags:
-- arXiv.org
@@ -18,4 +18,4 @@ categories: Publications

## Abstract

In-context learning (ICL) using large language models for tasks with many labels is challenging due to the limited context window, which makes it difficult to fit a sufficient number of examples in the prompt. In this paper, we use a pre-trained dense retrieval model to bypass this limitation, giving the model only a partial view of the full label space for each inference call. Testing with recent open-source LLMs (OPT, LLaMA), we set new state-of-the-art performance in few-shot settings for three common intent classification datasets, with no finetuning. We also surpass fine-tuned performance on fine-grained sentiment classification in certain cases. We analyze the performance across the number of in-context examples and different model scales, showing that larger models are necessary to effectively and consistently make use of larger context lengths for ICL. By running several ablations, we analyze the model's use of: a) the similarity of the in-context examples to the current input, b) the semantic content of the class names, and c) the correct correspondence between examples and labels. We demonstrate that all three are needed to varying degrees depending on the domain, contrary to certain recent works.
3 changes: 2 additions & 1 deletion _posts/papers/2023-10-18-MAGNIFICo.md
@@ -12,6 +12,7 @@ tags:
- Evaluation
code: https://github.com/McGill-NLP/MAGNIFICo
webpage: https://mcgill-nlp.github.io/MAGNIFICo
+thumbnail: /assets/images/papers/magnifico.svg
categories: Publications

---
@@ -24,4 +25,4 @@ categories: Publications

## Abstract

Humans possess a remarkable ability to assign novel interpretations to linguistic expressions, enabling them to learn new words and understand community-specific connotations. However, Large Language Models (LLMs) have a knowledge cutoff and are costly to finetune repeatedly. Therefore, it is crucial for LLMs to learn novel interpretations in-context. In this paper, we systematically analyse the ability of LLMs to acquire novel interpretations using in-context learning. To facilitate our study, we introduce MAGNIFICo, an evaluation suite implemented within a text-to-SQL semantic parsing framework that incorporates diverse tokens and prompt settings to simulate real-world complexity. Experimental results on MAGNIFICo demonstrate that LLMs exhibit a surprisingly robust capacity for comprehending novel interpretations from natural language descriptions as well as from discussions within long conversations. Nevertheless, our findings also highlight the need for further improvements, particularly when interpreting unfamiliar words or when composing multiple novel interpretations simultaneously in the same example. Additionally, our analysis uncovers the semantic predispositions in LLMs and reveals the impact of recency bias for information presented in long contexts.
167 changes: 167 additions & 0 deletions assets/images/papers/magnifico.svg
