From a218b1df0d1a8418d70a9dac94279dbe23d7090d Mon Sep 17 00:00:00 2001
From: Arkil Patel <39628727+arkilpatel@users.noreply.github.com>
Date: Thu, 19 Oct 2023 11:25:17 -0700
Subject: [PATCH] Update details for recently updated papers (#247)

* Update 2023-09-19-2309.10954.md

Venue updated

* Update 2022-11-29-2211.16031.md

Venue updated

* Rename 2022-11-29-2211.16031.md to 2023-05-22-2211.16031.md

change year

* Update and rename 2022-12-18-retriever-lm-reasoning.md to 2023-05-07-retriever-lm-reasoning.md

Venue updated and year changed

* Update 2023-02-02-2302.00871.md

Venue updated

* Update 2023-10-18-MAGNIFICo.md

Thumbnail path added

* Add Magnifico image
---
 _posts/papers/2023-02-02-2302.00871.md        |   4 +-
 ...d => 2023-05-07-retriever-lm-reasoning.md} |   6 +-
 ...2211.16031.md => 2023-05-22-2211.16031.md} |   4 +-
 _posts/papers/2023-09-19-2309.10954.md        |   4 +-
 _posts/papers/2023-10-18-MAGNIFICo.md         |   3 +-
 assets/images/papers/magnifico.svg            | 167 ++++++++++++++++++
 6 files changed, 178 insertions(+), 10 deletions(-)
 rename _posts/papers/{2022-12-18-retriever-lm-reasoning.md => 2023-05-07-retriever-lm-reasoning.md} (96%)
 rename _posts/papers/{2022-11-29-2211.16031.md => 2023-05-22-2211.16031.md} (98%)
 create mode 100644 assets/images/papers/magnifico.svg

diff --git a/_posts/papers/2023-02-02-2302.00871.md b/_posts/papers/2023-02-02-2302.00871.md
index e0e017d7..c2037482 100644
--- a/_posts/papers/2023-02-02-2302.00871.md
+++ b/_posts/papers/2023-02-02-2302.00871.md
@@ -1,10 +1,10 @@
 ---
 title: Using In-Context Learning to Improve Dialogue Safety
-venue: ArXiv
+venue: EMNLP Findings
 names: Nicholas Meade, Spandana Gella, Devamanyu Hazarika, Prakhar Gupta, Di Jin, Siva Reddy, Yang Liu, Dilek Z. Hakkani-Tür
 tags:
-- ArXiv
+- EMNLP Findings
 link: https://arxiv.org/abs/2302.00871
 author: Nicholas Meade
 categories: Publications
diff --git a/_posts/papers/2022-12-18-retriever-lm-reasoning.md b/_posts/papers/2023-05-07-retriever-lm-reasoning.md
similarity index 96%
rename from _posts/papers/2022-12-18-retriever-lm-reasoning.md
rename to _posts/papers/2023-05-07-retriever-lm-reasoning.md
index 3f3bba11..79f50ec3 100644
--- a/_posts/papers/2022-12-18-retriever-lm-reasoning.md
+++ b/_posts/papers/2023-05-07-retriever-lm-reasoning.md
@@ -3,10 +3,10 @@ title: Can Retriever-Augmented Language Models Reason? The Blame Game Between the
   Retriever and the Language Model
 author: Parishad BehnamGhader
 names: Parishad BehnamGhader, Santiago Miret, Siva Reddy
-venue: ArXiv
+venue: EMNLP Findings
 link: https://arxiv.org/abs/2212.09146
 tags:
-- ArXiv
+- EMNLP Findings
 code: https://github.com/McGill-NLP/retriever-lm-reasoning
 thumbnail: /assets/images/papers/retriever-lm-reasoning.jpg
 categories: Publications
@@ -21,4 +21,4 @@ categories: Publications
 
 ## Abstract
 
-The emergence of large pretrained models has enabled language models to achieve superior performance in common NLP tasks, including language modeling and question answering, compared to previous static word representation methods. Augmenting these models with a retriever to retrieve the related text and documents as supporting information has shown promise in effectively solving NLP problems in a more interpretable way given that the additional knowledge is injected explicitly rather than being captured in the models' parameters. In spite of the recent progress, our analysis on retriever-augmented language models shows that this class of language models still lack reasoning over the retrieved documents. In this paper, we study the strengths and weaknesses of different retriever-augmented language models such as REALM, kNN-LM, FiD, ATLAS, and Flan-T5 in reasoning over the selected documents in different tasks. In particular, we analyze the reasoning failures of each of these models and study how the models' failures in reasoning are rooted in the retriever module as well as the language model.
\ No newline at end of file
+The emergence of large pretrained models has enabled language models to achieve superior performance in common NLP tasks, including language modeling and question answering, compared to previous static word representation methods. Augmenting these models with a retriever to retrieve the related text and documents as supporting information has shown promise in effectively solving NLP problems in a more interpretable way given that the additional knowledge is injected explicitly rather than being captured in the models' parameters. In spite of the recent progress, our analysis on retriever-augmented language models shows that this class of language models still lack reasoning over the retrieved documents. In this paper, we study the strengths and weaknesses of different retriever-augmented language models such as REALM, kNN-LM, FiD, ATLAS, and Flan-T5 in reasoning over the selected documents in different tasks. In particular, we analyze the reasoning failures of each of these models and study how the models' failures in reasoning are rooted in the retriever module as well as the language model.
diff --git a/_posts/papers/2022-11-29-2211.16031.md b/_posts/papers/2023-05-22-2211.16031.md
similarity index 98%
rename from _posts/papers/2022-11-29-2211.16031.md
rename to _posts/papers/2023-05-22-2211.16031.md
index 0e9a4d7a..62c5cab0 100644
--- a/_posts/papers/2022-11-29-2211.16031.md
+++ b/_posts/papers/2023-05-22-2211.16031.md
@@ -1,9 +1,9 @@
 ---
 title: Syntactic Substitutability as Unsupervised Dependency Syntax
-venue: ArXiv
+venue: EMNLP
 names: Jasper Jian, Siva Reddy
 tags:
-- ArXiv
+- EMNLP
 link: https://arxiv.org/abs/2211.16031
 author: Jasper Jian
 categories: Publications
diff --git a/_posts/papers/2023-09-19-2309.10954.md b/_posts/papers/2023-09-19-2309.10954.md
index 9e1fc4c1..c2398bde 100644
--- a/_posts/papers/2023-09-19-2309.10954.md
+++ b/_posts/papers/2023-09-19-2309.10954.md
@@ -1,6 +1,6 @@
 ---
 title: In-Context Learning for Text Classification with Many Labels
-venue: arXiv.org
+venue: GenBench workshop @ EMNLP
 names: Aristides Milios, Siva Reddy, Dzmitry Bahdanau
 tags:
 - arXiv.org
@@ -18,4 +18,4 @@ categories: Publications
 
 ## Abstract
 
-In-context learning (ICL) using large language models for tasks with many labels is challenging due to the limited context window, which makes it difficult to fit a sufficient number of examples in the prompt. In this paper, we use a pre-trained dense retrieval model to bypass this limitation, giving the model only a partial view of the full label space for each inference call. Testing with recent open-source LLMs (OPT, LLaMA), we set new state of the art performance in few-shot settings for three common intent classification datasets, with no finetuning. We also surpass fine-tuned performance on fine-grained sentiment classification in certain cases. We analyze the performance across number of in-context examples and different model scales, showing that larger models are necessary to effectively and consistently make use of larger context lengths for ICL. By running several ablations, we analyze the model's use of: a) the similarity of the in-context examples to the current input, b) the semantic content of the class names, and c) the correct correspondence between examples and labels. We demonstrate that all three are needed to varying degrees depending on the domain, contrary to certain recent works.
\ No newline at end of file
+In-context learning (ICL) using large language models for tasks with many labels is challenging due to the limited context window, which makes it difficult to fit a sufficient number of examples in the prompt. In this paper, we use a pre-trained dense retrieval model to bypass this limitation, giving the model only a partial view of the full label space for each inference call. Testing with recent open-source LLMs (OPT, LLaMA), we set new state of the art performance in few-shot settings for three common intent classification datasets, with no finetuning. We also surpass fine-tuned performance on fine-grained sentiment classification in certain cases. We analyze the performance across number of in-context examples and different model scales, showing that larger models are necessary to effectively and consistently make use of larger context lengths for ICL. By running several ablations, we analyze the model's use of: a) the similarity of the in-context examples to the current input, b) the semantic content of the class names, and c) the correct correspondence between examples and labels. We demonstrate that all three are needed to varying degrees depending on the domain, contrary to certain recent works.
diff --git a/_posts/papers/2023-10-18-MAGNIFICo.md b/_posts/papers/2023-10-18-MAGNIFICo.md
index 59b80bfa..c549bd20 100644
--- a/_posts/papers/2023-10-18-MAGNIFICo.md
+++ b/_posts/papers/2023-10-18-MAGNIFICo.md
@@ -12,6 +12,7 @@ tags:
 - Evaluation
 code: https://github.com/McGill-NLP/MAGNIFICo
 webpage: https://mcgill-nlp.github.io/MAGNIFICo
+thumbnail: /assets/images/papers/magnifico.svg
 categories: Publications
 ---
 
@@ -24,4 +25,4 @@
 
 ## Abstract
 
-Humans possess a remarkable ability to assign novel interpretations to linguistic expressions, enabling them to learn new words and understand community-specific connotations. However, Large Language Models (LLMs) have a knowledge cutoff and are costly to finetune repeatedly. Therefore, it is crucial for LLMs to learn novel interpretations in-context. In this paper, we systematically analyse the ability of LLMs to acquire novel interpretations using in-context learning. To facilitate our study, we introduce MAGNIFICo, an evaluation suite implemented within a text-to-SQL semantic parsing framework that incorporates diverse tokens and prompt settings to simulate real-world complexity. Experimental results on MAGNIFICo demonstrate that LLMs exhibit a surprisingly robust capacity for comprehending novel interpretations from natural language descriptions as well as from discussions within long conversations. Nevertheless, our findings also highlight the need for further improvements, particularly when interpreting unfamiliar words or when composing multiple novel interpretations simultaneously in the same example. Additionally, our analysis uncovers the semantic predispositions in LLMs and reveals the impact of recency bias for information presented in long contexts.
\ No newline at end of file
+Humans possess a remarkable ability to assign novel interpretations to linguistic expressions, enabling them to learn new words and understand community-specific connotations. However, Large Language Models (LLMs) have a knowledge cutoff and are costly to finetune repeatedly. Therefore, it is crucial for LLMs to learn novel interpretations in-context. In this paper, we systematically analyse the ability of LLMs to acquire novel interpretations using in-context learning. To facilitate our study, we introduce MAGNIFICo, an evaluation suite implemented within a text-to-SQL semantic parsing framework that incorporates diverse tokens and prompt settings to simulate real-world complexity. Experimental results on MAGNIFICo demonstrate that LLMs exhibit a surprisingly robust capacity for comprehending novel interpretations from natural language descriptions as well as from discussions within long conversations. Nevertheless, our findings also highlight the need for further improvements, particularly when interpreting unfamiliar words or when composing multiple novel interpretations simultaneously in the same example. Additionally, our analysis uncovers the semantic predispositions in LLMs and reveals the impact of recency bias for information presented in long contexts.
diff --git a/assets/images/papers/magnifico.svg b/assets/images/papers/magnifico.svg
new file mode 100644
index 00000000..ece8d253
--- /dev/null
+++ b/assets/images/papers/magnifico.svg
@@ -0,0 +1,167 @@
+[167 lines of added SVG markup for the MAGNIFICo thumbnail image; the vector content was stripped in this rendering and is not recoverable here]