Understanding Retrieval Augmentation for Long-Form Question Answering

This is the repository for the paper Understanding Retrieval Augmentation for Long-Form Question Answering.

Contents

  1. Requirements
  2. Collected Data
  3. Reproduction
  4. Citation

Requirements

Our code requires PyTorch (torch), HuggingFace Transformers (transformers), and the OpenAI API package (openai). Most of our experiments were run with torch==2.0.1, transformers==4.30.1, and openai==0.27.8 on Python 3.10.6.
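
A matching environment can be installed with pip, for example (adjust the torch build to your CUDA setup as needed):

```bash
pip install torch==2.0.1 transformers==4.30.1 openai==0.27.8
```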

Collected Data

The data folder contains:

  • questions with corresponding human and model answers,
  • evidence documents retrieved for each of the questions,
  • prompt templates used for creating the prompts that were passed to the models, and
  • human annotations of the attributability of each answer sentence to the corresponding evidence documents, for a subset of the questions and models.
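
The exact file layout is described inside the data folder itself. As a minimal loading sketch (the file name questions_with_answers.jsonl and the JSON-lines format are illustrative assumptions, not the repository's guaranteed layout):

```python
import json
from pathlib import Path

DATA_DIR = Path("data")

def load_jsonl(path):
    """Read a JSON-lines file into a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

# Hypothetical file name; substitute the actual files found in data/.
records = load_jsonl(DATA_DIR / "questions_with_answers.jsonl")
print(len(records), "records loaded")
```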

Reproduction

Prompting LMs

Details on how to reproduce our prompting of the LMs are in src/answer_generation. The setup can easily be reused with different questions, documents, prompts, and models; a sketch of the general pattern follows.
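
As an illustration only (the prompt template, question, and model name below are placeholders, not the repository's exact setup), a retrieval-augmented prompt can be sent through the legacy ChatCompletion interface of openai==0.27.8 like this:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # set your OpenAI key

# Hypothetical template; the actual templates are in the data folder.
template = (
    "Answer the question using the evidence documents.\n\n"
    "Documents:\n{documents}\n\nQuestion: {question}\n\nAnswer:"
)

prompt = template.format(
    documents="[1] ...retrieved passage...",
    question="Why is the sky blue?",
)

# openai==0.27.8 uses the pre-1.0 ChatCompletion API.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
    temperature=0.0,
)
print(response["choices"][0]["message"]["content"])
```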

Attribution Prediction

We benchmark several approaches to predicting whether each answer sentence is attributable to the evidence documents, using the collected data. Details can be found in src/Automatic/.
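
One common baseline for this kind of attribution prediction is NLI: treat the evidence as the premise and the answer sentence as the hypothesis, and check for entailment. A minimal sketch with transformers (the model choice and example texts are illustrative, not necessarily what the repository benchmarks):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Illustrative model choice; any MNLI-style entailment model works here.
model_name = "microsoft/deberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

evidence = "Air molecules scatter short (blue) wavelengths of sunlight."
answer_sentence = "The sky looks blue because sunlight is scattered by air."

# Premise = evidence document, hypothesis = answer sentence.
inputs = tokenizer(evidence, answer_sentence,
                   return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# For this model the labels are CONTRADICTION / NEUTRAL / ENTAILMENT.
print(model.config.id2label[logits.argmax(dim=-1).item()])
```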

Retrieving Bing Documents

Steps for retrieving Bing evidence documents can be found in src/bing_search.
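
For orientation, a raw call to the Bing Web Search v7 API might look like the sketch below (the helper function and query are illustrative; the repository's actual pipeline is defined in src/bing_search):

```python
import requests

# Bing Web Search API v7; the subscription key comes from Azure.
BING_ENDPOINT = "https://api.bing.microsoft.com/v7.0/search"

def bing_search(query, key, count=10):
    """Return the top web results for a query."""
    headers = {"Ocp-Apim-Subscription-Key": key}
    params = {"q": query, "count": count}
    resp = requests.get(BING_ENDPOINT, headers=headers, params=params)
    resp.raise_for_status()
    return resp.json().get("webPages", {}).get("value", [])

for hit in bing_search("why is the sky blue", key="YOUR_BING_KEY"):
    print(hit["name"], hit["url"])
```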

Citation

@article{chen2023understanding,
  title={Understanding Retrieval Augmentation for Long-Form Question Answering},
  author={Chen, Hung-Ting and Xu, Fangyuan and Arora, Shane A and Choi, Eunsol},
  journal={arXiv preprint arXiv:2310.12150},
  year={2023}
}