This is the repository for the paper *Understanding Retrieval Augmentation for Long-Form Question Answering*.
Our code requires PyTorch (`torch`), HuggingFace Transformers (`transformers`), and the OpenAI API package (`openai`). Most of our experiments were run with `torch==2.0.1`, `transformers==4.30.1`, and `openai==0.27.8` on Python 3.10.6.
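For instance, the pinned versions above can be installed with (assuming a working Python 3.10 environment):

```shell
# Install the dependency versions used in our experiments
pip install torch==2.0.1 transformers==4.30.1 openai==0.27.8
```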
The `data` folder contains:
- questions with corresponding human and model answers,
- evidence documents retrieved for each of the questions,
- prompt templates used for creating the prompts that were passed to the models, and
- human annotations of the attributability of each answer sentence to the corresponding evidence documents, for a subset of the questions and models.
Details on how to reproduce our prompting of the LMs are in `src/answer_generation`. Our setup is easily reusable with different questions, documents, prompts, and/or models.
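As a hedged illustration of that reusability, the sketch below fills a prompt template with a question and its retrieved evidence documents before sending it to a model. The template text and the `build_prompt` helper are hypothetical, not the repository's actual code; the real templates live in `data` and the driver scripts in `src/answer_generation`.

```python
# Hypothetical prompt template; the repository ships its own templates in data/.
TEMPLATE = (
    "Answer the question based on the documents below.\n\n"
    "Documents:\n{documents}\n\n"
    "Question: {question}\nAnswer:"
)

def build_prompt(question, documents, template=TEMPLATE):
    """Fill a prompt template with a question and its retrieved evidence documents."""
    doc_block = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(documents))
    return template.format(documents=doc_block, question=question)

prompt = build_prompt(
    "Why is the sky blue?",
    ["Rayleigh scattering affects shorter wavelengths more strongly."],
)

# With openai==0.27.8, the generation call would then look like this
# (commented out here because it requires an API key):
# import openai
# response = openai.ChatCompletion.create(
#     model="gpt-3.5-turbo",
#     messages=[{"role": "user", "content": prompt}],
# )
```

Swapping in different questions, evidence documents, or templates only changes the arguments to `build_prompt`; the model call stays the same.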
We benchmark several approaches for attributing answer sentences using the collected data. The details can be found in `src/Automatic/`.
Steps for retrieving Bing evidence documents can be found in `src/bing_search`.
If you find our work useful, please cite:

```bibtex
@article{chen2023understanding,
  title={Understanding Retrieval Augmentation for Long-Form Question Answering},
  author={Chen, Hung-Ting and Xu, Fangyuan and Arora, Shane A and Choi, Eunsol},
  journal={arXiv preprint arXiv:2310.12150},
  year={2023}
}
```