Detecting Edit Failures in LLMs: An Improved Specificity Benchmark (website)
This repository contains the code for the paper Detecting Edit Failures in LLMs: An Improved Specificity Benchmark (ACL Findings 2023).
It extends previous work on model editing by Meng et al. [1], introducing a new benchmark, CounterFact+, for measuring the specificity of model edits.
The repository is a fork of MEMIT, which implements the model editing algorithms MEMIT (Mass Editing Memory in a Transformer) and ROME (Rank-One Model Editing). Our fork extends this code with additional evaluation scripts implementing the CounterFact+ benchmark. For installation instructions, see the original repository.
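To make the notion of a specificity check concrete, the following is a minimal, purely illustrative sketch: it measures how much an edit changes a model's prediction on an unrelated ("neighborhood") prompt. This is not the CounterFact+ implementation from the paper; the model, prompt, and metric shown here are placeholders.

# Illustrative only: a generic specificity-style check that compares the
# probability a model assigns to an unrelated fact before and after an edit.
# NOT the CounterFact+ implementation from the paper; prompt, model, and
# metric are hypothetical placeholders.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def target_logprob(prompt: str, target: str) -> float:
    """Log-probability the model assigns to `target` as the continuation of `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    target_ids = tokenizer(" " + target, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, target_ids], dim=-1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Log-probs of the target tokens, predicted from the preceding positions.
    log_probs = torch.log_softmax(logits[0, prompt_ids.shape[1] - 1 : -1], dim=-1)
    return log_probs.gather(1, target_ids[0].unsqueeze(-1)).sum().item()

# A neighborhood prompt about a fact the edit should NOT affect:
before = target_logprob("The Eiffel Tower is located in the city of", "Paris")
# ... apply an editing method (e.g., ROME or MEMIT) to `model` here ...
after = target_logprob("The Eiffel Tower is located in the city of", "Paris")
print(f"log p(target) before: {before:.3f}, after: {after:.3f}")

A large drop between the two values would indicate that the edit has bled into unrelated facts, which is the kind of failure the CounterFact+ benchmark is designed to surface.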
We recommend conda for managing Python, CUDA, and PyTorch; pip is for everything else. To get started, simply install conda and run:

CONDA_HOME=$CONDA_HOME ./scripts/setup_conda.sh

$CONDA_HOME should be the path to your conda installation, e.g., ~/miniconda3.
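For example, with a Miniconda installation in the default location:

CONDA_HOME=~/miniconda3 ./scripts/setup_conda.sh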
See INSTRUCTIONS.md for how to run the experiments and evaluations.
If you find our paper useful, please consider citing it as:
@inproceedings{jason2023detecting,
title = {Detecting Edit Failures In Large Language Models: An Improved Specificity Benchmark},
author = {Hoelscher-Obermaier, Jason and Persson, Julia and Kran, Esben and Konstas, Ioannis and Barez, Fazl},
booktitle = {Findings of ACL},
year = {2023},
organization = {Association for Computational Linguistics}
}