Official implementation of ReAGent, as presented in our paper (ReLM workshop at AAAI 2024):

ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models

Cite us:

@inproceedings{zhao2024reagent,
  title     = "ReAGent: Towards A Model-agnostic Feature Attribution Method for Generative Language Models",
  author    = "Zhao, Zhixue and Shan, Boxuan",
  booktitle = "Proceedings of AAAI Workshop on Responsible Language Models",
  year      = "2024",
  url       = "https://arxiv.org/pdf/2402.00794.pdf"
}

Demo notebook

Run the demo with the Jupyter notebook src/notebook.ipynb or via the Colab demo.

You can try your own model and input to see how the importance is distributed over the input tokens.
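As an illustration, an importance distribution of this kind can be drawn with a few lines of matplotlib (a minimal sketch with made-up tokens and scores, not the notebook's actual code):

import matplotlib.pyplot as plt

# Made-up importance scores over the input tokens for one generated target token.
tokens = ["Super", "Mario", "Land", "is", "a", "game", "developed", "by"]
importance = [0.05, 0.30, 0.25, 0.02, 0.01, 0.15, 0.12, 0.10]

plt.figure(figsize=(8, 3))
plt.bar(range(len(tokens)), importance)
plt.xticks(range(len(tokens)), tokens, rotation=45, ha="right")
plt.ylabel("importance")
plt.title("Importance distribution for one target token")
plt.tight_layout()
plt.show()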

Environment Setup

We run our experiments with Python 3.10.

Making virtual environment

(Option 1) With conda

  1. Install Anaconda or Miniconda
  2. Create a conda environment: conda create -n env-reagent python=3.10
  3. Activate the environment: conda activate env-reagent
  4. Install the package dependencies: pip install -r requirements.txt

(Option 2) With venv

  1. Install Python 3.10
  2. Create a virtual environment: python3 -m venv .venv
  3. Activate the environment: source .venv/bin/activate
  4. Install the package dependencies: pip install -r requirements.txt

Initialize NLTK

Run the following command to download the required NLTK data

python setup_nltk.py
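For reference, this step just fetches NLTK data packages; a minimal equivalent, assuming the commonly used punkt and stopwords packages (check setup_nltk.py for the exact list it downloads), would be:

import nltk

# Download NLTK data packages; see setup_nltk.py for the authoritative list.
nltk.download("punkt")
nltk.download("stopwords")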

Evaluation Metrics

The evaluation reports soft normalized sufficiency (soft_ns) and soft normalized comprehensiveness (soft_nc) per target token, both for the evaluated attribution method (source_*) and for a random-importance baseline (rand_*), followed by the metrics aggregated over all target tokens. Example output:

target_pos  target_token  source_soft_ns  source_soft_nc  rand_soft_ns  rand_soft_nc
10          Nintendo      0.000           0.004           0.000         0.019
11          ,             0.255           0.017           0.000         0.007
12          and           0.000           0.006           0.004         0.017
13          was           0.008           0.008           0.000         0.006
14          released      0.027           0.005           0.000         0.005

target_tokens                    metric_soft_ns  metric_soft_nc
Nintendo$,$ and$ was$ released   4.299           -0.309
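For reference, these per-token scores follow the normalized sufficiency and comprehensiveness scheme of Carton et al. (2020), computed with soft (importance-weighted) perturbations of the input rather than hard token deletion. A rough sketch of the per-token quantities, assuming p_full, p_rationale, p_without_rationale and p_baseline denote the target-token probability under the full input, the rationale-only input, the input with the rationale removed, and an empty/baseline input (names are illustrative, not the repo's API):

# Illustrative helpers in the style of Carton et al. (2020); not the repo's implementation.

def normalized_sufficiency(p_full, p_rationale, p_baseline):
    # Sufficiency: how well the rationale alone preserves the target-token probability.
    suff = 1.0 - max(0.0, p_full - p_rationale)
    suff_null = 1.0 - max(0.0, p_full - p_baseline)  # sufficiency of an empty rationale
    return max(0.0, suff - suff_null) / (1.0 - suff_null + 1e-12)

def normalized_comprehensiveness(p_full, p_without_rationale, p_baseline):
    # Comprehensiveness: how much removing the rationale hurts the target-token probability.
    comp = max(0.0, p_full - p_without_rationale)
    suff_null = 1.0 - max(0.0, p_full - p_baseline)
    return min(1.0, comp / (1.0 - suff_null + 1e-12))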

Token level experiment

This experiment has been split into multiple steps.

Dataset

For the analogies experiment, we use the analogies dataset provided by Mikolov et al. Please download the text file into data/analogies.txt.
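The file follows the standard word2vec questions-words format: section headers that start with ':' followed by lines of four whitespace-separated words (a is to b as c is to d). A minimal sketch of reading it into analogy tuples (illustrative only; the actual preparation is done by the script below):

from pathlib import Path

# Parse the Mikolov analogies file into (a, b, c, d) tuples,
# skipping the ": section-name" header lines.
analogies = []
for line in Path("data/analogies.txt").read_text().splitlines():
    if not line or line.startswith(":"):
        continue
    a, b, c, d = line.split()
    analogies.append((a, b, c, d))

print(f"loaded {len(analogies)} analogies")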

Then run the following commands to prepare the dataset for attribution.

cache_dir="cache/"

# select from: gpt2-medium gpt2-xl EleutherAI/gpt-j-6b facebook/opt-350m facebook/opt-1.3b KoboldAI/OPT-6.7B-Erebus
model_name="gpt2-medium"

# select from: gpt2 gpt2_xl gpt6b OPT350M OPT1B OPT6B
model_short_name="gpt2" 


mkdir -p "data/analogies/"$model_short_name
python src/data/prepare_evaluation_analogy.py \
    --analogies-file data/analogies.txt \
    --output-dir data/analogies/$model_short_name \
    --compact-output True \
    --schema-uri ../../docs/analogy.schema.json \
    --device cuda \
    --model $model_name \
    --cache_dir $cache_dir 

It will generate a set of JSON files in the directory specified by --output-dir.

Run feature attribution

Run the following commands to start the feature attribution

# name of the feature attribution config under config/ (without .json)
config_name=top3_replace0.1_max5000_batch5

# hyperparameter tag used in the output directory names
hyper="top3_replace0.1_max5000_batch5"

# name tag of the feature attribution method (used in the output directory names)
FA_name="ours"

# attribution (importance) results will be written here
importance_results="rationalization_results/analogies/${model_short_name}_${FA_name}/$hyper"
mkdir -p $importance_results

# log directory
logfolder_shortname="logs/analogies/${model_short_name}_${FA_name}${hyper}"
mkdir -p $logfolder_shortname

python src/rationalization/run_analogies.py \
    --rationalization-config config/$config_name.json \
    --model $model_name \
    --tokenizer $model_name \
    --data-dir data/analogies/$model_short_name \
    --importance_results_dir $importance_results \
    --device cuda \
    --logfolder $logfolder_shortname \
    --input_num_ratio 1 \
    --cache_dir $cache_dir

It will load the feature attribution configuration from --rationalization-config and write the attribution results to --importance_results_dir.
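For orientation, ReAGent estimates importance by repeatedly replacing a subset of input tokens with plausible alternatives predicted by another language model and measuring how much the probability of the target token drops; positions whose replacement consistently hurts the prediction accumulate higher importance, and the process stops once the top-ranked positions are stable or a step budget is exhausted. A highly simplified sketch of that loop (illustrative names and simplifications; see src/rationalization/ for the actual implementation):

import random

def reagent_sketch(tokens, target_prob_fn, replacement_fn,
                   replace_ratio=0.1, max_steps=5000):
    # tokens          -- list of input token ids
    # target_prob_fn  -- returns p(target token | possibly perturbed tokens)
    # replacement_fn  -- proposes a plausible replacement token for one position
    n = len(tokens)
    importance = [0.0] * n
    p_original = target_prob_fn(tokens)

    for _ in range(max_steps):
        # Sample a subset of positions and swap in plausible replacement tokens.
        positions = random.sample(range(n), max(1, int(replace_ratio * n)))
        perturbed = list(tokens)
        for i in positions:
            perturbed[i] = replacement_fn(tokens, i)

        # Credit the drop in target probability to the replaced positions.
        drop = max(0.0, p_original - target_prob_fn(perturbed))
        for i in positions:
            importance[i] += drop

    total = sum(importance) or 1.0
    return [score / total for score in importance]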

Run evaluation on attribution results

Run the following commands to start the evaluation

eva_output_dir="evaluation_results/analogies/${model_short_name}_${FA_name}${hyper}"
mkdir -p $eva_output_dir

python src/evaluation/evaluate_analogies.py \
    --importance_results_dir $importance_results \
    --eva_output_dir $eva_output_dir \
    --model $model_name \
    --tokenizer $model_name \
    --logfolder $logfolder_shortname \
    --rationale_size_ratio 1 \
    --cache_dir $cache_dir

It will load the attribution results from --importance_results_dir and write the evaluation metrics to --eva_output_dir.

Sequence level experiment

Dataset

A dataset is a plain-text TXT file in data/benchmark where each line is a prompt. The datasets used in our experiments are

  • wikitext.txt
  • tellmewhy2.txt

Run the experiment

This experiment runs end-to-end in a single script. To run it, execute the following commands

cache_dir="cache/"

# select from: gpt2-medium gpt2-xl EleutherAI/gpt-j-6b facebook/opt-350m facebook/opt-1.3b KoboldAI/OPT-6.7B-Erebus
model_name="gpt2-medium"

# select from: gpt2 gpt2_xl gpt6b OPT350M OPT1B OPT6B
model_short_name="gpt2" 

# Select from: norm gradient_shap integrated_gradients input_x_gradient attention attention_rollout attention_last ours
FA_name="ours"

# Select from: wikitext tellmewhy wikitext2 tellmewhy2
dataset=wikitext

python src/sequence_rationalization.py \
    --model $model_name \
    --model_shortname $model_short_name \
    --method $FA_name \
    --stride 2 \
    --max_new_tokens 10 \
    --cache_dir $cache_dir \
    --testing_data_name $dataset

It will load the dataset from data/benchmark/<--testing_data_name>.txt and write both the attribution results and the evaluation metrics to evaluation_results/benchmark/<--model_shortname>_<--method>/<--testing_data_name>.