AnyAttack: Official Code for "AnyAttack: Towards Large-scale Self-supervised Generation of Targeted Adversarial Examples for Vision-Language Models"
This repository provides the official implementation of the paper "AnyAttack: Towards Large-scale Self-supervised Generation of Targeted Adversarial Examples for Vision-Language Models." Our method demonstrates high effectiveness across a wide range of commercial Vision-Language Models (VLMs).
Figure: AnyAttack results on various commercial VLMs
- High Transferability: Adversarial examples generated by AnyAttack show strong transferability across different VLMs.
- Scalability: Our approach is designed to work effectively on large-scale datasets.
- Self-supervised: AnyAttack utilizes self-supervised learning techniques for generating targeted adversarial examples.
- Create Conda environment for LAVIS: Set up the LAVIS environment for BLIP, BLIP2, and InstructBLIP, following the instructions here (a minimal setup sketch is shown after this list).
- Optional: Mini-GPT4 environment: If you plan to evaluate on the Mini-GPT4 series of models, set up an additional environment according to Mini-GPT4's installation guide.
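As a starting point, a minimal sketch of the LAVIS environment setup is shown below; the environment name, Python version, and use of the salesforce-lavis pip package are assumptions, so defer to the official LAVIS instructions linked above for the exact versions this repository expects.

```bash
# Minimal LAVIS environment sketch (environment name and versions are assumptions).
conda create -n lavis python=3.8 -y
conda activate lavis
pip install salesforce-lavis   # provides BLIP, BLIP2, and InstructBLIP via LAVIS
```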
Data Preparation:
- Required Datasets:
  - MSCOCO and Flickr30K: Available here.
  - ImageNet: Download and prepare separately.
- Optional Dataset:
  - LAION-400M: Only required if you plan to pretrain on LAION-400M. Use the img2dataset tool for downloading (a sample command is shown after this list).
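If you do need LAION-400M, a typical img2dataset invocation looks like the sketch below; the metadata location, output paths, worker counts, and image size are assumptions to adapt to your setup.

```bash
# Download LAION-400M images with img2dataset (paths and worker counts are assumptions).
# The LAION-400M metadata parquet files must be downloaded beforehand.
img2dataset \
  --url_list /data/laion400m-meta \
  --input_format parquet \
  --url_col URL \
  --caption_col TEXT \
  --output_format webdataset \
  --output_folder /data/laion400m-data \
  --processes_count 16 \
  --thread_count 64 \
  --image_size 256
```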
- Download pretrained models and configuration files from OneDrive.
- Place the downloaded files in the project root directory.
You can either use the pretrained weights from Step 2 or train the models from scratch.
- Optional: Pretraining on LAION-400M: If you choose to pretrain on LAION-400M, run:
./scripts/main.sh
Replace "YOUR_LAION_DATASET" with your LAION-400M dataset path (see the example below).
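For example, assuming the placeholder appears directly in scripts/main.sh and your dataset lives at /data/laion400m-data (both hypothetical), you could do:

```bash
# Hypothetical: substitute the dataset placeholder in the pretraining script, then run it.
sed -i 's#YOUR_LAION_DATASET#/data/laion400m-data#g' scripts/main.sh
./scripts/main.sh
```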
- Fine-tuning on downstream datasets:
./scripts/finetune_ddp.sh
Adjust the dataset, criterion, and data_dir parameters as needed (a hypothetical excerpt is shown below).
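As an illustration only, the three parameters mentioned above might be set like this inside scripts/finetune_ddp.sh; the example values are assumptions, not the script's defaults.

```bash
# Hypothetical excerpt of scripts/finetune_ddp.sh (example values are assumptions).
dataset=coco              # downstream dataset to fine-tune on (e.g. MSCOCO or Flickr30K)
criterion=cosine          # training objective used during fine-tuning
data_dir=/data/datasets   # root directory containing the downstream datasets
```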
Use the pretrained decoder to generate adversarial images:
./scripts/generate_adv_img.sh
Ensure that the datasets from Step 1 are stored under the DATASET_BASE_PATH directory, and set PROJECT_PATH to the current project directory (see the sketch below).
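The sketch below shows one way these two paths might be set; whether the script reads them as environment variables or as variables edited inside generate_adv_img.sh is an assumption, so adjust to match the script.

```bash
# Hypothetical path setup for adversarial image generation.
export DATASET_BASE_PATH=/data/datasets   # contains MSCOCO, Flickr30K, and ImageNet from Step 1
export PROJECT_PATH=$(pwd)                # root of this repository
./scripts/generate_adv_img.sh
```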
Evaluate the trained models on different tasks:
- Image-text retrieval:
./scripts/retrieval.sh
- Multimodal classification:
python ./scripts/classification_shell.py
- Image captioning:
python ./scripts/caption_shell.py
We've added a demo.py script for an easy demonstration of AnyAttack. This script allows users to generate an adversarial example from a single target image and a clean image.
To run the demo:
python demo.py --decoder_path path/to/decoder.pth --clean_image_path path/to/clean_image.jpg --target_image_path path/to/target_image.jpg --output_path output.png
For more options and details, please refer to the demo.py file.
If you find this work useful for your research, please consider citing:
@article{zhang2024anyattack,
title={AnyAttack: Towards Large-scale Self-supervised Generation of Targeted Adversarial Examples for Vision-Language Models},
author={Jiaming Zhang and Junhong Ye and Xingjun Ma and Yige Li and Yunfan Yang and Jitao Sang and Dit-Yan Yeung},
year={2024},
journal={arXiv preprint arXiv:2410.05346},
}
For any questions or concerns, please open an issue in this repository or contact the authors directly.