P-DIFF: Learning Classifier with Noisy Labels based on Probability Difference Distributions

ICPR2020 Paper link

P-DIFF+: Improving Learning Classifier with Noisy Labels by Noisy Negative Learning Loss

Neural Networks Paper link

Contents

  1. Requirements
  2. Dataset
  3. Folders
  4. How-to-train
  5. Usage-of-P-DIFF-layer
  6. Complexity-of-P-DIFF
  7. Experiment-environment
  8. Experiment-settings
  9. Model-list

Requirements

  1. Python
  2. Caffe

Dataset

Training and Testing dataset:

  1. mnist
  2. cifar10
  3. cifar100
  4. miniimage
  5. Cloth1M

Folders

The structure of the code folders:

Name        Description
README.md   Detailed instructions for reproducing P-DIFF.
train.sh    The training entry script of P-DIFF.
test.sh     The testing entry script of P-DIFF.
caffe       The compiled official caffe repo.
code        Scripts for downloading and processing data.
data        The training and testing datasets used in the paper.
layer       The implementation of the P-DIFF layer in caffe.
log         The folder used to save training logs.
models      The folder used to save trained models.
prototxt    The prototxt files used to train or test models on different datasets.

How-to-train

We demonstrate the training process on the cifar10 dataset with 50% symmetric noise.

Pipeline:

Step 1. Clone the caffe repo into the ./caffe folder, install its requirements, and compile it.
        cd caffe
        mkdir build
        cd build
        cmake ..
        make -j8

Step 2. Add the P-DIFF layer to the caffe layers and recompile the caffe project.

Step 3. Download the mnist, cifar-10, cifar-100 and cloth1m datasets. (You can contact the author to obtain miniimage.)
        python ./code/download.py --dataset=cifar10

Step 4. Corrupt the labels of the training dataset using the ./code/corrupt.py script.
        python ./code/corrupt.py --dataset=cifar10 --noise_type=SYMMETRY --noise_rate=0.50

Step 5. Generate the corresponding lmdb using caffe's conversion tool (multi-label support is required).
        bash ./code/convert.sh cifar10 SYMMETRY 50

Step 6. Configure the training dataset path in the train_val.prototxt file.
        Edit the parameters ${noise_type}, ${noise_rate} and p_diff_layer in ./prototxt/train_val.prototxt.cifar10

Step 7. Train on the dataset using the caffe command.
        bash ./train.sh cifar10

Step 8. Test on the dataset using the caffe command.
        bash ./test.sh cifar10 SYMMETRY 50
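
To make Step 4 more concrete, here is a minimal sketch of symmetric label corruption. It is not the code in ./code/corrupt.py: the function name `corrupt_symmetric`, its arguments, and the choice to flip each selected label uniformly to one of the other classes are all assumptions made for illustration. The returned flag list plays the role of the optional second (noise) label that the multi-label lmdb can carry.

```python
import random


def corrupt_symmetric(labels, num_classes, noise_rate, seed=0):
    """Hypothetical helper: flip a `noise_rate` fraction of labels.

    Assumption: each selected label is replaced by a different class
    drawn uniformly at random. The real corrupt.py may differ.
    """
    rng = random.Random(seed)
    n = len(labels)
    num_noisy = int(round(noise_rate * n))
    noisy_idx = set(rng.sample(range(n), num_noisy))

    noisy_labels, is_noisy = [], []
    for i, y in enumerate(labels):
        if i in noisy_idx:
            # Draw from the num_classes - 1 other classes, skipping y.
            other = rng.randrange(num_classes - 1)
            noisy_labels.append(other if other < y else other + 1)
            is_noisy.append(1)
        else:
            noisy_labels.append(y)
            is_noisy.append(0)
    return noisy_labels, is_noisy


# Toy stand-in for cifar10 labels: 1,000 samples over 10 classes.
clean = [i % 10 for i in range(1000)]
noisy, flags = corrupt_symmetric(clean, num_classes=10, noise_rate=0.50)
print(sum(flags) / len(flags))  # fraction of corrupted labels
```

Because the replacement class always differs from the original one, the fraction of changed labels equals the requested noise rate exactly in this sketch.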

Usage-of-P-DIFF-layer

The usage of P-DIFF layer in train_val.prototxt is described below:

layer {
  name: "fix_prob"
  type: "PDIFF"

  # inputs of this layer
  # bottom[0] is used for the forward pass of the network
  bottom: "prob"

  # bottom[1] is used for computing the sample weight; it can differ from bottom[0]
  bottom: "prob"

  # bottom[2] is the class label, which indicates the sample's category
  bottom: "label000"

  # bottom[3] (optional) is the noise label, which indicates whether the sample is noisy.
  # It is only used for drawing pdf_clean and pdf_noise, not for training.
  # This second label is generated by the multi-label lmdb converter.
  # This input can generally be omitted.
  #bottom: "label001"

  # output of this layer
  top: "fix_prob"

  # parameters of this layer
  p_diff_param {
    # We use a queue to maintain the delta distribution; the queue size is slide_batch_num x batch_size
    slide_batch_num: 100

    # The number of iterations per epoch.
    # Its value equals the total number of training samples divided by batch_size (rounded down).
    # Here it is 50,000 / 128
    epoch_iters: 390

    # This switch controls whether the noise rate is estimated automatically:
    # true means this layer computes the noise rate automatically,
    # false means this layer uses the specific noise rate given below.
    use_auto_noise_ratio: false

    # If use_auto_noise_ratio is false, we need to set the specific noise rate.
    noise_ratio: 0.50

    # Print some training information, such as the noise rate, threshold of zeta, pcf, weight, etc.,
    # which is used for debugging and drawing figures.
    # We usually leave this switch off.
    #debug: true
    # This is the prefix of the log file name, which stores the information under debug mode.
    #debug_prefix: "cifar10_noise_symmetric_50"
  }
}
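
When adapting the layer to another dataset or batch size, the two size-dependent values above can be recomputed as follows. This is only a small sketch: the helper name `pdiff_schedule` is hypothetical, and rounding down is inferred from the cifar10 value shown above (50,000 / 128 gives 390).

```python
def pdiff_schedule(num_train, batch_size, slide_batch_num):
    """Hypothetical helper: derive epoch_iters and the delta queue size."""
    # Iterations per epoch, rounded down as in the cifar10 example.
    epoch_iters = num_train // batch_size
    # The queue holds slide_batch_num batches of delta values.
    queue_size = slide_batch_num * batch_size
    return epoch_iters, queue_size


epoch_iters, queue_size = pdiff_schedule(
    num_train=50_000, batch_size=128, slide_batch_num=100
)
print(epoch_iters, queue_size)  # 390 12800
```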

Complexity-of-P-DIFF

Given the batch size, slide_batch_num and bin size (200 in our training), the time complexity of P-DIFF per iteration is

O(batch_size) x [O(bin_size) + O(slide_batch_num) + O(k)]

and the space complexity of P-DIFF per iteration is

O(slide_batch_num) x O(batch_size) + k x O(bin_size)

where k is a constant.
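
The space term can be illustrated with a small sketch of a sliding queue plus a fixed-size histogram. This is not the Caffe layer: the class name `SlidingHistogram`, the value range [-1, 1] for delta, and the incremental update are assumptions, so the sketch shows only where the slide_batch_num x batch_size and bin_size memory goes, not the layer's per-iteration cost.

```python
from collections import deque


class SlidingHistogram:
    """Hypothetical sketch: keep the latest delta values and their histogram."""

    def __init__(self, batch_size, slide_batch_num, bin_size=200, lo=-1.0, hi=1.0):
        # Queue storage: slide_batch_num x batch_size values.
        self.capacity = slide_batch_num * batch_size
        self.queue = deque()
        # Histogram storage: bin_size counters.
        self.counts = [0] * bin_size
        self.lo, self.hi = lo, hi  # assumed range of delta

    def _bin(self, value):
        ratio = (value - self.lo) / (self.hi - self.lo)
        idx = int(ratio * len(self.counts))
        return min(max(idx, 0), len(self.counts) - 1)

    def push(self, value):
        # Evict the oldest value once the queue is full.
        if len(self.queue) == self.capacity:
            old = self.queue.popleft()
            self.counts[self._bin(old)] -= 1
        self.queue.append(value)
        self.counts[self._bin(value)] += 1


hist = SlidingHistogram(batch_size=128, slide_batch_num=100)
for i in range(20_000):
    hist.push(((i % 201) - 100) / 100.0)  # synthetic values in [-1, 1]
print(len(hist.queue), len(hist.counts))  # 12800 200
```

With the settings used in this README (batch size 128, slide_batch_num 100, bin size 200), the queue never holds more than 12,800 values regardless of how many iterations have run.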

Experiment-environment

Hardware:
  • GPU: 8 cards of GeForce GTX TITAN X
  • CPU: 48 cores of Intel(R) Xeon(R) CPU E5-2678 v3 @ 2.50GHz
  • Memory: 512GB
Operating System:
  • Ubuntu 18.04.2

Experiment-settings

Dataset preprocess:
  • mnist: Grayscale images of size 28x28, without cropping; 60,000 training images and 10,000 test images in total.
    This is a clean dataset; we corrupt it with three types of noise.

  • cifar10: Color images of size 32x32, without cropping; 50,000 training images and 10,000 test images in total.
    This is a clean dataset; we corrupt it with three types of noise.

  • cifar100: Color images of size 32x32, without cropping; 50,000 training images and 10,000 test images in total.
    This is a clean dataset; we corrupt it with three types of noise.

  • miniimage: Color images of size 84x84, without cropping; 50,000 training images and 10,000 test images in total.
    This is a clean dataset; we corrupt it with three types of noise.

  • cloth1m: Color images resized to 256x256 and cropped to a 224x224 region; 1,047,571 training images and 10,526 test images in total.
    This is a noisy dataset.
    Note that the original clothing1m contains 1,000,000 noisy training images and 47,571 clean training images.

Training parameters:
  • Optimizer: SGD
  • Learning rate: 0.001
  • Momentum: 0.9
  • Batch size: 128
  • ${T}_{max}$: 200
  • ${T}_{k}$: 20
Backbone:
  • CoNet (a 9-layer CNN) for mnist, cifar10, cifar100, miniimage and cloth1m.cnn; the network structure is shown in the supplementary material.
  • ResNet101 for cloth1m.resnet101
Training parameters:
  • ${\zeta}$: 0.9
  • M: 0.2

Citation:

  • @inproceedings{P-DIFF,
    author = {Wei Hu and QiHao Zhao and Yangyu Huang and Fan Zhang},
    title = {{P-DIFF:} Learning Classifier with Noisy Labels based on Probability Difference Distributions},
    booktitle = {{ICPR}},
    pages = {1882--1889},
    publisher = {{IEEE}},
    year = {2020}
    }

  • @article{zhao2021p,
    title={P-DIFF+: Improving learning classifier with noisy labels by Noisy Negative Learning loss},
    author={Zhao, QiHao and Hu, Wei and Huang, Yangyu and Zhang, Fan},
    journal={Neural Networks},
    volume={144},
    pages={1--10},
    year={2021},
    publisher={Elsevier}
    }