This project implements various image classification models for the CIFAR-100 dataset, including ResNet, ResNeXt, ViT, Swin Transformer, PyramidNet, and EfficientNet. Developed as part of the Learning Vision Intelligence (LVI) course, it aims to build and compare high-performance classifiers on CIFAR-100, which consists of 60,000 32x32 color images in 100 classes.
- Python 3.11.7
- PyTorch 2.1.2+cu121
- CUDA-capable GPU (recommended)
- For the remaining dependencies, see requirements.txt
- Clone the repository:

  ```bash
  git clone https://github.com/bigeco/lvi-cifar100-classifier-pytorch.git
  cd lvi-cifar100-classifier-pytorch
  ```

- Install the required packages:

  ```bash
  pip install -r requirements.txt
  ```
To ensure reproducibility, we use a fixed random seed; the default is 42. To change it, pass the `--seed` argument to the training script:

```bash
python3 src/train.py --seed 123
```
We use the following code to set the random seed:

```python
import os
import torch
import numpy as np
import random

def seed_everything(seed):
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

def main(args):
    seed_everything(args.seed)  # fix the random seed
```
This function is called at the beginning of our training script to ensure consistent results across runs.
To train a model, run one of the following commands:
```bash
# ResNet9
python3 src/train.py --model_name "resnet9" --epochs 240 --batch_size 128 --optimizer_name "Adam" --lr 0.005 --scheduler_name "OneCycleLR" --select_transform 'RandomCrop,RandomHorizontalFlip,ColorJitter' --mixup True

# ResNet18
python3 src/train.py --model_name "resnet18" --epochs 100 --batch_size 64 --optimizer_name "AdamW" --lr 0.008 --criterion_name "LabelSmoothingLoss" --scheduler_name "OneCycleLR" --select_transform 'RandomCrop,RandomHorizontalFlip,ColorJitter' --mixup True --split True --train_ratio 0.8

# ResNeXt50
python3 src/train.py --model_name "resnext50" --epochs 100 --batch_size 64 --optimizer_name "Adam" --lr 0.001 --select_transform 'RandomCrop,RandomHorizontalFlip,ColorJitter' --mixup True --split True --train_ratio 0.8

# DenseNet201
python3 src/train.py --model_name "densenet201" --epochs 100 --batch_size 64 --optimizer_name "Adam" --lr 0.001 --select_transform '' --split True --train_ratio 0.8

# Wide-ResNet28-10
python3 src/train.py --model_name "wide_resnet28_10" --epochs 100 --batch_size 64 --optimizer_name "Adam" --lr 0.001 --select_transform 'RandomCrop,RandomHorizontalFlip,ColorJitter' --mixup True --split True --train_ratio 0.8

# ViT
python3 src/train.py --model_name "vit" --epochs 100 --batch_size 64 --optimizer_name "Adam" --lr 0.001 --select_transform '' --split True --train_ratio 0.8

# Swin Transformer
python3 src/train.py --model_name "swin6" --epochs 100 --batch_size 64 --optimizer_name "AdamW" --lr 0.001 --weight_decay 0.05 --scheduler_name "CosineAnnealingLR" --select_transform 'RandomCrop,RandomHorizontalFlip' --split True --train_ratio 0.8

# Run with default arguments
python3 src/train.py
```
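Several of these commands enable mixup via `--mixup True`. As a rough illustration (a generic sketch, not the repository's exact implementation; the function name and `alpha` value are assumptions), mixup blends each batch with a shuffled copy of itself and mixes the loss in the same proportion:

```python
import numpy as np
import torch

def mixup_batch(images, labels, alpha=0.2):
    """Blend a batch with a shuffled copy of itself (mixup, Zhang et al. 2018)."""
    lam = np.random.beta(alpha, alpha)                    # mixing coefficient
    index = torch.randperm(images.size(0), device=images.device)
    mixed = lam * images + (1 - lam) * images[index]      # pixel-wise blend
    # The training loss is then lam * loss(pred, labels)
    # + (1 - lam) * loss(pred, labels[index]).
    return mixed, labels, labels[index], lam
```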
The CIFAR-100 dataset is automatically downloaded by PyTorch's `torchvision` library. It includes:
- 50,000 training images
- 10,000 testing images
- 100 classes
- 32x32 pixel resolution
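As a quick check, the download and the split sizes can be reproduced with a few lines of `torchvision` (the `root` directory and transform here are illustrative, not the project's exact setup):

```python
from torchvision import datasets, transforms

transform = transforms.ToTensor()
train_set = datasets.CIFAR100(root="./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR100(root="./data", train=False, download=True, transform=transform)

print(len(train_set), len(test_set), len(train_set.classes))  # 50000 10000 100
```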
We implemented PyramidNet with ShakeDrop regularization, which achieved the highest top-1 accuracy among all our models: 83.42% on CIFAR-100. It was trained with the following configuration (a rough code sketch follows the list):
- Optimizer: SGD
- Learning Rate: 0.1 with MultiStepLR and ReduceLROnPlateau
- Batch Size: 128
- Epochs: 200
- Data Augmentation: Random crop, Random horizontal flip, AutoAugment, Cutout
- Loss Function: LabelSmoothingLoss
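In PyTorch, this setup looks roughly like the sketch below. The momentum, weight decay, milestone epochs, and smoothing factor are assumptions for illustration; the exact values live in the training script:

```python
import torch
import torch.nn as nn

model = nn.Linear(3 * 32 * 32, 100)  # stand-in for PyramidNet with ShakeDrop
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
# The configuration above names both MultiStepLR and ReduceLROnPlateau;
# only MultiStepLR is shown here.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[100, 150], gamma=0.1)
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)  # stand-in for the custom LabelSmoothingLoss

for epoch in range(200):
    # ... run one training epoch here ...
    scheduler.step()
```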
PyramidNet with ShakeDrop achieved the best score:
| Model | Seed | Loss | Top-1 Accuracy | Top-5 Accuracy | Super Top-1 Accuracy |
|---|---|---|---|---|---|
| PyramidNet | 42 | 1.26 | 83.42% | 97.42% | 91.14% |
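Top-1 and Top-5 are standard top-k accuracies: a prediction counts as correct if the true label appears among the k highest-scoring classes (Super Top-1 presumably refers to CIFAR-100's 20 coarse superclasses). A minimal sketch of the metric (the repository ships its own utility; names here are illustrative):

```python
import torch

def topk_accuracy(logits, targets, k=5):
    """Fraction of samples whose true label is among the k largest logits."""
    topk = logits.topk(k, dim=1).indices               # (N, k) predicted classes
    hits = (topk == targets.unsqueeze(1)).any(dim=1)   # (N,) true label in top-k?
    return hits.float().mean().item()

logits = torch.randn(8, 100)                # dummy scores for 8 samples
targets = torch.randint(0, 100, (8,))
print(topk_accuracy(logits, targets, k=1), topk_accuracy(logits, targets, k=5))
```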
This project was developed collaboratively by the following team members:
- Lee Songeun
  - Role: implemented
    - utils (e.g., top-k accuracy function, train/valid utilities) and the CifarDataset class
    - models (e.g., ResNet9, ResNet56, ResNet110, Swin, PyramidNet with ShakeDrop, Wide-ResNet)
    - LR schedulers (OneCycleLR, LambdaLR, StepLR, CosineAnnealingLR, CosineAnnealingWarmRestarts)
  - GitHub: @bigeco
- Park Jihye
  - Role: implemented
    - models (e.g., EfficientNet, ResNet18, ResNeXt50, DenseNet201)
    - augmentations (rotation, flipping)
    - loss functions (Focal Loss and Label Smoothing Loss)
  - GitHub: @park-ji-hye
- Song Daeun
  - Role: implemented
    - models (e.g., ResNet, ResNeXt, ViT)
    - augmentations (Mixup, Color Jittering, Cutout, AutoAugment)
    - model modifications (ResNet and ResNeXt baselines)
  - GitHub: @Song-Daeun
This project is licensed under the MIT License - see the LICENSE file for details.