STITCH-O (Using anomaly detection to identify stitching artefacts in drone-based orthomosaics) automates the detection of stitching artifacts in drone-based orthomosaic images used in precision agriculture. This repository contains an adapted implementation of the UniAD (Unified Anomaly Detection) model, tailored to detecting anomalies in orchard orthomosaic images.
Precision agriculture increasingly relies on drone-based imaging for monitoring crop health and general farm management. These images are merged into large-scale orthomosaics, which can sometimes contain stitching artifacts that compromise data quality. Currently, these artifacts are detected through manual inspection, which is time-consuming and expensive. STITCH-O aims to automate this process using state-of-the-art anomaly detection techniques.
- Adapted UniAD model for orthomosaic anomaly detection
- Custom data loading and preprocessing pipeline for large-scale orthomosaic images
- Enhanced evaluation metrics tailored for stitching artifact detection
- Inference pipeline for whole-orchard classification
- Baseline model implementation for performance comparison
The data used in this project has not been made publicly available.
Create a new virtual environment and install the required packages:
```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
To run the entire training pipeline (preprocessing, training, and evaluation):
```
train_pipeline.bat
```
Alternatively, you can run each component separately:
```
python .\Preprocessing\chunker.py preprocess_config.yaml
python .\Preprocessing\process_chunks.py chunks chunks_scaled
python .\Preprocessing\generate_metadata.py chunks_scaled -t
python .\UniAD\train_val.py --config train_config.yaml
```
For inference on new data:
```
inference_pipeline.bat
```
This will run the following steps:
- Segmentation of new orchard images
- Preprocessing of segmented images
- Running inference using the trained model
- Classifying whole orchards based on anomaly scores
You can also run the inference script directly:
```
python .\UniAD\run_inference.py --config inference_config.yaml
```
The data preparation process includes:
- Image chunking: Large orthomosaic images are divided into smaller, manageable chunks (see the sketch at the end of this section).
- Data scaling: Pixel values are normalized to handle variations across different orchard images.
- Train-Test split: The dataset is divided into training and testing sets.
- Metadata generation: Structured information about the dataset is created for efficient data handling.
The preprocessing pipeline supports multiple image layers (RGB, DEM, NDVI) and can be configured using YAML files.
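As a rough illustration of the chunking and scaling steps, the sketch below tiles an orthomosaic array and normalizes each layer. The tile size, stride, and per-layer standardization are illustrative assumptions, not the exact settings used by chunker.py and process_chunks.py:

```python
import numpy as np

def chunk_image(image: np.ndarray, tile: int = 224, stride: int = 224):
    """Split a (H, W, C) orthomosaic array into fixed-size tiles."""
    h, w = image.shape[:2]
    for y in range(0, h - tile + 1, stride):
        for x in range(0, w - tile + 1, stride):
            yield image[y:y + tile, x:x + tile]

def scale_chunk(chunk: np.ndarray) -> np.ndarray:
    """Standardize each layer (RGB/DEM/NDVI) to zero mean and unit
    variance so brightness differences between orchards do not dominate."""
    mean = chunk.mean(axis=(0, 1), keepdims=True)
    std = chunk.std(axis=(0, 1), keepdims=True) + 1e-8
    return (chunk - mean) / std
```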
STITCH-O uses an adapted version of the UniAD model. Key components include:
- Feature extractor: EfficientNet B4 or ResNet50 (configurable)
- Reconstruction model: Transformer-based encoder-decoder architecture
- Custom data loading and augmentation techniques
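Conceptually, the model reconstructs frozen backbone features and flags regions whose features it cannot reconstruct well. The sketch below shows that idea only: the plain transformer encoder stack, token dimension, and MSE scoring are simplifying assumptions, not the actual UniAD architecture:

```python
import torch
import torch.nn as nn

class FeatureReconstructor(nn.Module):
    """Toy stand-in for the reconstruction model: re-encode feature
    tokens and treat reconstruction error as the anomaly signal."""
    def __init__(self, dim: int = 272, layers: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_patches, dim) from the frozen backbone
        return self.encoder(tokens)

def anomaly_score(model: FeatureReconstructor, tokens: torch.Tensor) -> torch.Tensor:
    """Mean squared reconstruction error per image."""
    recon = model(tokens)
    return ((recon - tokens) ** 2).mean(dim=(1, 2))
```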
The training process includes:
- Mixed precision training using PyTorch's GradScaler (see the training-loop sketch below)
- Customizable learning rate scheduling
- Periodic validation and model checkpointing
- Logging of training metrics using TensorBoard
Training configuration can be adjusted in the train_config.yaml file.
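The mixed-precision part of the loop follows PyTorch's standard GradScaler pattern, as in the minimal sketch below; the AdamW optimizer, cosine schedule, and MSE objective are illustrative assumptions rather than the project's exact settings:

```python
import torch
from torch.cuda.amp import GradScaler, autocast

def train(model: torch.nn.Module, loader, epochs: int = 250, lr: float = 1e-4):
    """Mixed-precision training: fp16 forward under autocast,
    scaled backward pass to avoid fp16 gradient underflow."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    scaler = GradScaler(enabled=device == "cuda")
    for _ in range(epochs):
        for features in loader:  # loader yields feature tensors (placeholder)
            features = features.to(device)
            optimizer.zero_grad()
            with autocast(enabled=device == "cuda"):
                recon = model(features)
                loss = torch.nn.functional.mse_loss(recon, features)
            scaler.scale(loss).backward()  # scale loss before backprop
            scaler.step(optimizer)         # unscale gradients, then step
            scaler.update()
        scheduler.step()
```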
Evaluation metrics include:
- Area Under the Receiver Operating Characteristic (AUROC) curve
- Classification accuracy for Case 1 and Case 2 anomalies
- Custom thresholding technique to handle both types of anomalies
The evaluation process also includes an inference pipeline for whole-orchard classification.
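Per-case AUROC can be computed directly from anomaly scores and binary labels with scikit-learn. In the sketch below, negating Case 1 scores (which sit below normal scores, per the thresholding described later) is an assumption about how the ranking is oriented:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def case_auroc(scores: np.ndarray, labels: np.ndarray, lower_is_anomalous: bool) -> float:
    """AUROC for one anomaly case; labels are 1 for anomalous chunks.
    Case 1 artifacts score *below* normal images, so flip the sign
    before ranking."""
    return roc_auc_score(labels, -scores if lower_is_anomalous else scores)

# hypothetical usage with per-chunk scores and labels:
# case1 = case_auroc(scores_c1, labels_c1, lower_is_anomalous=True)
# case2 = case_auroc(scores_c2, labels_c2, lower_is_anomalous=False)
```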
After training for 250 epochs:
| Anomaly Type | AUROC |
| --- | --- |
| Case 1 | 0.99420 |
| Case 2 | 0.96136 |
| Mean | 0.97778 |
The model uses dual thresholding (a sketch follows this list):
- Case 1 anomalies: anomaly scores below ~35 (lower than normal images)
- Case 2 anomalies: anomaly scores above ~60 (higher than normal images)
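A minimal sketch of the dual-threshold decision applied to a whole orchard; the ~35 and ~60 cut-offs come from the results above, while aggregating chunk scores by min/max is an illustrative assumption about how orchard-level labels are derived:

```python
import numpy as np

CASE1_THRESHOLD = 35.0  # scores below this suggest a Case 1 artifact
CASE2_THRESHOLD = 60.0  # scores above this suggest a Case 2 artifact

def classify_orchard(chunk_scores: np.ndarray) -> str:
    """Label a whole orchard from its per-chunk anomaly scores."""
    if chunk_scores.min() < CASE1_THRESHOLD:
        return "case_1_anomaly"
    if chunk_scores.max() > CASE2_THRESHOLD:
        return "case_2_anomaly"
    return "normal"
```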
The STITCH-O implementation significantly outperforms the baseline model, especially for Case 2 anomalies.
The project includes a baseline model for comparison with the UniAD implementation. This baseline model serves as a benchmark to evaluate the performance improvements achieved by the more complex UniAD approach.
The baseline model consists of two main components:
- Feature Extractor: EfficientNet B4 pre-trained on ImageNet; the first four layers are used and frozen during training (see the sketch at the end of this section).
- U-Net Reconstruction Model: a custom U-Net architecture designed to reconstruct the extracted features.
- Implements a simplified anomaly detection approach based on feature reconstruction.
- Uses Mixed Precision Training with PyTorch's GradScaler for efficient computation.
- Includes customizable learning rate scheduling options.
- Provides flexible configuration through YAML files.
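The frozen extractor can be sketched with torchvision as below; interpreting "the first four layers" as the first four feature stages of EfficientNet-B4 is an assumption, and the actual split may differ in the project code:

```python
import torch.nn as nn
from torchvision.models import efficientnet_b4, EfficientNet_B4_Weights

def build_frozen_extractor() -> nn.Module:
    """First four feature stages of ImageNet-pretrained EfficientNet-B4,
    frozen so that only the U-Net reconstruction model is trained."""
    backbone = efficientnet_b4(weights=EfficientNet_B4_Weights.IMAGENET1K_V1)
    extractor = nn.Sequential(*list(backbone.features)[:4])
    for p in extractor.parameters():
        p.requires_grad = False  # keep the backbone fixed
    return extractor.eval()
```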
To train and evaluate the baseline model:
```
python baseline_model.py baseline_config.yaml
```
The baseline model achieves the following results:
- Case 1 AUROC: 0.9963
- Case 2 AUROC: 0.8874
- Mean AUROC: 0.9419
While the baseline performs well, especially for Case 1 anomalies, it is outperformed by the UniAD implementation, particularly for the more subtle Case 2 anomalies.
To visualize the results and compare different experiments, use the plot-experiments.py script:

```
python plot-experiments.py /path/to/experiment/directory
```
This script generates the following plots:
- Overall comparison of Case 1 and Case 2 AUROC across all models
- Individual experiment results showing Case 1 AUROC, Case 2 AUROC, and Average AUROC for different model configurations
The script processes CSV files in the specified directory and its subdirectories, allowing for easy comparison of multiple experiments and model configurations.
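In outline, the script's aggregation step looks like the sketch below; the case1_auroc and case2_auroc column names are assumptions about the CSV schema and may not match the real files:

```python
import sys
from pathlib import Path
import pandas as pd
import matplotlib.pyplot as plt

# collect every experiment CSV under the directory given on the CLI
frames = []
for path in Path(sys.argv[1]).rglob("*.csv"):
    df = pd.read_csv(path)
    df["experiment"] = path.stem
    frames.append(df)

# average the assumed AUROC columns per experiment and plot them
summary = (pd.concat(frames)
             .groupby("experiment")[["case1_auroc", "case2_auroc"]]
             .mean())
summary.plot.bar()
plt.ylabel("AUROC")
plt.tight_layout()
plt.show()
```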
- Original UniAD implementation: UniAD GitHub Repository
- Segment Anything Model (SAM) for mask generation: SAM GitHub Repository
- EfficientNet implementation: EfficientNet PyTorch
- Project supervisor: Patrick Marais, University of Cape Town
- This project was developed as part of a CS Honours Project at the University of Cape Town
For detailed implementation and usage instructions, please refer to the individual script files and configuration YAML files in the repository.