Warning: maintenance of this repo is temporarily frozen.
This repository is based on the matterport Mask-RCNN implementation: the core model components were ported from the original repository. It is an attempt to make the Mask-RCNN model more transparent to researchers and more practical for inference optimization. In addition, new backbones were added to offer a choice in the accuracy/speed trade-off and to make the model more task-specific.
- v2.2.0, v2.3.4, v2.4.3, v2.5.1
- ResNet [18, 34, 50, 101, 152]
- ResNeXt [50, 101]
- SE-ResNet [18, 34, 50, 101, 152]
- SE-ResNeXt [50, 101]
- SE-Net [154]
- MobileNet [V1, V2]
- EfficientNet [B0, B1, B2, B3, B4, B5, B6, B7]
Backbone keys: 'resnet18', 'resnet34', 'resnet50', 'resnet101', 'resnet152', 'resnext50', 'resnext101', 'seresnet18', 'seresnet34', 'seresnet50', 'seresnet101', 'seresnet152', 'seresnext50', 'seresnext101', 'senet154', 'mobilenet', 'mobilenetv2', 'efficientnetb0', 'efficientnetb1', 'efficientnetb2', 'efficientnetb3', 'efficientnetb4', 'efficientnetb5', 'efficientnetb6', 'efficientnetb7'
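For instance, a backbone is chosen by putting one of these keys into the model config. A small sketch of such a selection with validation (the `'backbone'` config field name here is an assumption; check `./src/common/config.py` for the exact key):

```python
# Supported backbone keys, copied from the list above.
BACKBONES = (
    [f'resnet{d}' for d in (18, 34, 50, 101, 152)]
    + [f'resnext{d}' for d in (50, 101)]
    + [f'seresnet{d}' for d in (18, 34, 50, 101, 152)]
    + [f'seresnext{d}' for d in (50, 101)]
    + ['senet154', 'mobilenet', 'mobilenetv2']
    + [f'efficientnetb{i}' for i in range(8)]
)

def choose_backbone(config, key):
    """Set a backbone key after validating it against the supported list.
    The 'backbone' field name is assumed, not taken from the repo."""
    if key not in BACKBONES:
        raise ValueError(f'Unknown backbone: {key!r}')
    config['backbone'] = key
    return config
```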
# Define preferred Tensorflow version: tf2.{2,3,4,5}
# For example, Tensorflow 2.2 env:
$ conda create -n tf2.2 python=3.7
$ conda activate tf2.2
$ cd ./requirements && pip install -r requirements_tf2.2.txt
# You may also need onnx_graphsurgeon and the TensorRT Python bindings for TensorRT optimization
$ pip install <TENSORRT_PATH>/python/<TENSORRT_PYTHON_BINDING.whl>
$ pip install <TENSORRT_PATH>/onnx_graphsurgeon/onnx_graphsurgeon-x.y.z-py2.py3-none-any.whl
- There is a general config for Mask-RCNN building and training in ./src/common/config.py, represented as a dict. Prepare it for your specific task: the CLASS_DICT dictionary stores class ids and names, and the remaining parameters are in the CONFIG dictionary.
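A minimal sketch of this task-specific preparation, assuming a single-class balloon-style task (the dict literals below are illustrative stand-ins for the real CLASS_DICT and CONFIG in ./src/common/config.py):

```python
# Illustrative single-class task: id-to-name mapping for CLASS_DICT;
# class 0 is implicitly the background.
CLASS_DICT = {1: 'balloon'}

CONFIG = {}  # stand-in for the dict imported from common.config
CONFIG.update({
    'class_dict': CLASS_DICT,
    # 2 classes in total: one foreground class + background,
    # matching how the profiling section counts classes.
    'num_classes': 1 + len(CLASS_DICT),
})
```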
- Configure your dataset class. The basic example uses a general dataset class named SegmentationDataset, which handles masks made with the VGG Image Annotator. In ./src/samples/balloon you can inspect the prepared BalloonDataset, which inherits SegmentationDataset and processes the balloon image samples from the original repository. In ./src/samples/coco you can inspect the prepared CocoDataset for the MS COCO dataset. Any prepared dataset class can be passed to the DataLoader in ./src/preprocess/preprocess.py, which generates batches.
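As a rough, library-free illustration of the batching such a DataLoader performs (the real one additionally assembles Mask-RCNN-specific targets such as anchors and masks):

```python
def iter_batches(samples, batch_size):
    """Yield fixed-size batches from an indexable dataset, dropping
    the last incomplete batch as detection loaders commonly do."""
    for start in range(0, len(samples) - batch_size + 1, batch_size):
        yield samples[start:start + batch_size]
```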
- You can also configure your own data augmentation. The default training augmentation is defined in the get_training_augmentation function in ./src/preprocess/augmentation.py. The default pipeline is based on the albumentations library. See also:
./notebooks/example_data_loader_balloon.ipynb
./notebooks/example_data_loader_coco.ipynb
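To illustrate what such a pipeline has to guarantee, here is a library-free sketch of one typical transform: a horizontal flip applied consistently to an image and its mask (the real get_training_augmentation builds a richer albumentations pipeline):

```python
import numpy as np

def hflip_pair(image, mask, p=0.5, rng=None):
    """Flip image and mask together with probability p, as an
    augmentation pipeline would (the mask must follow the image)."""
    rng = rng or np.random.default_rng()
    if rng.random() < p:
        image = image[:, ::-1]
        mask = mask[:, ::-1]
    return image, mask
```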
Basic example:
```python
import os
import tensorflow as tf

from preprocess import preprocess
from preprocess import augmentation as aug
from training import train_model
from model import mask_rcnn_functional
from common.utils import tf_limit_gpu_memory
from common.config import CONFIG

# Limit GPU memory for the tensorflow container
tf_limit_gpu_memory(tf, 4500)

# Update info about classes in your dataset
CONFIG.update({'class_dict': {},   # fill in: {class_id: class_name, ...}
               'num_classes': 2,   # example: one foreground class + background
               }
              )
CONFIG.update({'meta_shape': (1 + 3 + 3 + 4 + 1 + CONFIG['num_classes']), })

# Init Mask-RCNN model
model = mask_rcnn_functional(config=CONFIG)

# Init training and validation datasets
base_dir = os.getcwd().replace('src', 'dataset_folder')
train_dir = os.path.join(base_dir, 'train')
val_dir = os.path.join(base_dir, 'val')

train_dataset = preprocess.SegmentationDataset(images_dir=train_dir,
                                               classes_dict=CONFIG['class_dict'],
                                               preprocess_transform=preprocess.get_input_preprocess(
                                                   normalize=CONFIG['normalization']
                                               ),
                                               augmentation=aug.get_training_augmentation(),
                                               **CONFIG
                                               )
val_dataset = preprocess.SegmentationDataset(images_dir=val_dir,
                                             classes_dict=CONFIG['class_dict'],
                                             preprocess_transform=preprocess.get_input_preprocess(
                                                 normalize=CONFIG['normalization']
                                             ),
                                             json_annotation_key=None,
                                             **CONFIG
                                             )

# train_model includes dataset and dataloader initialization, callbacks configuration,
# loss definitions and final model compiling with the optimizer defined in CONFIG.
train_model(model,
            train_dataset=train_dataset,
            val_dataset=val_dataset,
            config=CONFIG,
            weights_path=None)
```
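The meta_shape value used above follows matterport's image-meta layout: image id (1), original image shape (3), resized image shape (3), window coordinates (4), scale (1), plus one active-class flag per class:

```python
def meta_shape(num_classes):
    """Length of the flat image-meta vector:
    image_id(1) + original_image_shape(3) + image_shape(3)
    + window(4) + scale(1) + active_class_ids(num_classes)."""
    return 1 + 3 + 3 + 4 + 1 + num_classes
```

For a two-class task (one foreground class plus background) this gives 14.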
Balloon dataset example:
Download the balloon dataset here.
```python
import os

os.chdir('..')

import tensorflow as tf

from samples.balloon import balloon
from preprocess import preprocess
from preprocess import augmentation as aug
from training import train_model
from model import mask_rcnn_functional
from common.utils import tf_limit_gpu_memory

# Limit GPU memory for the tensorflow container
tf_limit_gpu_memory(tf, 4500)

from common.config import CONFIG

CONFIG.update(balloon.BALLON_CONFIG)

# Init Mask-RCNN model
model = mask_rcnn_functional(config=CONFIG)

# Init training and validation datasets
base_dir = os.getcwd().replace('src', 'balloon')
train_dir = os.path.join(base_dir, 'train')
val_dir = os.path.join(base_dir, 'val')

train_dataset = balloon.BalloonDataset(images_dir=train_dir,
                                       class_key='object',
                                       classes_dict=CONFIG['class_dict'],
                                       preprocess_transform=preprocess.get_input_preprocess(
                                           normalize=CONFIG['normalization']
                                       ),
                                       augmentation=aug.get_training_augmentation(),
                                       json_annotation_key=None,
                                       **CONFIG
                                       )
val_dataset = balloon.BalloonDataset(images_dir=val_dir,
                                     class_key='object',
                                     classes_dict=CONFIG['class_dict'],
                                     preprocess_transform=preprocess.get_input_preprocess(
                                         normalize=CONFIG['normalization']
                                     ),
                                     json_annotation_key=None,
                                     **CONFIG
                                     )

# train_model includes dataset and dataloader initialization, callbacks configuration,
# loss definitions and final model compiling with the optimizer defined in CONFIG.
train_model(model,
            train_dataset=train_dataset,
            val_dataset=val_dataset,
            config=CONFIG,
            weights_path=None)
```
See ./notebooks/example_training_balloon.ipynb.
MS COCO dataset example:
```python
import os

os.chdir('..')

import tensorflow as tf

from samples.coco import coco
from preprocess import preprocess
from preprocess import augmentation as aug
from training import train_model
from model import mask_rcnn_functional
from common.utils import tf_limit_gpu_memory

# Limit GPU memory for the tensorflow container
tf_limit_gpu_memory(tf, 4500)

from common.config import CONFIG

CONFIG.update(coco.COCO_CONFIG)

# Init Mask-RCNN model
model = mask_rcnn_functional(config=CONFIG)

# You can also download the dataset with the auto_download=True argument.
# It will be downloaded and unzipped in dataset_dir.
base_dir = r'<COCO_PATH>/coco2017'
train_dir = os.path.join(base_dir, 'train')
val_dir = os.path.join(base_dir, 'val')

train_dataset = coco.CocoDataset(dataset_dir=base_dir,
                                 subset='train',
                                 year=2017,
                                 auto_download=True,
                                 preprocess_transform=preprocess.get_input_preprocess(
                                     normalize=CONFIG['normalization']
                                 ),
                                 augmentation=aug.get_training_augmentation(),
                                 **CONFIG
                                 )
val_dataset = coco.CocoDataset(dataset_dir=base_dir,
                               subset='val',
                               year=2017,
                               auto_download=True,
                               preprocess_transform=preprocess.get_input_preprocess(
                                   normalize=CONFIG['normalization']
                               ),
                               **CONFIG
                               )

train_model(model,
            train_dataset=train_dataset,
            val_dataset=val_dataset,
            config=CONFIG,
            weights_path=None)
```
See ./notebooks/example_training_coco.ipynb.
- A logs folder with weights and scalars will appear outside src. Monitor training with the tensorboard tool:
$ tensorboard --logdir=logs
See an inference example in ./notebooks/example_inference_tf_onnx_trt_balloon.ipynb
The project suggests a straightforward way to optimize Mask-RCNN inference on the x86_64 architecture as well as on NVIDIA Jetson devices (AArch64). Here you do not need to fix a .uff graph and then optimize it with TensorRT: the optimization flow is based on a pure .onnx graph, with a single prepared .onnx graph modification function for TensorRT. You can inspect the optimization steps with python in example_tensorflow_to_onnx_tensorrt_balloon.ipynb.
Optimization steps:
- Initialize your model in inference mode and load its weights. This way, the model won't include unnecessary layers that are used only in training mode.
- Convert your tensorflow.keras model to .onnx with tf2onnx.
Inference with onnxruntime:
From this step on, you can use the generated .onnx graph for inference with onnxruntime and onnxruntime-gpu.
Inference with TensorRT:
- Modify the .onnx graph made in step 2 by including TensorRT-implemented Mask-RCNN layers with the onnx-graphsurgeon library. This step is implemented in the modify_onnx_model function.
- Use TensorRT optimization on the modified .onnx graph to prepare a TensorRT engine:
- Get your TensorRT path: <TENSORRT_PATH>
- Make sure that the following paths are in ~/.bashrc:
export LD_LIBRARY_PATH=<TENSORRT_PATH>/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export LD_LIBRARY_PATH=<TENSORRT_PATH>/targets/x86_64-linux-gnu/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
For example:
export LD_LIBRARY_PATH=/home/user/TensorRT-7.2.3.4/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export LD_LIBRARY_PATH=/home/user/TensorRT-7.2.3.4/targets/x86_64-linux-gnu/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
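A quick stdlib-only way to check that the current environment actually contains those entries (the TensorRT path in the example is a placeholder):

```python
import os

def lib_dir_on_path(lib_dir, env=None):
    """Return True if lib_dir is listed in LD_LIBRARY_PATH."""
    env = os.environ if env is None else env
    entries = env.get('LD_LIBRARY_PATH', '').split(os.pathsep)
    return os.path.normpath(lib_dir) in {os.path.normpath(e) for e in entries if e}
```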
- Several Mask-RCNN layers in TensorRT are implemented as special plugins. One of them is proposalLayerPlugin, which contains general parameters to be changed. In this repo those parameters are placed in src/common/config.py. Thus, to configure the special Mask-RCNN layers in TensorRT, you need to rebuild nvinfer_plugin with an updated config.
# Clone TensorRT OSS
$ git clone https://github.com/NVIDIA/TensorRT.git
# Set your TensorRT version by switching the branch. Here is an example for 7.2
$ cd TensorRT/ && git checkout release/7.2 && git pull
$ git submodule update --init --recursive
$ mkdir -p build && cd build
- Open the header TensorRT/plugin/proposalLayerPlugin/mrcnn_config.h and change the Mask-RCNN config according to the trained model configuration stored in src/common/config.py.
- Look up your GPU compute capability (see NVIDIA's GPU Compute Capability table). For example, the NVIDIA GeForce RTX 2060 has compute capability 7.5, hence DGPU_ARCHS=75.
- Build nvinfer_plugin:
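The -DGPU_ARCHS value is simply the compute capability with the dot dropped; a tiny helper makes the mapping explicit:

```python
def gpu_archs(compute_capability):
    """Map a compute capability string like '7.5' to the DGPU_ARCHS value '75'."""
    major, minor = compute_capability.split('.')
    return major + minor
```

For instance, gpu_archs('7.2') gives '72', the value used in the Jetson AGX Xavier build.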
$ cmake .. -DGPU_ARCHS=75 -DTRT_LIB_DIR=<TENSORRT_PATH>/lib -DTRT_OUT_DIR=`pwd`/out -DCMAKE_C_COMPILER=/usr/bin/gcc
$ make nvinfer_plugin -j$(nproc)
- Copy the libnvinfer_plugin.so.x.y.z output to the TensorRT library folder. Don't forget to back up the original build:
$ mkdir ~/backups
$ sudo mv <TensorRT>/lib/libnvinfer_plugin.so.7.2.3 ~/backups/libnvinfer_plugin.so.7.2.3.bak
$ sudo cp libnvinfer_plugin.so.7.2.3 <TensorRT>/lib/libnvinfer_plugin.so.7.2.3
# Update links
$ sudo ldconfig
# Check that links exist
$ ldconfig -p | grep libnvinfer
- Generate a TensorRT engine in the terminal with trtexec:
- By default, the terminal does not recognize the trtexec command. You can add the path to trtexec to ~/.bashrc with an alias:
alias trtexec='<TENSORRT_PATH>/bin/trtexec'
For a successful run of example_tensorflow_to_onnx_tensorrt_balloon.ipynb, also add TRTEXEC to ~/.bashrc:
export TRTEXEC='<TENSORRT_PATH>/bin/trtexec'
- Update ~/.bashrc:
$ source ~/.bashrc
- Run trtexec:
- fp32:
trtexec --onnx=<PATH_TO_ONNX_GRAPH> --saveEngine=<PATH_TO_TRT_ENGINE> --workspace=<WORKSPACE_SIZE> --verbose
- fp16:
trtexec --onnx=<PATH_TO_ONNX_GRAPH> --saveEngine=<PATH_TO_TRT_ENGINE> --fp16 --workspace=<WORKSPACE_SIZE> --verbose
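The fp32 and fp16 invocations differ only in the precision flag; a small helper (file names are placeholders) can assemble either command line before handing it to a shell or subprocess:

```python
def trtexec_cmd(onnx_path, engine_path, workspace_mb, fp16=False):
    """Build the trtexec argument list for engine generation."""
    cmd = ['trtexec',
           f'--onnx={onnx_path}',
           f'--saveEngine={engine_path}',
           f'--workspace={workspace_mb}',
           '--verbose']
    if fp16:
        cmd.insert(3, '--fp16')  # match the fp16 example: flag before --workspace
    return cmd
```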
This NVIDIA doc about TensorRT OSS on Jetson was very helpful for the Jetson part of this manual:
- Update cmake on Jetson Ubuntu 18.04 OS:
$ sudo apt remove --purge --auto-remove cmake
$ wget https://github.com/Kitware/CMake/releases/download/v3.13.5/cmake-3.13.5.tar.gz
$ tar xvf cmake-3.13.5.tar.gz
$ cd cmake-3.13.5/
$ ./configure
$ make -j$(nproc)
$ sudo make install
$ sudo ln -s /usr/local/bin/cmake /usr/bin/cmake
- Clone TensorRT repository to build necessary Mask-RCNN plugins for custom layers:
$ git clone https://github.com/NVIDIA/TensorRT.git
$ cd TensorRT/ && git checkout release/7.1 && git pull
$ git submodule update --init --recursive
$ export TRT_SOURCE=`pwd`
$ mkdir -p build && cd build
- Open the header TensorRT/plugin/proposalLayerPlugin/mrcnn_config.h and change the Mask-RCNN config according to the trained model configuration stored in src/common/config.py.
- Build nvinfer_plugin:
$ /usr/local/bin/cmake .. -DGPU_ARCHS=72 -DTRT_LIB_DIR=/usr/lib/aarch64-linux-gnu/ -DCMAKE_C_COMPILER=/usr/bin/gcc -DTRT_BIN_DIR=`pwd`/out
$ make nvinfer_plugin -j$(nproc)
- Copy the libnvinfer_plugin.so.7.1.3 output to the library folder. Don't forget to back up the original build:
$ mkdir ~/backups
$ sudo mv /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7.1.3 ~/backups/libnvinfer_plugin.so.7.1.3.bak
$ sudo cp libnvinfer_plugin.so.7.1.3 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7.1.3
# Update links
$ sudo ldconfig
# Check that links exist
$ ldconfig -p | grep libnvinfer
- Generate a TensorRT engine in the terminal with trtexec:
- If trtexec is not recognized in the terminal, you can add its path to ~/.bashrc with an alias:
alias trtexec='<TENSORRT_PATH>/bin/trtexec'
- Update ~/.bashrc:
$ source ~/.bashrc
- Run trtexec:
fp32:
trtexec --onnx=<PATH_TO_ONNX_GRAPH> --saveEngine=<PATH_TO_TRT_ENGINE> --workspace=<WORKSPACE_SIZE> --verbose
fp16:
trtexec --onnx=<PATH_TO_ONNX_GRAPH> --saveEngine=<PATH_TO_TRT_ENGINE> --fp16 --workspace=<WORKSPACE_SIZE> --verbose
- See inference optimization examples in ./src/notebooks/example_tensorflow_to_onnx_tensorrt_balloon.ipynb
Profiling was done with the trtexec TensorRT tool with the default maxBatch (1). For the tests we took a Mask-RCNN model with 2 classes, including background. Note that for comparison we used the original Mask-RCNN model with ResNet101 and a (1, 3, 1024, 1024) input shape, and the updated tensorflow v2 Mask-RCNN model with all supported backbones and (1, 1024, 1024, 3) and (1, 512, 512, 3) input shapes.
For the original matterport Mask-RCNN we followed the steps suggested in sampleUffMaskRCNN of the official TensorRT github repository. The trtexec command for this model:
$ trtexec --uff=mask_rcnn_resnet101_nchw.uff --uffInput=input_image,3,1024,1024 --output=mrcnn_detection,mrcnn_mask/Sigmoid --workspace=4096 --verbose
The trtexec commands for the tensorflow v2 Mask-RCNN:
$ trtexec --onnx=maskrcnn_<BACKBONE_NAME>_<WIDTH>_<HEIGHT>_3_trt_mod.onnx --workspace=4096 --verbose
$ trtexec --onnx=maskrcnn_<BACKBONE_NAME>_<WIDTH>_<HEIGHT>_3_trt_mod.onnx --tacticSources=-cublasLt,+cublas --workspace=4096 --verbose
RTX2060:
Model | Backbone | Precision | Mean GPU compute, ms | Mean Host latency, ms | Input shape | Total params |
---|---|---|---|---|---|---|
original Mask-RCNN | ResNet101 | fp32 | 166.032 | 167.335 | (1, 3, 1024, 1024) | 64,158,584 |
original Mask-RCNN | ResNet101 | fp16 | 50.594 | 51.6662 | (1, 3, 1024, 1024) | 64,158,584 |
Mask-RCNN | ResNet18 | fp32 | 125.903 | 127.226 | (1, 1024, 1024, 3) | 32,571,861 |
Mask-RCNN | ResNet18 | fp16 | 46.6753 | 47.7547 | (1, 1024, 1024, 3) | 32,571,861 |
Mask-RCNN | ResNet34 | fp32 | 126.272 | 127.678 | (1, 1024, 1024, 3) | 42,687,445 |
Mask-RCNN | ResNet34 | fp16 | 49.6903 | 50.7716 | (1, 1024, 1024, 3) | 42,687,445 |
Mask-RCNN | ResNet50 | fp32 | 150.751 | 152.056 | (1, 1024, 1024, 3) | 45,668,309 |
Mask-RCNN | ResNet50 | fp16 | 54.0631 | 55.1411 | (1, 1024, 1024, 3) | 45,668,309 |
Mask-RCNN | ResNet101 | fp32 | 186.64 | 187.973 | (1, 1024, 1024, 3) | 64,712,661 |
Mask-RCNN | ResNet101 | fp16 | 58.0508 | 59.1242 | (1, 1024, 1024, 3) | 64,712,661 |
Mask-RCNN | MobileNet | fp32 | 115.363 | 116.763 | (1, 1024, 1024, 3) | 24,859,596 |
Mask-RCNN | MobileNet | fp16 | 40.6769 | 41.7582 | (1, 1024, 1024, 3) | 24,859,596 |
Mask-RCNN | MobileNetV2 | fp32 | 114.119 | 115.486 | (1, 1024, 1024, 3) | 23,958,348 |
Mask-RCNN | MobileNetV2 | fp16 | 43.8202 | 44.9006 | (1, 1024, 1024, 3) | 23,958,348 |
Mask-RCNN | EfficientNetB0 | fp32 | 138.189 | 139.534 | (1, 1024, 1024, 3) | 25,786,792 |
Mask-RCNN | EfficientNetB0 | fp16 | 56.5004 | 57.5949 | (1, 1024, 1024, 3) | 25,786,792 |
Mask-RCNN | EfficientNetB1 | fp32 | 134.059 | 135.417 | (1, 1024, 1024, 3) | 28,312,460 |
Mask-RCNN | EfficientNetB1 | fp16 | 60.3303 | 61.4217 | (1, 1024, 1024, 3) | 28,312,460 |
Mask-RCNN | EfficientNetB2 | fp32 | 135.788 | 137.12 | (1, 1024, 1024, 3) | 29,563,134 |
Mask-RCNN | EfficientNetB2 | fp16 | 64.0362 | 65.1281 | (1, 1024, 1024, 3) | 29,563,134 |
Mask-RCNN | EfficientNetB3 | fp32 | | | (1, 1024, 1024, 3) | 32,647,732 |
Mask-RCNN | EfficientNetB3 | fp16 | | | (1, 1024, 1024, 3) | 32,647,732 |
Mask-RCNN | ResNet18 | fp32 | 53.5696 | 53.9976 | (1, 512, 512, 3) | 31,786,197 |
Mask-RCNN | ResNet18 | fp16 | 19.6023 | 19.941 | (1, 512, 512, 3) | 31,786,197 |
Mask-RCNN | ResNet34 | fp32 | 59.9331 | 60.4002 | (1, 512, 512, 3) | 41,901,781 |
Mask-RCNN | ResNet34 | fp16 | 23.7166 | 24.063 | (1, 512, 512, 3) | 41,901,781 |
Mask-RCNN | ResNet50 | fp32 | 65.8216 | 66.2745 | (1, 512, 512, 3) | 44,882,645 |
Mask-RCNN | ResNet50 | fp16 | 25.6267 | 26.0099 | (1, 512, 512, 3) | 44,882,645 |
Mask-RCNN | ResNet101 | fp32 | 77.0433 | 77.48 | (1, 512, 512, 3) | 63,926,997 |
Mask-RCNN | ResNet101 | fp16 | 28.1458 | 28.498 | (1, 512, 512, 3) | 63,926,997 |
Mask-RCNN | MobileNet | fp32 | 52.2146 | 52.6336 | (1, 512, 512, 3) | 24,073,932 |
Mask-RCNN | MobileNet | fp16 | 19.5832 | 19.9254 | (1, 512, 512, 3) | 24,073,932 |
Mask-RCNN | MobileNetV2 | fp32 | 52.5706 | 53.0006 | (1, 512, 512, 3) | 23,172,684 |
Mask-RCNN | MobileNetV2 | fp16 | 21.9402 | 22.2757 | (1, 512, 512, 3) | 23,172,684 |
Mask-RCNN | EfficientNetB0 | fp32 | 57.0875 | 57.5132 | (1, 512, 512, 3) | 25,001,128 |
Mask-RCNN | EfficientNetB0 | fp16 | 24.5434 | 24.8687 | (1, 512, 512, 3) | 25,001,128 |
Mask-RCNN | EfficientNetB1 | fp32 | 59.3512 | 59.7616 | (1, 512, 512, 3) | 27,526,796 |
Mask-RCNN | EfficientNetB1 | fp16 | 22.6646 | 23.0058 | (1, 512, 512, 3) | 27,526,796 |
Mask-RCNN | EfficientNetB2 | fp32 | 67.8534 | 68.2614 | (1, 512, 512, 3) | 28,777,470 |
Mask-RCNN | EfficientNetB2 | fp16 | 31.5452 | 31.8778 | (1, 512, 512, 3) | 28,777,470 |
Mask-RCNN | EfficientNetB3 | fp32 | 68.9046 | 69.3455 | (1, 512, 512, 3) | 31,862,068 |
Mask-RCNN | EfficientNetB3 | fp16 | 34.7724 | 35.0879 | (1, 512, 512, 3) | 31,862,068 |
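As the RTX 2060 table shows, the fp16 engines are roughly 2.7-3.3x faster than their fp32 counterparts; for example, taking a few mean GPU compute times from the rows above:

```python
# Mean GPU compute times in ms, copied from the RTX 2060 table.
fp32_ms = {'original ResNet101 (1024)': 166.032,
           'ResNet101 (1024)': 186.64,
           'MobileNet (512)': 52.2146}
fp16_ms = {'original ResNet101 (1024)': 50.594,
           'ResNet101 (1024)': 58.0508,
           'MobileNet (512)': 19.5832}

# fp32 / fp16 speedup per configuration
speedup = {name: round(fp32_ms[name] / fp16_ms[name], 2) for name in fp32_ms}
```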
Jetson AGX Xavier:
Model | Backbone | Precision | Mean GPU compute, ms | Mean Host latency, ms | Input shape | Total params |
---|---|---|---|---|---|---|
original Mask-RCNN | ResNet101 | fp32 | 429.839 | 430.213 | (1, 3, 1024, 1024) | 64,158,584 |
original Mask-RCNN | ResNet101 | fp16 | 132.519 | 132.902 | (1, 3, 1024, 1024) | 64,158,584 |
Mask-RCNN | ResNet18 | fp32 | 301.87 | 302.241 | (1, 1024, 1024, 3) | 32,571,861 |
Mask-RCNN | ResNet18 | fp16 | 120.743 | 121.131 | (1, 1024, 1024, 3) | 32,571,861 |
Mask-RCNN | ResNet34 | fp32 | 326.506 | 326.893 | (1, 1024, 1024, 3) | 42,687,445 |
Mask-RCNN | ResNet34 | fp16 | 122.724 | 123.11 | (1, 1024, 1024, 3) | 42,687,445 |
Mask-RCNN | ResNet50 | fp32 | 375.936 | 376.317 | (1, 1024, 1024, 3) | 45,668,309 |
Mask-RCNN | ResNet50 | fp16 | 130.978 | 131.368 | (1, 1024, 1024, 3) | 45,668,309 |
Mask-RCNN | ResNet101 | fp32 | 470.027 | 470.423 | (1, 1024, 1024, 3) | 64,712,661 |
Mask-RCNN | ResNet101 | fp16 | 158.226 | 158.623 | (1, 1024, 1024, 3) | 64,712,661 |
Mask-RCNN | MobileNet | fp32 | 291.818 | 292.217 | (1, 1024, 1024, 3) | 24,859,596 |
Mask-RCNN | MobileNet | fp16 | 108.538 | 108.926 | (1, 1024, 1024, 3) | 24,859,596 |
Mask-RCNN | MobileNetV2 | fp32 | 285.315 | 285.688 | (1, 1024, 1024, 3) | 23,958,348 |
Mask-RCNN | MobileNetV2 | fp16 | 115.311 | 115.706 | (1, 1024, 1024, 3) | 23,958,348 |
Mask-RCNN | EfficientNetB0 | fp32 | 320.68 | 321.056 | (1, 1024, 1024, 3) | 25,786,792 |
Mask-RCNN | EfficientNetB0 | fp16 | 145.32 | 145.709 | (1, 1024, 1024, 3) | 25,786,792 |
Mask-RCNN | EfficientNetB1 | fp32 | 339.343 | 339.724 | (1, 1024, 1024, 3) | 28,312,460 |
Mask-RCNN | EfficientNetB1 | fp16 | 154.464 | 154.837 | (1, 1024, 1024, 3) | 28,312,460 |
Mask-RCNN | EfficientNetB2 | fp32 | 344.166 | 344.554 | (1, 1024, 1024, 3) | 29,563,134 |
Mask-RCNN | EfficientNetB2 | fp16 | 156.596 | 156.982 | (1, 1024, 1024, 3) | 29,563,134 |
Mask-RCNN | EfficientNetB3 | fp32 | | | (1, 1024, 1024, 3) | 32,647,732 |
Mask-RCNN | EfficientNetB3 | fp16 | | | (1, 1024, 1024, 3) | 32,647,732 |
Mask-RCNN | ResNet18 | fp32 | 147.313 | 147.43 | (1, 512, 512, 3) | 31,786,197 |
Mask-RCNN | ResNet18 | fp16 | 55.0673 | 55.1861 | (1, 512, 512, 3) | 31,786,197 |
Mask-RCNN | ResNet34 | fp32 | 160.904 | 161.024 | (1, 512, 512, 3) | 41,901,781 |
Mask-RCNN | ResNet34 | fp16 | 62.6873 | 62.8085 | (1, 512, 512, 3) | 41,901,781 |
Mask-RCNN | ResNet50 | fp32 | 176.807 | 176.925 | (1, 512, 512, 3) | 44,882,645 |
Mask-RCNN | ResNet50 | fp16 | 68.0678 | 68.1877 | (1, 512, 512, 3) | 44,882,645 |
Mask-RCNN | ResNet101 | fp32 | 200.177 | 200.301 | (1, 512, 512, 3) | 63,926,997 |
Mask-RCNN | ResNet101 | fp16 | 73.7332 | 73.8529 | (1, 512, 512, 3) | 63,926,997 |
Mask-RCNN | MobileNet | fp32 | 143.371 | 143.492 | (1, 512, 512, 3) | 24,073,932 |
Mask-RCNN | MobileNet | fp16 | 52.5975 | 52.7168 | (1, 512, 512, 3) | 24,073,932 |
Mask-RCNN | MobileNetV2 | fp32 | 143.504 | 143.623 | (1, 512, 512, 3) | 23,172,684 |
Mask-RCNN | MobileNetV2 | fp16 | 54.7317 | 54.85 | (1, 512, 512, 3) | 23,172,684 |
Mask-RCNN | EfficientNetB0 | fp32 | 157.063 | 157.185 | (1, 512, 512, 3) | 25,001,128 |
Mask-RCNN | EfficientNetB0 | fp16 | 66.0013 | 66.1224 | (1, 512, 512, 3) | 25,001,128 |
Mask-RCNN | EfficientNetB1 | fp32 | 158.944 | 159.064 | (1, 512, 512, 3) | 27,526,796 |
Mask-RCNN | EfficientNetB1 | fp16 | 65.623 | 65.7444 | (1, 512, 512, 3) | 27,526,796 |
Mask-RCNN | EfficientNetB2 | fp32 | 175.904 | 176.023 | (1, 512, 512, 3) | 28,777,470 |
Mask-RCNN | EfficientNetB2 | fp16 | 82.7281 | 82.8464 | (1, 512, 512, 3) | 28,777,470 |
Mask-RCNN | EfficientNetB3 | fp32 | 184.948 | 185.083 | (1, 512, 512, 3) | 31,862,068 |
Mask-RCNN | EfficientNetB3 | fp16 | 83.1854 | 83.3059 | (1, 512, 512, 3) | 31,862,068 |
- TRT-models profiling;
- NCHW support;
- Mixed precision training;
- Pruning options;
- Flexible backbones configuration;
- Update inference speed test tables;
- MS COCO weights;
- Tensorflow v2.6 support;
Alexander Popkov: @alexander-pv
Feel free to write to me about repo issues and ideas for updates.