High quality, fast, modular reference implementation of SSD in PyTorch 1.0

This repository implements SSD (Single Shot MultiBox Detector). The implementation is heavily influenced by the projects ssd.pytorch, pytorch-ssd and maskrcnn-benchmark. This repository aims to be a code base for research based on SSD.

Highlights

  • PyTorch 1.0
  • GPU/CPU NMS
  • Multi-GPU training and inference
  • Modular
  • Visualization (supports TensorBoard)

Installation

Requirements

  1. Python3
  2. PyTorch 1.0
  3. yacs
  4. GCC >= 4.9
  5. OpenCV

Build

# build nms
cd ext
python build.py build_ext develop 
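
For readers unfamiliar with the step that the compiled extension accelerates: non-maximum suppression (NMS) keeps the highest-scoring box and discards lower-scoring boxes that overlap it too much. The following is a minimal pure-Python sketch of greedy NMS for illustration only. It is not the code in ext, and the function names, the (x1, y1, x2, y2) box format and the 0.5 threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Illustrative boxes (made up): the second overlaps the first heavily.
example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
example_scores = [0.9, 0.8, 0.7]
kept = nms(example_boxes, example_scores)
```

In detection pipelines this is usually applied per class after low-confidence boxes have been filtered out.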

Train

Setting Up Datasets

Pascal VOC

For the Pascal VOC dataset, arrange the folders like this:

VOC_ROOT
|__ VOC2007
    |_ JPEGImages
    |_ Annotations
    |_ ImageSets
    |_ SegmentationClass
|__ VOC2012
    |_ JPEGImages
    |_ Annotations
    |_ ImageSets
    |_ SegmentationClass
|__ ...
 

VOC_ROOT defaults to the datasets folder in the current project. You can either create symlinks inside datasets or export VOC_ROOT="/path/to/voc_root".
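
Before starting a long training run, it can help to confirm that the layout above is in place. The sketch below is a hypothetical helper and not part of this repository. It assumes the root is read from the VOC_ROOT environment variable with datasets as the fallback, and it checks only the subfolders listed above.

```python
import os


def resolve_root(env_var, default="datasets"):
    """Return the dataset root from an environment variable, or the default."""
    return os.environ.get(env_var, default)


def missing_voc_dirs(root,
                     years=("VOC2007", "VOC2012"),
                     subdirs=("JPEGImages", "Annotations",
                              "ImageSets", "SegmentationClass")):
    """List the expected VOC subfolders that do not exist under root."""
    missing = []
    for year in years:
        for sub in subdirs:
            path = os.path.join(root, year, sub)
            if not os.path.isdir(path):
                missing.append(path)
    return missing


if __name__ == "__main__":
    voc_root = resolve_root("VOC_ROOT")
    for path in missing_voc_dirs(voc_root):
        print("missing:", path)
```

An empty result means every listed folder was found. It does not verify that the folders contain the expected images or annotation files.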

COCO

For the COCO dataset, arrange the folders like this:

COCO_ROOT
|__ annotations
    |_ instances_valminusminival2014.json
    |_ instances_minival2014.json
    |_ instances_train2014.json
    |_ instances_val2014.json
    |_ ...
|__ train2014
    |_ <im-1-name>.jpg
    |_ ...
    |_ <im-N-name>.jpg
|__ val2014
    |_ <im-1-name>.jpg
    |_ ...
    |_ <im-N-name>.jpg
|__ ...
 

COCO_ROOT defaults to the datasets folder in the current project. You can either create symlinks inside datasets or export COCO_ROOT="/path/to/coco_root".

Single GPU training

# for example, train SSD300:
python train_ssd.py --config-file configs/ssd300_voc0712.yaml --vgg vgg16_reducedfc.pth 

Multi-GPU training

# for example, train SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS train_ssd.py --config-file configs/ssd300_voc0712.yaml --vgg vgg16_reducedfc.pth

The configuration files that I provide assume training on a single GPU. When the number of GPUs changes, the hyper-parameters (lr, max_iter, ...) are adjusted accordingly, following the paper Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. The pre-trained VGG weights can be downloaded here: s3.amazonaws.com/amdegroot-m .
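
To illustrate the kind of adjustment the cited paper describes, here is a minimal sketch of its linear scaling rule: with N GPUs the effective batch size grows N times, so the learning rate is multiplied by N and the iteration count is divided by N. The function name and the base values are hypothetical, and the exact adjustments made by train_ssd.py may differ.

```python
def scale_for_gpus(base_lr, base_max_iter, num_gpus):
    """Apply the linear scaling rule to single-GPU hyper-parameters.

    Assumes each GPU keeps the single-GPU batch size, so the effective
    batch size is num_gpus times larger.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    scaled_lr = base_lr * num_gpus                # larger batch, larger step
    scaled_max_iter = base_max_iter // num_gpus   # same number of epochs
    return scaled_lr, scaled_max_iter


# Hypothetical single-GPU settings, scaled for 4 GPUs.
lr, max_iter = scale_for_gpus(base_lr=1e-3, base_max_iter=120000, num_gpus=4)
print(lr, max_iter)
```

The paper also recommends a warm-up phase at the start of training when the scaled learning rate is large. That is not shown here.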

Demo

Running prediction on the images in a folder is simple:

python demo.py --config-file configs/ssd300_voc0712.yaml --weights path/to/trained/weights.pth --images_dir demo 

The predicted images, with boxes, scores and label names, are then saved to the demo/result folder. Currently, I provide weights trained with ssd300_voc0712.yaml here: ssd300_voc0712_mAP77.83.pth (100 MB)

Performance

Original Paper:

VOC2007 test
SSD300* 77.2
SSD512* 79.8

Our Implementation:

VOC2007 test
SSD300* 77.8
SSD512* -

Details:

VOC2007 test
SSD300*
mAP: 0.7783
aeroplane       : 0.8252
bicycle         : 0.8445
bird            : 0.7597
boat            : 0.7102
bottle          : 0.5275
bus             : 0.8643
car             : 0.8660
cat             : 0.8741
chair           : 0.6179
cow             : 0.8279
diningtable     : 0.7862
dog             : 0.8519
horse           : 0.8630
motorbike       : 0.8515
person          : 0.8024
pottedplant     : 0.5079
sheep           : 0.7685
sofa            : 0.7926
train           : 0.8704
tvmonitor       : 0.7554 
SSD512*
-
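
As a cross-check, the SSD300* mAP should be the unweighted mean of the 20 per-class APs listed above. The sketch below copies those values from the table. Because each per-class AP is already rounded to four decimals, the recomputed mean can differ from the reported 0.7783 in the last digit.

```python
# Per-class AP values copied from the SSD300* details above.
per_class_ap = {
    "aeroplane": 0.8252, "bicycle": 0.8445, "bird": 0.7597,
    "boat": 0.7102, "bottle": 0.5275, "bus": 0.8643,
    "car": 0.8660, "cat": 0.8741, "chair": 0.6179,
    "cow": 0.8279, "diningtable": 0.7862, "dog": 0.8519,
    "horse": 0.8630, "motorbike": 0.8515, "person": 0.8024,
    "pottedplant": 0.5079, "sheep": 0.7685, "sofa": 0.7926,
    "train": 0.8704, "tvmonitor": 0.7554,
}

# mAP is the plain average over classes (no weighting by class frequency).
mean_ap = sum(per_class_ap.values()) / len(per_class_ap)
print(f"mAP over {len(per_class_ap)} classes: {mean_ap:.3f}")
```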