[Read Paper] MaxpoolNMS: Getting Rid of NMS Bottlenecks in Two-Stage Object Detectors


MaxpoolNMS: Getting Rid of NMS Bottlenecks in Two-Stage Object Detectors

MaxpoolNMS is a parallelizable alternative to the NMS algorithm that is based on max-pooling classification score maps.

NMS

Non-maximum suppression (NMS) is an essential block in object detectors: it removes duplicate detections and thereby reduces false positives. Both the region proposal network and the object detection network employ NMS as a post-processing step.

System diagram of two-stage object detectors

When applied in the region proposal network, NMS first sorts all candidate detection boxes by their objectness scores. It then runs two nested loops that greedily select high-scoring boxes and delete the other boxes that overlap significantly with the selected ones. The inner loop is parallelizable, but the outer loop is inherently sequential.
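The greedy procedure described above can be sketched in plain Python. This is a minimal illustration, not the implementation from the paper: the function names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box format, and the default threshold are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.7):
    """Return the indices of the kept boxes, highest score first."""
    # Step 1: sort candidates by score, best first.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    suppressed = [False] * len(boxes)
    keep = []
    # Outer loop: sequential, because whether box i is kept depends on
    # the suppression decisions made for all higher-scoring boxes.
    for pos, i in enumerate(order):
        if suppressed[i]:
            continue
        keep.append(i)
        # Inner loop: each comparison is independent of the others,
        # so this part could run in parallel.
        for j in order[pos + 1:]:
            if not suppressed[j] and iou(boxes[i], boxes[j]) > iou_threshold:
                suppressed[j] = True
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores, iou_threshold=0.5))  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.68, so it is suppressed at a threshold of 0.5, while the third box does not overlap either and is kept.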

Execution time of convolution and GreedyNMS on different GPU platforms

Multi-scale max-pooling

Because the score maps are generated by anchors of different scales, it is natural to max-pool different score maps with different kernel sizes. For an anchor of height $h$ and width $w$, the kernel size and stride are:

$ksize_x = stride_x = \max(1, \mathrm{round}(\frac{\alpha w}{s}))$

$ksize_y = stride_y = \max(1, \mathrm{round}(\frac{\alpha h}{s}))$
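As a worked example of the formula, the sketch below computes the pooling parameters for a given anchor. The function name `pool_params` is hypothetical. The text does not define $\alpha$ and $s$, so the code assumes $\alpha$ is a scaling factor and $s$ is the stride (downsampling ratio) of the score map relative to the input image; the numeric values are chosen only for illustration.

```python
def pool_params(anchor_w, anchor_h, alpha, s):
    """Kernel size (equal to stride) along x and y for one anchor.

    alpha and s are assumed to be a scaling factor and the score-map
    stride; the source does not define them.
    Note: Python's round() rounds halves to the nearest even integer,
    which may differ from the rounding used in the paper.
    """
    k_x = max(1, round(alpha * anchor_w / s))
    k_y = max(1, round(alpha * anchor_h / s))
    return k_x, k_y


# A 128 x 256 (w x h) anchor: 0.25 * 128 / 16 = 2 and 0.25 * 256 / 16 = 4.
print(pool_params(128, 256, alpha=0.25, s=16))  # (2, 4)

# A small 32 x 32 anchor: 0.25 * 32 / 16 = 0.5 rounds to 0,
# so the max(1, ...) clamp keeps the kernel at least 1 x 1.
print(pool_params(32, 32, alpha=0.25, s=16))    # (1, 1)
```

The clamp to 1 matters for small anchors: without it, the kernel size could round to zero and the pooling would be undefined.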

Multi-Scale Max-Pooling

Multi-channel max-pooling

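An object can produce peaks on several neighboring score maps (adjacent scales or aspect ratios), so max-pooling each score map independently can still leave duplicate detections.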
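One way to picture pooling across channels is sketched below. It is an illustrative assumption, not the paper's exact procedure: the function name `multichannel_maxpool_peaks` is hypothetical, the score maps are plain nested lists, all channels in the group share one kernel size, and windows do not overlap (stride equal to kernel size, as in the formula above). Each window spans all channels in the group, so only the single strongest response per spatial window survives.

```python
def multichannel_maxpool_peaks(score_maps, ksize_x, ksize_y):
    """Keep one peak per spatial window, pooled jointly over all channels.

    score_maps: list of C score maps, each an H x W list of lists.
    Returns a list of (channel, y, x, score) tuples, one per window.
    """
    num_channels = len(score_maps)
    height = len(score_maps[0])
    width = len(score_maps[0][0])
    peaks = []
    # Non-overlapping windows: the stride equals the kernel size.
    for y0 in range(0, height, ksize_y):
        for x0 in range(0, width, ksize_x):
            best = None
            for c in range(num_channels):
                for y in range(y0, min(y0 + ksize_y, height)):
                    for x in range(x0, min(x0 + ksize_x, width)):
                        value = score_maps[c][y][x]
                        if best is None or value > best[3]:
                            best = (c, y, x, value)
            peaks.append(best)
    return peaks


# Two neighboring channels respond to the same object at position (0, 1).
channel_0 = [[0.1, 0.9], [0.2, 0.3]]
channel_1 = [[0.1, 0.8], [0.2, 0.3]]
print(multichannel_maxpool_peaks([channel_0, channel_1], 2, 2))
# [(0, 0, 1, 0.9)]
```

Pooling each channel separately would have returned two peaks here (0.9 and 0.8) for the same object; pooling across the channel group keeps only the stronger one.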

Multi-Channel Max-Pooling Across Aspect Ratios

Multi-Channel Max-Pooling Across Scales