ORB for real-time tracking on an ARM processor

Feature detection and matching are at the heart of many computer vision algorithms. In particular, they are used for object recognition, structure from motion, image stitching and so on. SIFT and SURF are the two popular algorithms in this space. However, both are patented, which makes them unattractive to many users. More importantly, both algorithms are computationally very intensive and too slow for real-time use.

ORB was proposed as an alternative to these two algorithms. The original paper (Ethan Rublee, Vincent Rabaud, Kurt Konolige, Gary R. Bradski: ORB: An efficient alternative to SIFT or SURF. ICCV 2011: 2564-2571) shows that it is an order of magnitude faster than SURF and two orders of magnitude faster than SIFT. ORB is rotation invariant, but only partially scale invariant. We decided to test ORB for feature detection and tracking. The objective was to use it for tracking objects in video on a Cortex-A9 ARM processor, and to see whether ORB can serve as an alternative to, say, Lucas-Kanade optical flow.

ORB descriptor creation and matching, despite being significantly faster than SIFT and SURF, are not fast enough to meet the real-time requirement for tracking on a Cortex-A9. Naturally, we were excited at the prospect of making ORB run in real time on our OMAP4 PandaBoard.

Though the OMAP4 has a dual-core 1 GHz Cortex-A9 processor, we ran our application in a single thread for these experiments so that only one core was used (using both cores can make the results confusing).

We used only a single scale for ORB, since tracking in video does not require multiple scales. The keypoint detector is FAST9, as in the original paper. We had already optimized FAST9 and written about it in a previous blog post.
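As background on the matching step: ORB descriptors are 256-bit binary strings, and two descriptors are compared by their Hamming distance (the number of differing bits). The following is a minimal sketch of a brute-force matcher in plain Python, assuming each descriptor is packed into one integer. The function names are hypothetical, and this illustrates the technique only; it is not the UncannyCV or OpenCV implementation.

```python
import random


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two descriptors packed as ints."""
    return bin(a ^ b).count("1")


def match_brute_force(query, train):
    """For each query descriptor, find the nearest train descriptor.

    Returns a list of (query_index, train_index, distance) tuples.
    train_index is -1 if train is empty.
    """
    matches = []
    for qi, q in enumerate(query):
        best_idx, best_dist = -1, 257  # larger than any 256-bit distance
        for ti, t in enumerate(train):
            d = hamming(q, t)
            if d < best_dist:
                best_idx, best_dist = ti, d
        matches.append((qi, best_idx, best_dist))
    return matches


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical descriptors for one frame.
    frame_a = [rng.getrandbits(256) for _ in range(50)]
    # Simulate the next frame: same descriptors with one bit flipped each.
    frame_b = [d ^ (1 << rng.randrange(256)) for d in frame_a]
    for qi, ti, dist in match_brute_force(frame_a, frame_b)[:5]:
        print(qi, ti, dist)
```

With this approach the number of comparisons is the product of the two descriptor counts, so going from 500 to 1500 keypoints per frame means nine times as many Hamming distances to compute.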

Below are the results:

On a single Cortex-A9 core (1 GHz), for a single scale (since this is a tracking application) and 500 keypoints:

UncannyCV

FAST9 keypoint detection and ORB descriptor creation – 33ms

ORB descriptor matching – 5.5ms

Total time = 33 + 5.5 = 38.5ms

OpenCV

FAST9 keypoint detection and ORB descriptor creation – 116ms

ORB descriptor matching (already NEON-optimized by OpenCV) – 71ms

Total time = 116 + 71 = 187ms

For single scale and 1500 keypoints.

UncannyCV

FAST9 keypoint detection and ORB descriptor creation – 48.3ms

ORB descriptor matching – 58.5ms

Total time = 48.3 + 58.5 = 106.8ms

OpenCV

FAST9 keypoint detection and ORB descriptor creation – 148.3ms

ORB descriptor matching (already NEON-optimized by OpenCV) – 493.7ms

Total time = 148.3 + 493.7 = 642ms
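To put the totals in perspective, the short script below recomputes them and derives the speedup over OpenCV and the implied frame rate. It uses only the timings listed above. The frame-rate figure assumes that one detection-plus-description pass and one matching pass are needed per video frame, which is our reading of the tracking setup rather than a separately measured number.

```python
# Timings in milliseconds, copied from the results above:
# (keypoint detection + descriptor creation, descriptor matching)
RESULTS = {
    500: {"UncannyCV": (33.0, 5.5), "OpenCV": (116.0, 71.0)},
    1500: {"UncannyCV": (48.3, 58.5), "OpenCV": (148.3, 493.7)},
}


def total_ms(detect_describe_ms, match_ms):
    """Total time per frame pair in milliseconds."""
    return detect_describe_ms + match_ms


def summarize(results):
    """Return (keypoints, uncanny_ms, opencv_ms, speedup, fps) rows."""
    rows = []
    for keypoints, libs in results.items():
        ours = total_ms(*libs["UncannyCV"])
        theirs = total_ms(*libs["OpenCV"])
        # fps assumes one detect/describe pass and one match pass per frame.
        rows.append((keypoints, ours, theirs, theirs / ours, 1000.0 / ours))
    return rows


if __name__ == "__main__":
    for kp, ours, theirs, speedup, fps in summarize(RESULTS):
        print(f"{kp} keypoints: {ours:.1f} ms vs {theirs:.1f} ms, "
              f"speedup {speedup:.1f}x, about {fps:.1f} frames/s")
```

This works out to a speedup of roughly 4.9x at 500 keypoints (about 26 frames per second) and roughly 6.0x at 1500 keypoints (about 9 frames per second).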

The images we used for testing are two consecutive frames from the Daimler pedestrian detection dataset.
