Understanding-and-Diagnosing-Visual-Tracking-Systems

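An ICCV 2015 paper that focuses on short-term, single-object, model-free tracking. Its main question is whether the experiments currently run on benchmarks are enough to tell a good tracker from a bad one.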

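short-term: once the target is lost, the tracker does not keep tracking it (no re-detection).
single-object: self-explanatory.
model-free: the only training example is the one given in the first frame.
For these definitions, see the VOT Challenge 2014 paper.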

The authors break the tracking process into five components: Motion Model, Feature Extractor, Observation Model, Model Updater, and Ensemble Post-processor.
Pipeline
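To make the decomposition concrete, here is a minimal sketch of how the first four components could be wired together for one frame. The `Tracker` class, its field names, and the `(x, y, w, h)` box convention are all assumptions made for illustration, not the paper's code. The fifth component, the ensemble post-processor, sits outside this loop and combines the outputs of several such trackers.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h); hypothetical convention


@dataclass
class Tracker:
    """One tracker assembled from four pluggable components (illustrative only)."""
    propose: Callable[[Box], List[Box]]                 # motion model
    extract: Callable[[object, Box], Sequence[float]]   # feature extractor
    score: Callable[[Sequence[float]], float]           # observation model
    update: Callable[[object, Box, float], None]        # model updater

    def step(self, frame: object, prev_box: Box) -> Tuple[Box, float]:
        # 1. The motion model proposes candidate boxes around the last position.
        candidates = self.propose(prev_box)
        # 2-3. Each candidate is described by features and scored.
        scored = [(self.score(self.extract(frame, c)), c) for c in candidates]
        best_score, best_box = max(scored, key=lambda sc: sc[0])
        # 4. The model updater decides whether and how to retrain.
        self.update(frame, best_box, best_score)
        return best_box, best_score
```

Swapping one callable while keeping the other three fixed is the kind of controlled comparison the rest of the post summarizes.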

The comparison experiments are evaluated with AUC (of the success plot) and the precision plot.
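As a rough illustration of these two metrics, the sketch below computes them in the way tracking benchmarks usually define them: the success plot is the fraction of frames whose overlap (IoU) exceeds a threshold, AUC averages that fraction over thresholds, and precision is the fraction of frames whose center error is within a pixel threshold. The function names, the 21 evenly spaced thresholds, and the 20-pixel default are assumptions for this sketch.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h), top-left corner (assumed)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(a: Box, b: Box) -> float:
    """Euclidean distance between box centers, in pixels."""
    return math.hypot((a[0] + a[2] / 2) - (b[0] + b[2] / 2),
                      (a[1] + a[3] / 2) - (b[1] + b[3] / 2))


def success_auc(pred: List[Box], gt: List[Box], steps: int = 21) -> float:
    """Average success rate over overlap thresholds 0, 0.05, ..., 1."""
    overlaps = [iou(p, g) for p, g in zip(pred, gt)]
    thresholds = [i / (steps - 1) for i in range(steps)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)


def precision_at(pred: List[Box], gt: List[Box], threshold: float = 20.0) -> float:
    """Fraction of frames whose center error is within `threshold` pixels."""
    errors = [center_error(p, g) for p, g in zip(pred, gt)]
    return sum(e <= threshold for e in errors) / len(errors)
```

Note that with a strict `>` comparison even a perfect tracker scores 20/21 rather than 1.0 here, because no overlap exceeds the final threshold of 1.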

Validation and Analysis

For each of the five components above, the paper compares the commonly used choices and analyzes the results.

Feature Extractor

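The choice of features also makes a large difference to the results. The paper compares five commonly used features and notes that CNN features can be used as well: they work very well, but tracking becomes slow.
The authors' summary: The feature extractor is the most important component of a tracker. Using proper features can dramatically improve the tracking performance. Developing a good and effective feature representation for tracking is still an open problem.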
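To give a feel for why the feature choice matters, the sketch below contrasts a raw-pixel feature with a heavily simplified gradient-orientation histogram. This is not a full HOG implementation (no cells, blocks, or interpolation) and is not taken from the paper; the function names and the 8-bin default are assumptions.

```python
import math
from typing import List

Patch = List[List[float]]  # grayscale patch as rows of intensities (assumed format)


def raw_pixel_feature(patch: Patch) -> List[float]:
    """Flatten the patch; sensitive to any change in brightness."""
    return [v for row in patch for v in row]


def orientation_histogram(patch: Patch, bins: int = 8) -> List[float]:
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    hist = [0.0] * bins
    h, w = len(patch), len(patch[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]  # central differences
            gy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % math.pi    # fold into [0, pi)
            hist[min(int(angle / math.pi * bins), bins - 1)] += mag
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else hist
```

Adding a constant to every pixel changes the raw-pixel feature but leaves the histogram untouched, which is one simple sense in which a gradient-based feature is "stronger".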

Observation Model

Earlier in the paper, the authors divide observation models into generative and discriminative models. Representative generative approaches are PCA, sparse coding, and dictionary learning; representative discriminative approaches (classifiers) are boosting, structured output SVM, and deep learning.
The paper says most trackers use a discriminative model, so only that kind is analyzed. It compares logistic regression, ridge regression, SVM, and structured output SVM.
The comparison shows that the effect of the observation model depends heavily on the features: results differ considerably between raw pixels and HOG features.
The authors' summary: Different observation models indeed affect the performance when the features are weak. However, the performance gaps diminish when the features are strong enough. Consequently, satisfactory results can be obtained even using simple classifiers from textbooks.
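As an example of a "simple classifier from textbooks", here is a minimal logistic-regression observation model trained with plain stochastic gradient descent. The class name, learning rate, L2 weight, and epoch count are assumptions chosen for the sketch, not the paper's settings.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class LogisticObservationModel:
    """Scores a feature vector with the probability that it shows the target."""

    def __init__(self, dim: int, lr: float = 0.1, l2: float = 1e-3) -> None:
        self.w = [0.0] * dim
        self.b = 0.0
        self.lr = lr
        self.l2 = l2

    def score(self, x: Sequence[float]) -> float:
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def fit(self, xs: List[Sequence[float]], ys: List[int], epochs: int = 200) -> None:
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                err = self.score(x) - y  # gradient of the log loss w.r.t. the logit
                self.w = [wi - self.lr * (err * xi + self.l2 * wi)
                          for wi, xi in zip(self.w, x)]
                self.b -= self.lr * err
```

In a tracker, positives would come from the target box and negatives from the surrounding background, and `score` would rank the candidates produced by the motion model.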

Motion Model

The motion model generates a set of candidates. The paper compares a particle filter, a sliding window, and a radius sliding window; the particle filter performs best, but the gap is not large. It also looks at fast motion and scale variation, and finds that resizing the video to a fixed size improves the results.
The authors' conclusion: When compared to the feature extractor and observation model components, in general the motion model only has minor effects on the performance. However, under scale variation and fast motion, setting the parameters properly is still crucial to obtaining good performance. Furthermore, for some specific scenarios such as egocentric video, it is beneficial to design the motion model carefully. Due to its ability to adapt to scale changes which are not uncommon in practice, we will still take the particle filter approach with resized input as the default motion model in the sequel.
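The three candidate-generation schemes can be sketched roughly as follows. This assumes the sliding window scans a square neighbourhood and the radius variant a circular one; the function names, the `(cx, cy, w, h)` convention, and all parameter values are hypothetical.

```python
import math
import random
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (cx, cy, w, h); assumed convention


def particle_candidates(prev: Box, n: int = 100, sigma_xy: float = 4.0,
                        sigma_scale: float = 0.02,
                        rng: Optional[random.Random] = None) -> List[Box]:
    """Particle-filter-style proposals: Gaussian jitter in position and scale."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    cx, cy, w, h = prev
    out = []
    for _ in range(n):
        s = math.exp(rng.gauss(0.0, sigma_scale))  # multiplicative scale change
        out.append((cx + rng.gauss(0.0, sigma_xy),
                    cy + rng.gauss(0.0, sigma_xy), w * s, h * s))
    return out


def sliding_window_candidates(prev: Box, radius: int = 8,
                              stride: int = 2) -> List[Box]:
    """Exhaustive search over a square neighbourhood at a fixed box size."""
    cx, cy, w, h = prev
    steps = range(-radius, radius + 1, stride)
    return [(cx + dx, cy + dy, w, h) for dx in steps for dy in steps]


def radius_window_candidates(prev: Box, radius: int = 8,
                             stride: int = 2) -> List[Box]:
    """Same grid as above, restricted to a circle around the last position."""
    return [c for c in sliding_window_candidates(prev, radius, stride)
            if (c[0] - prev[0]) ** 2 + (c[1] - prev[1]) ** 2 <= radius ** 2]
```

Only the particle-style proposals vary the box size, which matches the quoted remark about adapting to scale changes. Parameters such as `sigma_xy` and `radius` are in pixels, so their effect depends on the video resolution; that is one plausible reading of why resizing the input to a fixed size helps.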

Model Updater

I did not fully understand the rationale behind the experimental design in this part; the paper compares two different update strategies.
The authors' summary: Although implementation of the model updater is often treated as engineering tricks in papers especially for discriminative trackers, their impact on performance is usually very significant and hence is worth studying. Unfortunately, very few work focuses on this component.
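These notes do not spell out the two strategies, so the sketch below only illustrates the general idea with two threshold-based rules of the kind often used in discriminative trackers: update when the target's confidence is low, or update when the target's score is not clearly separated from the background. Both functions, their names, and the threshold values are assumptions, not the paper's exact rules.

```python
from typing import Sequence


def update_on_low_confidence(target_score: float, threshold: float = 0.7) -> bool:
    """Rule A (assumed): retrain when the target's own score drops below a threshold."""
    return target_score < threshold


def update_on_small_margin(target_score: float,
                           background_scores: Sequence[float],
                           margin: float = 0.3) -> bool:
    """Rule B (assumed): retrain when the target barely beats the best background sample."""
    return target_score - max(background_scores) < margin
```

Rule A looks only at the target, while rule B also reacts when a distractor in the background starts to score almost as high as the target.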

Ensemble Post-processor

The authors' summary: The ensemble post-processor can improve the performance substantially especially when the trackers have high diversity. This component is universal and effective yet it is least explored.
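As a deliberately simple stand-in for an ensemble post-processor (not the ensemble methods evaluated in the paper), the sketch below fuses the boxes reported by several trackers for one frame by taking the coordinate-wise median. The function name and box layout are assumptions.

```python
from statistics import median
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h); assumed convention


def median_box(boxes: List[Box]) -> Box:
    """Coordinate-wise median of the boxes from several trackers for one frame."""
    return tuple(median(b[i] for b in boxes) for i in range(4))
```

With three trackers where one has drifted away, the median follows the two that still agree. This hints at why diversity matters: the fusion only helps when the trackers do not all fail in the same way.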

Conclusion

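The Feature Extractor matters most, followed by the Observation Model. The Model Updater also affects the results, but most current algorithms pay little attention to improving it. The Ensemble Post-processor is broadly applicable, yet few people have looked into it. As for the Motion Model, suitable adjustments to it can also give good results.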


http://yoursite.com/2016/06/13/Understanding-and-Diagnosing-Visual-Tracking-Systems/
Author: John Doe
Posted on: June 13, 2016