In this paper, we address the problem of heavy occlusion in visual object tracking. We decompose the tracking task into translation estimation and scale estimation. Hierarchical convolutional features are used to estimate the target position and to update the translation model, while HOG features are used for the scale filter. In addition, we evaluate the reliability of the translation estimate from the correlation response map produced by correlation detection, and we propose a new method that updates the model according to this reliability. Experiments on 28 benchmark sequences with significant scale variations show that the proposed algorithm performs favorably against state-of-the-art methods in terms of accuracy and robustness.
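To make the idea of a reliability-driven model update concrete, the following is a minimal sketch, not the method proposed here. The abstract does not state how reliability is computed, so the sketch assumes the peak-to-sidelobe ratio (PSR) of the correlation response map, a confidence measure commonly used with correlation filters. The function names, the exclusion window, the threshold `psr_threshold`, and the learning rate `base_lr` are all hypothetical choices made for illustration.

```python
import numpy as np


def peak_to_sidelobe_ratio(response, exclude=2):
    """Return the PSR of a 2-D correlation response map.

    ASSUMPTION: PSR stands in for the reliability measure; the paper's
    actual measure is not given in the abstract.
    """
    # Locate the response peak.
    r, c = np.unravel_index(np.argmax(response), response.shape)
    peak = response[r, c]

    # The sidelobe is everything outside a small window around the peak.
    mask = np.ones(response.shape, dtype=bool)
    mask[max(r - exclude, 0):r + exclude + 1,
         max(c - exclude, 0):c + exclude + 1] = False
    sidelobe = response[mask]

    # Small epsilon avoids division by zero on a perfectly flat sidelobe.
    return float((peak - sidelobe.mean()) / (sidelobe.std() + 1e-12))


def update_model(model, new_estimate, response,
                 base_lr=0.01, psr_threshold=8.0):
    """Blend new_estimate into model only when the response is reliable.

    ASSUMPTION: a hard threshold on PSR gates a linear-interpolation
    update; both values are illustrative, not taken from the paper.
    """
    psr = peak_to_sidelobe_ratio(response)
    if psr < psr_threshold:
        # Low reliability (e.g. the target may be occluded): keep the model.
        return model, psr
    updated = (1.0 - base_lr) * model + base_lr * new_estimate
    return updated, psr


# Usage: a sharp single peak is treated as reliable, a noisy map is not.
sharp = 0.01 * (np.indices((21, 21)).sum(axis=0) % 2)
sharp[10, 10] = 1.0
noisy = np.random.default_rng(0).random((21, 21))

model = np.zeros(4)
new_estimate = np.ones(4)
updated, psr_sharp = update_model(model, new_estimate, sharp)
kept, psr_noisy = update_model(model, new_estimate, noisy)
```

Skipping the update when the response is ambiguous is one simple way to keep an occluder from contaminating the translation model; a softer variant could instead scale the learning rate by the reliability score.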