Detection of moving objects in image plane for robot navigation using monocular vision

Extract

Wang et al. EURASIP Journal on Advances in Signal Processing 2012, 2012:29
http://asp.eurasipjournals.com/content/2012/1/29
RESEARCH    Open Access

Detection of moving objects in image plane for robot navigation using monocular vision

Yin-Tien Wang*, Chung-Hsun Sun and Ming-Jang Chiou
Abstract

This article presents an algorithm for moving object detection (MOD) in robot visual simultaneous localization and mapping (SLAM). This MOD algorithm is designed based on the epipolar constraint defined for corresponding feature points on the image plane. An essential matrix obtained using the state estimator is utilized to represent the epipolar constraint. Meanwhile, the method of speeded-up robust features (SURF) is employed in the algorithm to provide robust detection of image features as well as a better description of landmarks and of moving objects in the visual SLAM system. Experiments are carried out on a hand-held monocular camera to verify the performance of the proposed algorithm. The results show that the integration of MOD and SURF is efficient for robot navigation in dynamic environments.

Keywords: simultaneous localization and mapping (SLAM), moving object detection (MOD), moving object tracking (MOT), speeded-up robust features (SURF), monocular vision
1. Introduction

In recent years, more and more researchers have solved the simultaneous localization and mapping (SLAM) and moving object tracking (MOT) problems concurrently. Wang et al. [1] developed a consistency-based moving object detector and provided a framework to solve the SLAM-MOT problem. Bibby and Reid [2] proposed a method that combines sliding-window optimization and least squares with expectation maximization to perform reversible model selection and data association, allowing dynamic objects to be included directly in the SLAM estimation. Zhao et al. [3] used GPS data and control inputs to achieve global consistency in dynamic environments. There are many advantages to coping with the SLAM and MOT problems simultaneously: for example, mobile robots may navigate in a dynamic environment crowded with moving objects. In this case, SLAM could be corrupted by the inclusion of moving entities if the information about moving objects is not taken into account. Furthermore, the robustness of robot localization and mapping algorithms can be improved if the moving objects are discriminated from the stationary objects in the environment.
* Correspondence: ytwang@mail.tku.edu.tw
Department of Mechanical and Electro-Mechanical Engineering, Tamkang University, Tamsui, New Taipei City 25137, Taiwan
Using cameras to implement SLAM is the current trend because of their light weight and low cost, as well as the rich appearance and texture information of the surroundings that they capture. However, discriminating moving objects from stationary landmarks in dynamic environments remains a difficult problem in visual SLAM. To deal with this problem, we propose a moving object detection (MOD) algorithm based on the epipolar constraint for corresponding feature points on the image plane. Given an estimated essential matrix, it is possible to check whether a set of corresponding image points satisfies the epipolar constraint in the image plane. Therefore, the epipolar constraint can be utilized to distinguish moving objects from stationary landmarks in dynamic environments.
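As a concrete illustration, rather than the authors' implementation, the sketch below evaluates the epipolar residual x2ᵀ E x1 for pairs of corresponding points and flags those whose residual is too large as candidate moving-object features. The function names, the use of normalized (calibrated) coordinates, and the fixed threshold are assumptions made for this example; in the article's system the essential matrix is composed from the camera motion predicted by the state estimator.

```python
import numpy as np

def epipolar_residuals(E, pts1, pts2):
    """Residual |x2^T E x1| for corresponding points given in normalized
    (calibrated) image coordinates.
    E    : 3x3 essential matrix derived from the estimated camera motion
    pts1 : Nx2 array of feature points in the previous image
    pts2 : Nx2 array of matching feature points in the current image
    """
    x1 = np.hstack([pts1, np.ones((len(pts1), 1))])  # to homogeneous coordinates
    x2 = np.hstack([pts2, np.ones((len(pts2), 1))])
    # One residual per correspondence: x2_i^T E x1_i
    return np.abs(np.einsum('ij,jk,ik->i', x2, E, x1))

def flag_moving_points(E, pts1, pts2, threshold=1e-2):
    """Boolean mask: True where the epipolar constraint is violated,
    i.e. the feature is a candidate moving-object point.
    The threshold value is illustrative, not taken from the paper."""
    return epipolar_residuals(E, pts1, pts2) > threshold
```

In practice the threshold would be chosen relative to the expected measurement noise, since points on static landmarks only satisfy the constraint approximately.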
For visual SLAM systems, the features in the environment are detected and extracted by analyzing the images taken by the robot vision system, and then the data association between the extracted features and the landmarks in the map is investigated. Many researchers [4,5] employed the concept of Harris and Stephens [6] to extract apparent corner features from one image and tracked these point features in the consecutive image. The descriptors of the Harris corner features are rectangular image patches. When the camera translates and rotates, the scale and orientation of the image patches change. The detection and matching of Harris corners might fail in this case, unless the variations in scale and orientation of the image patches are recovered. Instead of detecting corner features, some works [7,8] detect features using the scale-invariant feature transform (SIFT) method [9], which provides a robust image feature detector. The unique properties of image features extracted by the SIFT method are further described using a high-dimensional description vector [9]. However, feature extraction by SIFT requires more computational cost than Harris's method [6]. To improve the computational speed, Bay et al. [10] introduced the concept of integral images and box filters to detect and extract scale-invariant features, which they dubbed speeded-up robust features (SURF). The extracted SURF must be matched with the landmarks in the map of a SLAM system. The nearest-neighbor (NN) searching method [11] can be utilized to match high-dimensional data sets of description vectors.

In this article, an online SLAM system with a moving object detector is developed based on the epipolar constraint for the corresponding feature points on the image plane. The corresponding image features are obtained using the SURF method [10], and the epipolar constraint is calculated using an estimated essential matrix. Moving object information is detected in the image plane and integrated into the MOT process such that the robustness of the SLAM algorithm can be considerably improved, particularly in highly dynamic environments where the surroundings of the robot are dominated by non-stationary objects. The contributions of this article are twofold. First, we develop an algorithm to solve the MOD problem in the image plane, and the algorithm is then integrated with robot SLAM to improve the robustness of the state estimation and mapping processes. Second, the improved SLAM system is implemented on a hand-held monocular camera, which can be utilized as the sensor system for robot navigation in dynamic environments. The SLAM problem with monocular vision is briefly introduced in Section 2. In Section 3, the proposed MOD algorithm is explained in detail. Examples to verify the performance of the data association algorithm are described in Section 4. Section 5 gives the concluding remarks.

2. SLAM with a free-moving monocular vision

SLAM is a target tracking problem for a robot system navigating in the environment [12]. The targets to be tracked include the state of the robot itself as well as the landmarks and moving objects in the environment. The state sequence of the SLAM system at time step $k$ can be expressed as

$x_k = f(x_{k-1}, u_{k-1}, w_{k-1})$  (1)

where $x_k$ is the state vector, $u_k$ is the input, and $w_k$ is the process noise. The objective of the tracking problem is to recursively estimate the state $x_k$ of the target according to the measurement $z_k$ at time step $k$,

$z_k = g(x_k, v_k)$  (2)

where $v_k$ is the measurement noise. A hand-held monocular vision sensor, as shown in Figure 1, is utilized in this article as the only sensing device for the measurement in the SLAM system. We treat this hand-held vision sensor as a free-moving robot system with unknown inputs. The states of the system are estimated by solving the recursive SLAM problem using the extended Kalman filter (EKF) [12]:

$x_{k|k-1} = f(x_{k-1|k-1}, u_{k-1}, 0)$  (3a)

$P_{k|k-1} = A_k P_{k-1|k-1} A_k^T + W_k Q_{k-1} W_k^T$  (3b)

$K_k = P_{k|k-1} H_k^T (H_k P_{k|k-1} H_k^T + V_k R_k V_k^T)^{-1}$  (3c)

$x_{k|k} = x_{k|k-1} + K_k (z_k - g(x_{k|k-1}, 0))$  (3d)

$P_{k|k} = (I - K_k H_k) P_{k|k-1}$  (3e)

where $x_{k|k-1}$ and $x_{k|k}$ represent the predicted and estimated state vectors, respectively; $K_k$ is the Kalman gain matrix; $P$ denotes the covariance matrix; $A_k$ and $W_k$ are the Jacobian matrices of the state equation $f$ with respect to the state vector $x_k$ and the noise variable $w_k$, respectively; and $H_k$ and $V_k$ are the Jacobian matrices of the measurement $g$ with respect to the state vector $x_k$ and the noise variable $v_k$, respectively.

2.1. Motion model

Two coordinate systems are set at the world frame {W} and the camera frame {C}, as shown in Figure 2. The state vector of the SLAM system with MOT in Equation (1) is arranged as

$x = [x_C \; m_1 \; m_2 \; \cdots \; m_n \; O_1 \; O_2 \; \cdots \; O_l]^T$  (4)

where $x_C$ is a 12 × 1 state vector of the camera including the three-dimensional vectors of position $r$, rotational angle $\phi$, linear velocity $v$, and angular velocity $\omega$, all in the world frame; $m_i$ is the three-dimensional (3D) coordinate vector of the $i$th stationary landmark in the world frame; $O_j$ is the state vector of the $j$th moving object; and $n$ and $l$ are the numbers of landmarks and moving objects, respectively.
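To make the recursion in Equations (3a)-(3e) concrete, the following sketch implements one generic EKF prediction/update cycle with NumPy. It is a minimal illustration under stated assumptions, not the authors' implementation: the functions f and g, their Jacobians, and the noise covariances Q and R are assumed to be supplied by the caller, and the known input u is folded into f for brevity.

```python
import numpy as np

def ekf_step(x_est, P_est, z, f, g, jac_f, jac_g, Q, R):
    """One EKF prediction/update cycle following Equations (3a)-(3e).

    x_est, P_est : state estimate and covariance at step k-1 (x_{k-1|k-1}, P_{k-1|k-1})
    z            : measurement z_k
    f, g         : state-transition and measurement functions (noise set to zero)
    jac_f(x)     : returns (A_k, W_k), Jacobians of f w.r.t. the state and the process noise
    jac_g(x)     : returns (H_k, V_k), Jacobians of g w.r.t. the state and the measurement noise
    Q, R         : process and measurement noise covariances
    """
    # (3a) state prediction
    x_pred = f(x_est)
    # (3b) covariance prediction
    A, W = jac_f(x_est)
    P_pred = A @ P_est @ A.T + W @ Q @ W.T
    # (3c) Kalman gain
    H, V = jac_g(x_pred)
    S = H @ P_pred @ H.T + V @ R @ V.T
    K = P_pred @ H.T @ np.linalg.inv(S)
    # (3d) state update using the measurement innovation
    x_upd = x_pred + K @ (z - g(x_pred))
    # (3e) covariance update
    P_upd = (np.eye(len(x_upd)) - K @ H) @ P_pred
    return x_upd, P_upd
```

In the SLAM system described above, the state vector passed to such a step would be arranged as in Equation (4): the 12-element camera state followed by the stationary landmark coordinates and the moving object states.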