Ergodic Exploration – Boston Cleek

Overview

The goal of this project is to provide a robot agnostic information theoretic exploration strategy for known and unknown environments in the form of a ROS package. The exploration strategy requires a target distribution. The target can be either a user specified distribution representing the expected information gain or mutual information. In the case of the user specified distribution the target is modeled as a Gaussian or multiple Gaussians.

See the code in my GitHub repository.

Exploration

Ergodic Control

Ergodicity is the fraction of time spent sampling an area should be equal to a metric quantifying the density information in that area. The ergodic metric is the difference between the probability density functions representing the spatial distribution and the statistical representation of the time-averaged trajectory [1]. The objective function includes the ergodic metric and the control cost.

The ergodic controller performs receding horizon trajectory optimization in real time. The real time performance is achieved by integrating the co-state backwards in time [2].

In both the target information density and the mutual information demonstrations below rtabmap_ros was used for localization and mapping using a Velodyne lidar. Odometry was performed using wheel encoders.

Target Information Density

Two targets representing the information density are modeled as Gaussians are shown in yellow. The optimized trajectory from the erogdic controller is show in red. The trajectory oscillates from side to side but this behavior is expected. The predicted trajectory from the dynamic window approach is show in blue.

See the full video in real time.

Here is a gif showing the robot exploring a target and maneuvering around an obstacle.

The resulting path from the exploration stack is shown in blue and the small white lines represent local loop closures. The robot was observed visiting each target several times and exploring the corridor to the left of the targets.

Mutual Information

In the case of the occupancy grid mutual information is the expected information gain at each grid cell. The mutual information is computed using the Fast Continuous Mutual Information (FCMI) algorithm presented in [3]. This is accomplished by simulating a 360 degree range finder.

below the robot explores the atrium using mutual information as the target distribution show in the upper left. The dark areas represent more mutual information. After the robot has fully explored the space the mutual information is zero.

See the full video in real time.

The occupancy grid and mutual information map are on the left and right respectively. The robots path is shown in blue and the small white lines represent local loop closures.

Exploration Stack

The ergodic controller cannot guarantee a collision free trajectory. The dynamic window approach is used as the local planner. The next twist from ergodic controller is propagated forward in for a short time to detect collisions. If there is a collision the dynamic window approach is used. The dynamic window approach forward propagates a constant twist for a short time and disregards it if there is a collisions.

In the case where the ergodic control results in a collision the dynamic window approach uses the optimized trajectory as a reference. The dynamic window approach finds a twist that will produce a trajectory most similar to the reference that is collision free. Similarity between the optimized trajectory and the dynamic window approach trajectory is defined as the distance between the robots x and y position and the absolute difference in heading at each time step. The twist produced by the dynamic window approach is followed for a number of steps to ensure the robot is clear from any obstacles.

It is possible that the twist from the dynamic window approach can cause a collision in a dynamic environment. In this case the dynamic window approach will search the velocity space for the most similar collision free twist to the previous twist given by the dynamic window approach. Similarity is defined here as the inner product of the twist. If the dynamic window approach fails in this case a flag is set for the ergodic controller to replan. The ergodic controller is sensitive to the pose of the robot and is capable of optimizing a different trajectory. This strategy prevents the robot from getting stuck.

Collision Detection

To detect obstacle cells in the occupancy grid Bresenham’s circle algorithm is used. The robot’s base is modeled at a circle. If there are obstacles within a threshold of the robot’s bounding radius the robot is considered to be in a collision state.

below the robot was given a control signal to drive forward. This control sends the robot on a collision course with two obstacles. The dynamic window approach finds a twist capable of maneuvering the robot around each obstacle.

Credits

[1] L. M. Miller and T. D. Murphey, “Trajectory optimization for continuous ergodic exploration,” in American Control Conference, 2013, pp. 4196–4201. PDF

[2] Decentralized ergodic control: Distribution-driven sensing and exploration for multi-agent systems I. Abraham and T. D. Murphey IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 2987–2994, 2018. PDF, Video

[3] An Efficient and Continuous Approach to Information-Theoretic Exploration T Henderson, V Sze, S Karaman 2020 IEEE International Conference on Robotics and Automation (ICRA), 8566-8572, 2020. PDF, Video