Created by [ Rowan Dempster], last modified by [ Henry Wang] on Mar 15, 2020
Tracking an object of interest's (OI) position over time is difficult because of the uncertainty associated with data from Perception. This is why detections must be processed using a model of the environment before being sent to Path Planning. Consider what would happen if Perception forwarded data directly to Path Planning and failed to detect an upcoming dog for a few frames: Path Planning would plot a route through the dog!
• We can determine precise velocities of every detected object.
• Statistics allows us to convert a series of imprecise object detections into fairly precise ones.
• We can identify and remove false-positive detections.
• We can maintain a high publishing frequency to compensate for Perception's slower detection algorithms: since we know the velocities of objects, we can output predictions of their locations at timesteps with no new input.
• We can use those predictions to compensate for missed detections caused by obstruction, Perception failure, or other limitations.
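The "predict between detections" idea above is essentially the predict step of a constant-velocity Kalman filter. A minimal sketch follows; the 2D state layout, function name, and noise value are illustrative assumptions, not the actual repo code:

```python
import numpy as np

def cv_predict(x, P, dt, q=0.1):
    """Constant-velocity Kalman predict step for state [px, py, vx, vy].

    x: (4,) state estimate, P: (4,4) covariance, dt: seconds since last update.
    No measurement is needed, so this can run at the output rate even when
    Perception publishes no detection for a few frames.
    """
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    Q = q * np.eye(4)  # simple isotropic process noise; tune per object class
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred
```

For example, an object last seen at (2, 0) moving 1 m/s along x is predicted at (2.5, 0) after 0.5 s with no new detection.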
Currently there are three types of OI to be tracked: stop lines, road lines, and obstacles. By successfully tracking these OIs, Prediction can create an accurate representation of the current local vehicle environment (LVE) for use by Path Planning.
During the Fall 2018 term we built Kalman-filter-based multi-object tracking. A demo is here: https://drive.google.com/open?id=1uzQW2Tt8aWVrZgbWp5CCJ9tQOkIabIiX . The only problem is that we had to comment out all of the POV -> Novatel conversion code (see below for a more in-depth explanation). This was necessary because the conversion requires pose data about the car, and the dummy publisher node we currently use for testing does not support pose generation.
Before this node is placed in a moving car, that code must be uncommented!
A document about the prediction repo (probably out of date and partially incorrect): https://docs.google.com/document/d/1onntJbCP7zJutpyJ-3ZL6saYip6Dmqhu52d1gBBHwZs/edit
During the Winter 2018 term, we were able to bring up the basic framework of the Prediction module and integrate it with the rest of the system. That is, the code for message handling, OI internal storage, coordinate conversions, and environment output has been implemented. However, none of the tracking objectives mentioned above have been met, as a lot of work remains on the algorithms. The biggest obstacle to developing Prediction's algorithms is testing, both at the module level and at the unit-test level, because Prediction relies on properly synchronized data from various upstream modules, which is generally hard to acquire.
The Prediction module runs as a single-threaded ROS node process. The ROS API lets us easily set up callback functions to handle incoming messages, which are queued.
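The single-threaded callback-queue model can be illustrated with a toy dispatcher. This is not the ROS API itself, just the pattern it gives us: messages queue up per topic and are handled one at a time on a single thread, so callbacks never run concurrently. All names here are illustrative:

```python
from collections import deque

class SingleThreadedNode:
    """Toy model of a single-threaded node: incoming messages are queued,
    then drained one at a time, so no two callbacks ever run concurrently."""

    def __init__(self):
        self._queue = deque()
        self._callbacks = {}

    def subscribe(self, topic, callback):
        # Register one callback per topic, like a ROS subscriber.
        self._callbacks[topic] = callback

    def deliver(self, topic, msg):
        # Incoming messages are queued rather than handled immediately.
        self._queue.append((topic, msg))

    def spin_once(self):
        # Drain the queue on one thread, in arrival order.
        while self._queue:
            topic, msg = self._queue.popleft()
            self._callbacks[topic](msg)
```

Because everything runs on one thread, callbacks can mutate the internal OI store without locking.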
In general, the data describing OIs of interest come in as asynchronous inter-process messages from other team modules, and the position data of the OIs are typically encoded in the vehicle POV coordinate system. Path Planning expects OI positions to be in the Novatel coordinate system (see the Integration page mentioned above). Therefore, at some point between receiving and publishing, a conversion of coordinate systems is necessary. The utility functions to convert between coordinate systems have already been implemented.
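In 2D, a POV-to-global conversion like the one described is a rotation by the vehicle's heading plus a translation by its position. The actual utility functions already exist in the repo; the sketch below is only a minimal illustration, and the frame conventions (heading as yaw in radians, x forward in POV) are assumptions:

```python
import math

def pov_to_novatel(x_pov, y_pov, veh_x, veh_y, veh_heading):
    """Convert a point from vehicle-POV coordinates to a fixed global
    (Novatel-style) frame. veh_heading is the vehicle yaw in radians
    measured in the global frame; these conventions are assumptions."""
    c, s = math.cos(veh_heading), math.sin(veh_heading)
    x_g = veh_x + c * x_pov - s * y_pov  # rotate into the global frame...
    y_g = veh_y + s * x_pov + c * y_pov  # ...then translate by vehicle pose
    return x_g, y_g
```

For example, a point 2 m ahead of a vehicle at (10, 5) that is facing the global +y direction lands at (10, 7) in the global frame.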
Currently, Path Planning requests that Environment messages be published at around 30 Hz. Rather than using a timer-based callback to publish the message, publishing is triggered by the reception of Odometry messages from Sensor Fusion. Timer callbacks are avoided because they are awkward when rosbags are played frame by frame. Odometry is published at 50 Hz but is throttled on receive, so that Prediction currently handles only half of the messages. Odometry contains the newest pose of the vehicle, so each handled message causes all OIs tracked in POV to be updated and filtered against the tracking bounds before being converted to Novatel coordinates and published.
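Throttling on receive can be as simple as handling every Nth message. A sketch, with the divisor 2 mirroring the "half the messages" behaviour described above (the class and method names are illustrative):

```python
class OdometryThrottle:
    """Handle only every Nth incoming message, so a 50 Hz Odometry stream
    drives a ~25 Hz update/publish rate when divisor=2."""

    def __init__(self, divisor=2):
        self.divisor = divisor
        self._count = 0

    def should_handle(self, _msg=None):
        # Returns True for the 1st, (1+N)th, (1+2N)th, ... message.
        self._count += 1
        return self._count % self.divisor == 1
```

The callback then early-returns whenever `should_handle` is False, skipping the pose update and publish for that message.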
Meanwhile, new detections, which arrive in POV coordinates, are matched and merged with their tracked counterparts.
The “local” part of LVE indicates that we are only concerned with OIs within some finite region of interest around the vehicle. Currently, this region is envisioned as a rectangular box aligned with the axes of the vehicle POV coordinate system [insert image]. This area defines the tracking boundary.
Much of Prediction's functionality relates to maintaining which OIs are inside this tracking boundary. OIs that leave it need to be removed from the LVE; otherwise a lot of redundant data would accumulate and be sent to Path Planning. The current implementation stores the positions of these OIs in the vehicle POV coordinate system because checking whether an OI qualifies for the LVE (as it is currently defined) is then a one-step process, and a lot of the work related to tracking road lines is easier this way.
The issue with this design is its scalability and performance impact: static objects change POV coordinates between frames while the vehicle is moving, so many compute cycles are spent updating their POV coordinates and re-filtering against the tracking bounds. The alternative of storing OIs in Novatel coordinates was considered. The difference is that, with the current definition of the LVE ROI, a position then needs to be converted to POV before checking the tracking bounds.
This was a design decision I made, and in hindsight it was a poor one. If Odometry comes in at 25 Hz and detections at 1 Hz, Prediction just wastes compute cycles updating POV coordinates, when it could instead store Novatel coordinates, which do not need to be updated based on vehicle pose.
The alternative would be to define the LVE ROI as a box aligned with the cardinal directions, making it independent of the vehicle's heading, and to store the internal representations of OI positions in the Novatel coordinate system. Incoming detections would be immediately converted from POV to Novatel coordinates, and checking the tracking boundary would be simple. However, matching and merging would then have to be done in the Novatel coordinate system, which is arguably how it should be done anyway.
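Under that alternative design, the tracking-bounds check reduces to an axis-aligned box test in Novatel coordinates, with no per-frame POV conversion. A sketch, where the 50 m half-extents are placeholder values rather than the real tracking bounds:

```python
def in_lve(oi_x, oi_y, veh_x, veh_y, half_x=50.0, half_y=50.0):
    """Tracking-bounds check for the cardinal-aligned LVE ROI: the OI is
    stored in Novatel (global) coordinates and the box is centred on the
    vehicle's global position, so only the box centre moves as the vehicle
    drives. Half-extents here are placeholders."""
    return abs(oi_x - veh_x) <= half_x and abs(oi_y - veh_y) <= half_y
```

Static OIs never need their stored coordinates updated; only the vehicle position fed into the check changes each Odometry message.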
The current functional status and a high-level description of the tracking software for the three types of OI are outlined below.
Stop lines are received from Perception, whose message describes only the single closest detected stop line. Therefore Prediction tracks only one stop line at a time, though this should be easily extendable to handling multiple. As the vehicle approaches a stop line, the line falls out of camera view, so Perception ceases to publish messages for it. Prediction is nevertheless able to keep tracking it and output its data to Path Planning, because it remembers the line's last-seen (global) position. However, there is no error checking of incoming stop line detections; it never arose as an issue, but this feature should exist.
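The last-seen persistence behaviour amounts to remembering the stop line's global position and continuing to report it once detections stop. A minimal sketch (class and method names are illustrative, not the repo's actual interfaces):

```python
class StopLineTracker:
    """Keeps reporting a stop line after Perception stops detecting it, by
    remembering its last-seen global (Novatel) position."""

    def __init__(self):
        self.last_global = None

    def on_detection(self, global_pos):
        # Called when Perception reports the closest stop line, after the
        # detection has been converted from POV to global coordinates.
        self.last_global = global_pos

    def current(self):
        # Called every output cycle; still returns the stop line even when
        # the camera has lost sight of it (None until the first detection).
        return self.last_global
```

Extending this to multiple stop lines would mean keying stored positions by a line identity instead of holding a single slot.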
Tracking road lines is much more involved, and there is greater uncertainty in their incoming detections. As with stop lines, the segments of a road line closest to the vehicle's front bumper go out of view and are amended by Prediction. The other major process is line matching: creating a correspondence between a detected line and an internally stored line, and merging the two. There is also no proper error-detection scheme implemented; we naively assume incoming Perception data is valid.
The lidar sensors create 3D point clouds of the surrounding environment, and obstacles are received by Prediction in the form of 3D bounding boxes. There is code to handle this reception, but little work has been done on this OI type because it wasn't required for year 1. It's also not obvious whether any kind of error detection needs to be implemented, so the data is currently forwarded directly to Path Planning. We can foresee fusing lidar data with computer vision to validate certain types of detections, although that should really be Perception's responsibility.
Road line objects are represented as an ordered list of points, ordered in the direction in which the vehicle travels past them. The current implementation increases the point density to 10 points per metre before attempting matching. The matching process compares a detected road line with all currently tracked road lines. The matching metric for two road lines is determined by examining a bounded segment from each and evaluating their "closeness": a point in the detected segment is classified as matched if some point in the tracked segment lies within a proximity threshold. If sufficiently many points in the segments match, the detected road line is classified as a match with the tracked road line, and the two are merged.
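The proximity metric just described can be sketched as follows. The distance threshold and match fraction are placeholder values, and this is the same naive all-pairs comparison the text describes, not an optimized version:

```python
import math

def lines_match(detected, tracked, dist_thresh=0.5, match_frac=0.7):
    """Point-proximity matching metric: a detected point 'matches' if some
    tracked point lies within dist_thresh metres of it, and the two lines
    match if at least match_frac of detected points do. Both lines are
    ordered lists of (x, y) points. O(n*m) in the segment lengths."""
    if not detected or not tracked:
        return False
    matched = 0
    for dx, dy in detected:
        if any(math.hypot(dx - tx, dy - ty) <= dist_thresh
               for tx, ty in tracked):
            matched += 1
    return matched / len(detected) >= match_frac
```

A detected line hugging a tracked line within a few decimetres matches; a line offset by several metres does not.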
Performance analysis has not yet been done, but this approach is clearly naive and inefficient. The solution should also be generic over both straight and curved lines.
We need:
Document generated by Confluence on Dec 10, 2021 04:02