Created by Rowan Dempster, last modified by Henry Wang on May 13, 2020
The current goal of HD Mapping is to extract static environment information (e.g. lanes) from the HD Map and output it. For some reason, HD Mapping has fallen into this group's domain. I (Henry Wang) suspect that this is because the initial application of the HD Map was to provide lane data to Path Planning that is redundant with, and interchangeable with, Perception detections. That's certainly one way to think of it, but the concept of mapping is more closely related to routing and Path Planning. As such, HD Maps were not really considered when the Path Planning architecture was designed, which treats them as supplementary. Indeed, one of the critical objectives of the competition (SAE has emphasized this prior to year 2) is to make use of the HD Map. Sadly, this work has not been prioritized well and progress has been very limited (as of Jan 2020).
Normal maps (like Google Maps) are known as Standard Definition (SD) maps. They typically only represent road and intersection information, and are used for global-level routing (e.g. using A*). On the other end of the spectrum are High Definition (HD) maps. Although there is no rigorous standard or definition for what qualifies as HD, HD maps contain more fine-grained data, such as coordinates of lane lines, stop lines, traffic control objects (e.g. traffic signs and lights), and other regulatory attributes (e.g. speed limits).
The competition organizers (SAE) have provided teams with an HD Map of MCity. Using HD Map data will be crucial for competition success, since complete reliance on Perception is not feasible for Path Planning to plan at all of its levels.
The two HD Map formats we work with are HERE and OSM Lanelet.
If there were only a single map format, things would be simple: the map file is parsed into an internal representation, which refers to how the data is stored in program memory. This data structure would be closely derived from whatever input format we use (e.g. HERE HD or OSM Lanelet); regardless of the implementation, whether it's in the form of objects or nested dictionaries, it's still ultimately an abstract syntax tree (AST). Queries about the local static environment can be made by doing some work on the internal map, which ultimately produces some predefined output (e.g. ROS messages). Given the pose (i.e. position and orientation) of the vehicle, query the map for:

- the lanelet the vehicle is currently traversing;
- upcoming and adjacent lanelets;
- lanelet points within some radial distance of the vehicle;
- associated static traffic objects (e.g. stop lines, traffic signs and lights).
The complication of our system design arises from the need to simultaneously support 2 map formats: OSM Lanelet Map and HERE HD Map. The 2 map formats share common concepts in how they represent HD map elements, for example the concept of polylines with identifiers. Several options have been considered. The most straightforward way is to have separate parse-query pipelines for both formats. Another idea was to convert both maps into an intermediate JSON format file, which would then be read by the publisher. A revised solution was to convert the HERE HD Map into OSM Lanelet Map format. A quick examination of the 3 solutions will demonstrate that the last one is the best approach.
A parse-query pipeline is easy to visualize. The OSM file is parsed into its corresponding OSM Lanelet Abstract Syntax Tree (AST), i.e. classes that correspond to Nodes, Ways, and Lanelets in program memory, and the query functions operate on this AST. Likewise, the HERE map is parsed into its own HERE Map AST (think of LaneGroups and Lanes), and the same query functions would have to be implemented to operate on this slightly different AST. Obviously, it would be wasteful to have a parse-query pipeline for each map format, because there would be duplicated query code for both ASTs. Although some of this code may be very similar and could be extracted, it is certainly an unnecessary development and maintenance overhead to work with 2 sets of query code. Furthermore, if the queries or outputs change, then modifications will be required in both code sets. Naturally, a better solution is to convert the OSM AST and HERE AST into a common AST, such that the query code only has to operate on this common AST. The design of this custom and common internal representation is the crux of the problem.
The design of a custom common internal representation is a good idea, and there is a prototype. It was proposed that this structure be implemented in program memory in JSON format. The HERE Map and OSM Lanelet Map would both be parsed into this JSON format and then serialized to an intermediate JSON map file. This JSON map file would then be parsed and queried, so only one set of query code would be required. This is an improvement and eliminates code duplication, but the main issue is that it introduces a third map file format. The existence of a third file format increases complexity, causes more concept duplication, requires its own parser, and is redundant. Notice that the conversion of OSM to a JSON format is not particularly helpful because it encodes the same information, just in a different format: OSM is an XML format, which represents trees, and JSON also represents trees. The idea of a common AST for both map formats is agreeable, but using an intermediate map file is not.
The optimal solution is to convert both map formats into a common internal representation and operate directly on it. It's actually not necessary to come up with our own custom AST; we can just use the OSM Lanelet format. It's complete, as it already represents everything of concern (i.e. lane boundaries and connections), it is commonly used (e.g. by UofT), and it is easily extendable to support other constructs. So the OSM map file will be parsed into the OSM Lanelet AST as usual. The HERE map file will be parsed into its own HERE AST, as usual, and then the HERE AST will be converted into the OSM Lanelet AST. The query code operates only on the OSM Lanelet AST. This reduces the complexity of development because there are only 2 ASTs that we have to deal with, not 3, and there is no intermediate custom map file format that we have to support. There also only needs to be one set of query code, which operates on the OSM Lanelet AST, so there's no code duplication. Another benefit is that it allows us to use JOSM to inspect and modify the transformed HERE map. Since we don't have access to MATLAB's HERE HD Map toolbox, we currently have no good way of visualizing HERE Maps besides mediocrely hacked visualization solutions.
The proposed pipeline is shown by the following sequence of data formats and transformations. The terms in square brackets are actions, and the other terms are data formats.
HERE HD Map → [ Parse HERE ] → HERE AST → [ Convert to OSM ] → OSM AST → [ Serialize OSM ] → OSM Lanelet Map → [ Parse OSM ] → OSM AST → [ Query ] → ROS Message
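To make the stage boundaries concrete, here is a minimal sketch of the pipeline as Python function signatures. The names mirror the bracketed actions above but are illustrative only, not the actual repo API; the bodies are left as placeholders.

def parse_here(path):
    """HERE HD Map file -> HERE AST."""
    raise NotImplementedError

def convert_to_osm(here_ast):
    """HERE AST -> OSM AST (a transform, not a parse)."""
    raise NotImplementedError

def serialize_osm(osm_ast, path):
    """OSM AST -> OSM Lanelet Map file on disk."""
    raise NotImplementedError

def parse_osm(path):
    """OSM Lanelet Map file -> OSM AST."""
    raise NotImplementedError

def query(osm_ast, pose):
    """OSM AST + vehicle pose -> ROS messages."""
    raise NotImplementedError

def run_pipeline(here_path, osm_path, pose):
    serialize_osm(convert_to_osm(parse_here(here_path)), osm_path)
    return query(parse_osm(osm_path), pose)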
Readers with experience in compilers will understand the subtle difference between parsing and transforming. Parsing refers to the construction of an AST in program memory from some other storage format (e.g. a text file) without modifying the core data representation. Transforms are operations that modify the AST. One might ask why we can't just go from HERE HD Map to OSM Map in one step instead of all these parse and transform steps. The answer is to follow proper abstraction patterns and avoid convoluted code. This flow is exactly how compilers work: a program is not optimized in a single step. First the source code is parsed into an AST that cleanly represents the program, then various transformations are applied on the AST to produce another AST, before serializing to bitcode.
Currently, the design problems that remain are the definition of the HERE Modified AST and the transforms surrounding it, the design of the Query step, and the definition of the ROS Message outputs. The other operations require little design and are trivial to implement.
There are certainly limitations with this architecture. The feasibility of this system relies on the upper size of the map: if the map is too big, it may not be reasonable to load it into main memory, and this query architecture would fail. It is unknown what this limit is. For size reference, the firetower.osm file occupies 1.2 MB and represents 7k Nodes, 86 Ways, and 32 Lanelets.
Each stage and transform is discussed in more detail in its respective section.
This data format refers to the uncompressed text files that represent the various layers of HERE HD Maps. For example, raw files for the TRC can be found on Google Drive. The files are provided as protobuf binaries and proto files for each tile's layer. See the relevant Confluence page section, and the official developer guide, for more information.
The first step is to implement code to extract information from the relevant layers. Currently, we are concerned with parsing the following layers:
We actually got baited into writing our own parser and creating our own AST classes because our predecessors failed to read the README that demonstrated how protoc can be used to code-gen parser and AST source code for the various HERE HDM layers. We thought that we were restricted to working with these plaintext files because that's what we saw committed to the repo and that's what they had been working on for a year. Their first step had been to decode the protobuf binaries to plaintext, which makes sense because we (humans) want to see the data. This was done using protoc.
In late February, Charles pointed out (after reading the README that came with the TRC data) that protoc has command line options to compile proto headers into class and parser code, in any language. We ended up wasting quite a bit of time on this: a whole year of our predecessors' time as well as hours of mine. Since we have so many layers and tiles, I made a Python script to code-gen everything; apparently people had just been manually invoking protoc a dozen times. Use this script to decode protobuf binaries to text files and/or compile proto headers into source code; remember to download the protobuf files first.
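As a rough illustration, a code-gen/decode helper along these lines could look like the following. This is a sketch only: the directory layout and argument names are assumptions, not the actual script; the protoc flags (--python_out, --decode) are the real ones.

import subprocess
from pathlib import Path

PROTO_DIR = Path("protos")   # downloaded HERE .proto headers (assumed layout)
OUT_DIR = Path("generated")  # destination for generated *_pb2.py modules

def codegen_all():
    """Compile every .proto header into Python parser/AST source code."""
    OUT_DIR.mkdir(exist_ok=True)
    for proto in PROTO_DIR.rglob("*.proto"):
        subprocess.run(
            ["protoc", f"--proto_path={PROTO_DIR}",
             f"--python_out={OUT_DIR}", str(proto)],
            check=True)

def decode_tile(binary_path, message_type, proto_file):
    """Decode one protobuf binary tile into human-readable text."""
    with open(binary_path, "rb") as f:
        result = subprocess.run(
            ["protoc", f"--proto_path={PROTO_DIR}",
             f"--decode={message_type}", str(proto_file)],
            stdin=f, stdout=subprocess.PIPE, check=True)
    return result.stdout.decode()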
We got baited into defining our own classes here when we could have just used Protobuf's code gen. The target language can be anything; we're using Python, but C++ source code is also possible. The Python code is here. I wouldn't bother looking too deep into it; just know how to use it and the relevant classes.
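For example, using a generated module might look like this. The module, message, and field names below are hypothetical placeholders; ParseFromString is the real protobuf Python API.

from generated import topology_geometry_pb2  # module name is hypothetical

with open("tiles/123456.topology.bin", "rb") as f:  # path is hypothetical
    tile = topology_geometry_pb2.TopologyGeometryLayerTile()
    tile.ParseFromString(f.read())

# The generated classes effectively ARE the HERE AST: their fields mirror
# the .proto schema, so no hand-written parser or AST classes are needed.
for lane_group in tile.lane_groups:  # field name is a placeholder
    print(lane_group.id)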
The magical step we need to do is to transform the relevant information into some OSM format. This may require us to create custom OSM Relations, XML attributes, and elements. This transformation is going to be very challenging because it essentially involves aggregating data from multiple HERE map layers. I'm basically done with this.
The additional challenge here was to merge multiple tiles; unfortunately, the TRC spans more than 1 tile.
Currently, this file lives here.
See the relevant Confluence technical documentation for details on the OSM Lanelet Map format we use. We don't design the core of this format, but it may be enhanced. The internal class we use to represent OSM is of course based on the provided standards. No significant design work here.
Serialization is the opposite of parsing: it's a transformation from AST to string. This is necessary because we would like to work with the HERE-converted OSM map; for example, we would like to load it in JOSM for inspection and amendment. Thankfully, this step is very easy and required no design work.
Since OSM maps are essentially encoded as XML files, we can use third party libraries. This task is quite easy and requires no significant design. The main module and functions are in this file.
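A minimal sketch of what this looks like with the standard library, assuming simple Node/Way classes with id/lat/lon/node_ids/tags attributes (those class shapes are assumptions; the node/way/nd/tag element structure is standard OSM XML):

import xml.etree.ElementTree as ET

def serialize_osm(nodes, ways, out_path):
    """Write Nodes and Ways (and, analogously, Relations/Lanelets) as OSM XML."""
    root = ET.Element("osm", version="0.6")
    for node in nodes:
        ET.SubElement(root, "node", id=str(node.id),
                      lat=str(node.lat), lon=str(node.lon))
    for way in ways:
        way_el = ET.SubElement(root, "way", id=str(way.id))
        for node_id in way.node_ids:
            ET.SubElement(way_el, "nd", ref=str(node_id))
        for key, value in way.tags.items():
            ET.SubElement(way_el, "tag", k=key, v=str(value))
    ET.ElementTree(root).write(out_path, encoding="UTF-8", xml_declaration=True)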
The current set of queries is mentioned in the section above. The crux of the problem is how to quickly look up the current lanelet that the vehicle is traversing. Given the current lanelet, upcoming and adjacent lanelets can be easily looked up because of the convenient OSM Lanelet structure, as well as any associated static traffic objects.
Currently, there is a requirement to return the points of lanelets that are within some radial distance. This simulates what perception algorithms would see, and is an acceptable requirement. The most obvious and naive implementation would be to do something like this:
ret = []
for way in ways:
    # Closest Node of this Way to the car's current position.
    node = find_closest_node(way)
    if within(node, 20):
        ret.append(way)  # bug fix: append the single matching Way, not `ways`
return ret
This algorithm scales linearly with the number of Nodes in the OSM Lanelet Map. For scale reference, the firetower.osm has about 7000 nodes. It's still uncertain how many Nodes the OSM Lanelet Map of MCity will have. Perhaps this processing won't take very long on the Rugged and this runtime will not be an issue. In any case, the crux of this problem is to design a more sophisticated lanelet lookup algorithm.
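For example, a spatial index would avoid the linear scan. Below is a hedged sketch using a KD-tree; scipy is an assumption here, not something the current code necessarily depends on, and the Way/Node attribute names are placeholders.

import numpy as np
from scipy.spatial import cKDTree

def build_index(ways):
    """Flatten all Way Nodes into one KD-tree, remembering each point's Way."""
    points, owners = [], []
    for way in ways:
        for node in way.nodes:
            points.append((node.x, node.y))
            owners.append(way)
    return cKDTree(np.array(points)), owners

def ways_within_radius_fast(tree, owners, car_position, radius=20.0):
    """Radius query that only touches nearby points instead of every Node."""
    hits = tree.query_ball_point(car_position, r=radius)
    # De-duplicate: several Nodes of the same Way may fall inside the radius.
    return list({id(owners[i]): owners[i] for i in hits}.values())

The index is built once at map load time; each per-frame query is then roughly logarithmic in the number of Nodes rather than linear.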
Core implementation is found here.
The final stage is to convert the OSM AST into appropriate WATO ROS messages for Path Planning to consume. Naturally, the custom ROS messages would have a structure very similar to how OSM and HERE represent drivable roads. The transmission protocol is challenging because we need to design an appropriate data structure to represent HDM features.
See this important ROS msg header. The conversion is done here.
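To illustrate the shape of this step, here is a hedged sketch of the publish side. The package, message type, and field names are hypothetical stand-ins for the actual WATO messages; rospy and geometry_msgs are the real ROS APIs.

import rospy
from geometry_msgs.msg import Point
from path_planning_msgs.msg import Lanelet, LaneletArray  # hypothetical msgs

def to_point(node):
    return Point(x=node.x, y=node.y, z=0.0)

def publish_lanelets(pub, lanelets):
    """Convert queried OSM Lanelets into a ROS message and publish it."""
    msg = LaneletArray()
    for ll in lanelets:
        out = Lanelet()
        out.id = ll.id
        out.left_boundary = [to_point(n) for n in ll.left_way.nodes]
        out.right_boundary = [to_point(n) for n in ll.right_way.nodes]
        msg.lanelets.append(out)
    pub.publish(msg)

# The publisher is created once at node startup, not per message, e.g.:
# pub = rospy.Publisher("/hd_map/lanelets", LaneletArray, queue_size=1)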
I (Henry Wang) am uncertain of the integrity of the content below.
Note: the following information is from Guy Stoppi's write-up at the end of April 2019. This may have to be re-architected as we require a new HD Map interface.
The lane publisher node loads the lane lines (which are parsed using the method described in the HERE Maps phab doc) and searches through them to find the closest ones (all lanes within a certain radius). If these lane lines are artificial lane lines from an intersection, it will tag them as "turning left", "straight", or "turning right".
Searching through the lane lines is pretty simple: find the lane line's closest point to the car and, if the point is within a certain radius, return the lane line. Tagging them takes more work. Although it's possible this methodology will change, the current method requires the lane lines to be translated into the car's local coordinate system.
The flow for tagging the direction of lane lines is detailed below.
The math behind this uses matrix multiplication and 2D rotation matrices. Suppose we have two 2D coordinate systems: the global one, G, and the local one, C. In our HD map case, G is the coordinate system for the HD map and C is the car's coordinate system. For now, we assume that C's origin is always at G's origin, so C only differs from G by its rotation Θ. So, for a point L = (Lx, Ly) defined in C that is equivalent to the point P = (Px, Py) in G:
$$
\begin{pmatrix} P_x \\ P_y \end{pmatrix}
=
\begin{pmatrix} \cos\Theta & -\sin\Theta \\ \sin\Theta & \cos\Theta \end{pmatrix}
\begin{pmatrix} L_x \\ L_y \end{pmatrix}
, \qquad
\begin{pmatrix} L_x \\ L_y \end{pmatrix}
=
\begin{pmatrix} \cos\Theta & \sin\Theta \\ -\sin\Theta & \cos\Theta \end{pmatrix}
\begin{pmatrix} P_x \\ P_y \end{pmatrix}
$$
The first matrix was obtained by the definition of a 2D rotational matrix and the second was obtained by taking the inverse of the first.
So this allows us to translate a point in a lane line to the car's coordinate system IF the car is at (0,0). To get around this condition, we simply subtract the car's map coordinates from the lane point. This translates the lane point into a coordinate system that isn't rotated relative to the global map's coordinate system but has the car at (0,0). We then apply the above transformation to get the lane point in the car's coordinate system.
Once we have all of the lane points in the car's coordinate system, we can observe how the lane is defined relative to the car to describe it as "turning left", "straight", or "turning right". We do this by taking the closest lane point and the furthest lane point and comparing how far to the left/right they are.
If the furthest lane point is much further to the left than the closest lane point, then we say the lane "turns left". If the furthest lane point is much further to the right than the closest lane point, then we say the lane "turns right". Otherwise, we say the lane is "straight".
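Putting the two steps together, here is a worked sketch in numpy. The +y-is-left convention and the threshold value are assumptions, not values from the original write-up.

import numpy as np

def to_car_frame(lane_points, car_xy, car_theta):
    """Subtract the car's map position, then apply the inverse rotation to
    express each (x, y) lane point in the car's coordinate system."""
    rot_inv = np.array([[np.cos(car_theta),  np.sin(car_theta)],
                        [-np.sin(car_theta), np.cos(car_theta)]])
    return (np.asarray(lane_points) - np.asarray(car_xy)) @ rot_inv.T

def tag_direction(local_points, threshold=2.0):
    """Compare the lateral offsets of the closest and furthest lane points."""
    dists = np.linalg.norm(local_points, axis=1)
    closest = local_points[np.argmin(dists)]
    furthest = local_points[np.argmax(dists)]
    lateral_shift = furthest[1] - closest[1]  # +y taken as "left" (assumption)
    if lateral_shift > threshold:
        return "turning left"
    if lateral_shift < -threshold:
        return "turning right"
    return "straight"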
We collect all the lanes (which are tagged if they're in an intersection) which are within a certain radius of the car and publish them.
Based on the above document, it seems that all the processing and tagging was being done in real-time, as the car moved. This was because we only used 1 map provider back then. However, we now require the ability to 'plug and play' different map formats, so we need to create some sort of mapping interface.
The planned HD Map interface is discussed here.
Thus, we would have to parse the HERE HD Map, or OSM Map, into a custom internal and common structure that we could simply query against.
We would need something like the following interface:
[Interface diagram image]
Converting HERE to OSM would allow us to use a graphical tool like JOSM if need be.
OLD: F19 Plan:
Processing will pass lanes from an HD map to Path Planning.
The lanes will be numbered in some order from negative (left road boundary) to zero (centre yellow line of the road) to positive (right road boundary). Numbered lanes allow us to keep track of what lane we are in and how many total lanes there are. This way Path Planning can easily change lanes.
Path planning and Processing integration document: here
Processing will pass intersection lane lines for every possible action (left, right, straight). The only lines we want to give Path Planning are valid paths that the car can follow, i.e. only paths and lanes stemming from our current intersection entrance. This is not currently finished, but it is one of Processing's next goals.
See OPENHDMAPS_InfoSheet (1).pdf.
April 2019 summarization document by Guy Stoppi:
Localization_Document (1).pdf
edwardchao_masc_thesis-compressed.pdf (1 MB)
Processing Leads of F19 had a meeting with Guy Stoppi at the beginning of the term to understand how 'localization' fits into the software pipeline: notes here.
Meeting notes with Guy and Shigeki on Nov 6 2019: found here. This contains notes on the current state of the lane publisher, how the HERE maps were parsed before, etc.