feat(bridge_ros2): align REP-105 TF with Nav2 conventions without regressing direct-publish mode#135
Conversation
📝 WalkthroughWalkthroughAdds cached localization TF storage and a periodic rebroadcast mechanism. On localization updates the bridge computes and caches TFs (REP‑105 composition when used), immediately broadcasts the cached TF, and optionally rebroadcasts it periodically; REP‑105 rebroadcasts are future-stamped by transform_tolerance to avoid TF extrapolation. Changes
Sequence DiagramsequenceDiagram
participant Loc as Localization Source
participant Bridge as BridgeROS2
participant TFBuf as tf_buffer_
participant Cache as TF Cache
participant Pub as TF Publisher
participant Timer as Wall Timer
rect rgba(100, 200, 100, 0.5)
Loc->>Bridge: localization update (pose, scan_timestamp)
alt REP-105 mode
Bridge->>TFBuf: lookupTransform(odom, base_link, scan_timestamp)
TFBuf-->>Bridge: odom->base_link transform
Bridge->>Bridge: compose map->odom (REP-105)
else Direct mode
Bridge->>Bridge: use map->base_link directly
end
Bridge->>Cache: store computed TF + rep105 flag + scan_timestamp (mutex)
Bridge->>Pub: broadcast cached TF (stamp = now OR scan_timestamp / now + tolerance per mode)
end
rect rgba(100, 100, 200, 0.5)
Timer->>Bridge: periodic trigger (every transform_publish_period)
Bridge->>Cache: lock & retrieve cached TF
alt cached TF present && rep105
Bridge->>Pub: rebroadcast TF (stamp = now + transform_tolerance)
else cached TF present && direct
Bridge->>Bridge: skip rebroadcast (no re-stamping direct mode)
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
mola_bridge_ros2/src/BridgeROS2.cpp (1)
1640-1661:⚠️ Potential issue | 🟠 MajorDirect-mode compatibility promise not delivered — single stamping policy used for both modes.
This is the root cause for the behavior gap vs the PR description.
broadcastCachedLocalizationTfalways stamps withnode->now() + transform_tolerance, regardless of whether the cached entry is a REP-105map → odomor a directmap → base_linkentry. Consequences:
- Direct-mode TFs are no longer scan-time stamped — regression vs develop pre-#132.
publish_in_sim_timeis not honored for direct-mode TFs anymore (the stamp comes fromnode->now(), notmyNow(scan_timestamp)).- The rebroadcast timer at lines 237-254 keeps re-emitting direct-mode TFs with advancing stamps even when the localizer is paused — contradicts the PR test-plan item "Direct-mode Smoother: confirm … stamps stop advancing when localizer paused" and the compatibility statement "Direct-mode users: behavior restored to develop pre-#132 (scan-time stamping, no hidden rebroadcast, publish_in_sim_time respected)".
- The immediate broadcast triggered from
publishLocalizationTfat line 1637 inherits the same issue (downstream).To match the stated design, the cached entry needs to carry
useRep105and the scan timestamp, and this function needs to branch:🛠️ Suggested fix (requires the header change in the companion comment)
-void BridgeROS2::broadcastCachedLocalizationTf() +void BridgeROS2::broadcastCachedLocalizationTf(bool onlyIfRep105) { - geometry_msgs::msg::TransformStamped tf; + CachedLocalizationTf cached; { auto lck = mrpt::lockHelper(cachedLocalizationTfMtx_); if (!cachedLocalizationTf_) return; - tf = *cachedLocalizationTf_; + cached = *cachedLocalizationTf_; } + if (onlyIfRep105 && !cached.useRep105) return; - // Stamp at "now + tolerance" so consumers can lookupTransform(map, base, now()) - // without ExtrapolationException; Use the ROS clock so use_sim_time is honored automatically. - auto node = rosNode(); - if (!node) return; - tf.header.stamp = - node->now() + rclcpp::Duration::from_seconds(params_.transform_tolerance); + geometry_msgs::msg::TransformStamped tf = cached.tf; + if (cached.useRep105) + { + // REP-105 (map→odom): future-date so consumers can lookupTransform(now()). + auto node = rosNode(); + if (!node) return; + tf.header.stamp = + node->now() + rclcpp::Duration::from_seconds(params_.transform_tolerance); + } + else + { + // Direct mode (map→base_link): restore pre-#132 scan-time stamping, + // honoring publish_in_sim_time. + tf.header.stamp = myNow(cached.scanTimestamp); + } auto lckTfBc = mrpt::lockHelper(ros_tf_bc_mtx_); if (tf_bc_) { tf_bc_->sendTransform(tf); } }Then update the timer at lines 247 to
broadcastCachedLocalizationTf(/*onlyIfRep105=*/true);, and atpublishLocalizationTfpopulateuseRep105andscanTimestampalongsidetfbefore calling the immediate broadcast.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@mola_bridge_ros2/src/BridgeROS2.cpp` around lines 1640 - 1661, The cached TF entry must carry a useRep105 flag and scanTimestamp and broadcastCachedLocalizationTf should branch on that flag: if useRep105, continue stamping with node->now() + params_.transform_tolerance (REP-105 behavior); if not (direct-mode), stamp using the cached scanTimestamp (preserving scan-time and publish_in_sim_time behavior) and do not advance stamps on rebroadcasts while paused. Update the cachedLocalizationTf_ structure (and access under cachedLocalizationTfMtx_) to include useRep105 and scanTimestamp, modify publishLocalizationTf to populate those fields when storing the cache and before calling broadcastCachedLocalizationTf, and change the periodic rebroadcast timer to invoke broadcastCachedLocalizationTf only for REP-105 entries (i.e., rebroadcastCached onlyIfRep105).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@mola_bridge_ros2/src/BridgeROS2.cpp`:
- Around line 1650-1654: Reformat the tf.header.stamp assignment to satisfy
clang-format: adjust the expression using rosNode()/node->now() and
rclcpp::Duration::from_seconds(params_.transform_tolerance) so the line
breaks/spacing match clang-format-14 expectations (the statement involving
tf.header.stamp, node->now(), and params_.transform_tolerance should be
reformatted rather than changing logic); locate the assignment at
tf.header.stamp = node->now() +
rclcpp::Duration::from_seconds(params_.transform_tolerance) and apply
clang-format-compatible spacing/line breaks.
---
Outside diff comments:
In `@mola_bridge_ros2/src/BridgeROS2.cpp`:
- Around line 1640-1661: The cached TF entry must carry a useRep105 flag and
scanTimestamp and broadcastCachedLocalizationTf should branch on that flag: if
useRep105, continue stamping with node->now() + params_.transform_tolerance
(REP-105 behavior); if not (direct-mode), stamp using the cached scanTimestamp
(preserving scan-time and publish_in_sim_time behavior) and do not advance
stamps on rebroadcasts while paused. Update the cachedLocalizationTf_ structure
(and access under cachedLocalizationTfMtx_) to include useRep105 and
scanTimestamp, modify publishLocalizationTf to populate those fields when
storing the cache and before calling broadcastCachedLocalizationTf, and change
the periodic rebroadcast timer to invoke broadcastCachedLocalizationTf only for
REP-105 entries (i.e., rebroadcastCached onlyIfRep105).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3186d25d-7bb5-4be3-ae27-0d6a0eb7aa3e
📒 Files selected for processing (2)
mola_bridge_ros2/include/mola_bridge_ros2/BridgeROS2.hmola_bridge_ros2/src/BridgeROS2.cpp
There was a problem hiding this comment.
🧹 Nitpick comments (1)
mola_bridge_ros2/src/BridgeROS2.cpp (1)
1591-1616: REP-105 correction math is correct; one valid and one problematic polish suggestion.
lookupTransform(target=l.child_frame, source=params_.odom_frame, scan_tp)returnsT_baselink_odom, andtransform * odomOnBase_tf = T_map_baselink * T_baselink_odom = T_map_odom, which is what ends up stamped asmap -> odom. The scan-time sampling (vs. latest) is the right call to avoid biasing the correction by localizer latency, and the exception path correctly skips publishing rather than broadcasting empty frame_ids.One suggestion is valid, one is problematic:
tf2_ros::fromRclcpp(scan_stamp)expresses therclcpp::Time -> tf2::TimePointconversion more idiomatically than manually routing throughstd::chrono::nanoseconds— this is correct and recommended.
⚠️ Do not apply the arrow direction alignment. The error log at lines 1612-1614 correctly printsodom_frame -> child_frame(which isodom -> base_link), matching whatlookupTransform(l.child_frame, params_.odom_frame, ...)actually retrieves. The comment at line 1593 describes the desired mathematical form (base_link -> odom) but the code's direction is correct. Aligning the error log to match the comment would make the error message describe the wrong direction.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@mola_bridge_ros2/src/BridgeROS2.cpp` around lines 1591 - 1616, The code correctly samples TF at scan time and logs the odom->child_frame direction; fix the minor API use by replacing the manual rclcpp::Time -> tf2::TimePoint conversion with the idiomatic tf2_ros::fromRclcpp(scan_stamp) when calling tf_buffer_->lookupTransform (symbols: mrpt::ros2bridge::toROS, tf2_ros::fromRclcpp, tf_buffer_->lookupTransform, tf2::fromMsg), and do not change the MRPT_LOG_THROTTLE_ERROR_STREAM message direction (keep params_.odom_frame -> l.child_frame) because that matches the actual transform retrieved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@mola_bridge_ros2/src/BridgeROS2.cpp`:
- Around line 1591-1616: The code correctly samples TF at scan time and logs the
odom->child_frame direction; fix the minor API use by replacing the manual
rclcpp::Time -> tf2::TimePoint conversion with the idiomatic
tf2_ros::fromRclcpp(scan_stamp) when calling tf_buffer_->lookupTransform
(symbols: mrpt::ros2bridge::toROS, tf2_ros::fromRclcpp,
tf_buffer_->lookupTransform, tf2::fromMsg), and do not change the
MRPT_LOG_THROTTLE_ERROR_STREAM message direction (keep params_.odom_frame ->
l.child_frame) because that matches the actual transform retrieved.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9b761e45-224f-4969-9fa1-ea34f65e0435
📒 Files selected for processing (2)
mola_bridge_ros2/include/mola_bridge_ros2/BridgeROS2.hmola_bridge_ros2/src/BridgeROS2.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
- mola_bridge_ros2/include/mola_bridge_ros2/BridgeROS2.h
This brings the localization /tf publisher in line with the convention
used by AMCL, slam_toolbox, RTAB-Map and Cartographer, addressing three
issues observed when MOLA is consumed by Nav2-style downstream nodes:
1. REP-105 composition bug (fix). The inner factor of
map -> odom = (map -> base)(t_scan) * (base -> odom)(t)
was sampled at "latest" via waitForTransform(), which only supports
tf2::TimePoint{}. The two factors must be sampled at the same
instant or the published correction is biased by the odom-frame
motion accumulated during the localizer's processing latency. Switch
to tf_buffer_->lookupTransform(..., scan_tp). On lookup failure the
publish is skipped (an empty default-constructed TransformStamped
would inject TF_NO_FRAME_ID into every consumer's tf2 buffer).
2. transform_tolerance (new param, default 0.1s). The published stamp
is now node->now() + transform_tolerance so consumers can do
lookupTransform(map, base_link, now()) without
ExtrapolationException. Mirrors AMCL's transform_tolerance and
RTAB-Map's tf_tolerance. Uses the ROS clock so use_sim_time is
honored automatically.
3. transform_publish_period (new param, default 0.05s = 20 Hz). A
wall timer re-broadcasts the cached map -> odom independent of
localization update rate, keeping the TF buffer continuously fresh
even when the localizer runs slower than the consumer query rate.
Set to 0 to disable. Mirrors slam_toolbox's transform_publish_period
and RTAB-Map's tf_delay.
Backwards compat: to exactly preserve the prior (post-bugfix) stamping
behavior set transform_tolerance: 0 and transform_publish_period: 0.
publish_in_sim_time retains its meaning for all other publishers
(sensor TFs, odom messages, etc.); only the localization /tf stamp
source changed.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
mola_bridge_ros2/src/BridgeROS2.cpp (1)
234-256: Optional: also gate timer creation onpublish_localization_following_rep105.For users running strictly in direct mode, the periodic timer is still created (default 20 Hz) and wakes the executor only to early-return via
onlyIfRep105=true. Harmless, but you can skip scheduling it entirely in that case.♻️ Proposed tightening
- if (params_.transform_publish_period > 0.0) + if (params_.transform_publish_period > 0.0 && params_.publish_localization_following_rep105) { timerRebroadcastLocTf = rosNode_->create_wall_timer(Note: This assumes
publish_localization_following_rep105is effectively a one-time setting at startup (same read pattern used inpublishLocalizationTf). If you anticipate runtime flips, keep the current unconditional timer.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@mola_bridge_ros2/src/BridgeROS2.cpp` around lines 234 - 256, The timerRebroadcastLocTf is always created when params_.transform_publish_period > 0.0 even if publish_localization_following_rep105 is false, causing unnecessary wakeups that immediately early-return in broadcastCachedLocalizationTf(onlyIfRep105=true); change the condition to also check params_.publish_localization_following_rep105 (or the equivalent configuration flag used by publishLocalizationTf) so the wall timer is only created when both transform_publish_period > 0.0 and publish_localization_following_rep105 are true, leaving broadcastCachedLocalizationTf and the onlyIfRep105 argument unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@mola_bridge_ros2/src/BridgeROS2.cpp`:
- Around line 234-256: The timerRebroadcastLocTf is always created when
params_.transform_publish_period > 0.0 even if
publish_localization_following_rep105 is false, causing unnecessary wakeups that
immediately early-return in broadcastCachedLocalizationTf(onlyIfRep105=true);
change the condition to also check params_.publish_localization_following_rep105
(or the equivalent configuration flag used by publishLocalizationTf) so the wall
timer is only created when both transform_publish_period > 0.0 and
publish_localization_following_rep105 are true, leaving
broadcastCachedLocalizationTf and the onlyIfRep105 argument unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1a67a933-48e6-40e9-b38e-4cec2a5ce6ca
📒 Files selected for processing (2)
mola_bridge_ros2/include/mola_bridge_ros2/BridgeROS2.hmola_bridge_ros2/src/BridgeROS2.cpp
|
@robin-zealrobotics When possible, let's both of us test this one to see if it actually works as you expected for rep-105, and I'll test it works for non rep105 too. |
|
Was just testing it and about to report :) (in sim, not on robot yet). Looks good. |
Supersedes / revises #132. Keeps the three Nav2/REP-105 improvements from the original PR, but scopes them so they no longer change behavior (or introduce new failure modes) for users who do not use REP-105.
MOLA supports two TF publishing modes for the localization result:
publish_localization_following_rep105:=True. The bridge publishesmap → odomand an external source (wheel odometry, robot_state_publisher, …) providesodom → base_link. Used byAMCL, Nav2, slam_toolbox, etc.
publish_localization_following_rep105:=False(default for the Smoother, and the documented choice for handheld / drone / no-external-odometry setups). The bridge publishesmap → base_linkdirectly; there is no second TF edge carrying motion.PR #132 improved REP-105 mode but applied the same logic to both modes, which regressed direct mode in two ways:
node->now() + transform_tolerancestamping replaced the formermyNow(l.timestamp)(scan-time) stamping for both modes. Direct-mode consumers that correlate the pose with the scan that producedit (offline post-processing, sensor fusion, visualization against recorded bags) started getting future-dated stamps and
publish_in_sim_timewas silently bypassed.map → odomstaying constant is exactly the right semantics becauseodom → base_linkstill moves. Indirect mode there is no
odom → base_link: a stalled localizer produces a frozen pose repeatedly re-stamped as "current", which downstream planners / RViz cannot distinguish from a robot that is actuallystationary and healthy.
Changes vs #132
CachedLocalizationTfstruct carries the cached TF plususeRep105and the original scan timestamp, so the broadcaster knows how to stamp each entry.broadcastCachedLocalizationTf(bool onlyIfRep105 = false):node->now() + transform_tolerance(unchanged Nav2 behavior from feat(bridge_ros2): align REP-105 TF publishing with Nav2 conventions #132).myNow(scan_timestamp)(restores pre-feat(bridge_ros2): align REP-105 TF publishing with Nav2 conventions #132 semantics and honorspublish_in_sim_time).onlyIfRep105=true, skipping direct-mode entries entirely. Immediate broadcast on a new localization update still fires in both modes.transform_toleranceandtransform_publish_periodupdated to state they apply to REP-105 mode only, and to explain why direct mode is excluded.Behavior matrix
publish_in_sim_timehonoredmap → odom)now() + tolerancetransform_publish_perioduse_sim_time)map → base_link)myNow(scan_timestamp)Compatibility
developbefore feat(bridge_ros2): align REP-105 TF publishing with Nav2 conventions #132 — scan-time stamping, no hidden rebroadcast,publish_in_sim_timerespected.Test plan
map → odomis future-dated and rebroadcast at 20 Hz (unchanged from feat(bridge_ros2): align REP-105 TF publishing with Nav2 conventions #132).map → base_linkis stamped at scan time andtf2_echoshows stamps stop advancing when the localizer is paused.publish_in_sim_time:=Truereplaying a bag: confirm TF stamps track the bag clock, not wallclock.transform_publish_period:=0: confirm TFs are only emitted on new localization updates.Original PR
This brings the localization /tf publisher in line with the convention used by AMCL, slam_toolbox, RTAB-Map and Cartographer, addressing three issues observed when MOLA is consumed by Nav2-style downstream nodes:
REP-105 composition bug (fix). The inner factor of map -> odom = (map -> base)(t_scan) * (base -> odom)(t) was sampled at "latest" via waitForTransform(), which only supports tf2::TimePoint{}. The two factors must be sampled at the same instant or the published correction is biased by the odom-frame motion accumulated during the localizer's processing latency. Switch to tf_buffer_->lookupTransform(..., scan_tp). On lookup failure the publish is skipped (an empty default-constructed TransformStamped would inject TF_NO_FRAME_ID into every consumer's tf2 buffer).
transform_tolerance (new param, default 0.1s). The published stamp is now node->now() + transform_tolerance so consumers can do lookupTransform(map, base_link, now()) without ExtrapolationException. Mirrors AMCL's transform_tolerance and RTAB-Map's tf_tolerance. Uses the ROS clock so use_sim_time is honored automatically.
transform_publish_period (new param, default 0.05s = 20 Hz). A wall timer re-broadcasts the cached map -> odom independent of localization update rate, keeping the TF buffer continuously fresh even when the localizer runs slower than the consumer query rate. Set to 0 to disable. Mirrors slam_toolbox's transform_publish_period and RTAB-Map's tf_delay.
Backwards compat: to exactly preserve the prior (post-bugfix) stamping behavior set transform_tolerance: 0 and transform_publish_period: 0. publish_in_sim_time retains its meaning for all other publishers (sensor TFs, odom messages, etc.); only the localization /tf stamp source changed.
Summary by CodeRabbit
New Features
Bug Fixes