failed to create domain error when spawning many python nodes at once from launch file with cyclonedds #1212

firesurfer · 2024-01-23T09:30:27Z

Bug report

I have a launch file that starts a fairly large number of Python nodes (~15-20). For some of those nodes I get an error like this:

[spawner-33] 1706000264.234912 [5]    spawner: Failed to find a free participant index for domain 5
[spawner-33] [ERROR] [1706000264.234976868] [rmw_cyclonedds_cpp]: rmw_create_node: failed to create domain, error Error
[spawner-33] 
[spawner-33] >>> [rcutils|error_handling.c:108] rcutils_set_error_state()
[spawner-33] This error state is being overwritten:
[spawner-33] 
[spawner-33]   'error not set, at ./src/rcl/node.c:262'
[spawner-33] 
[spawner-33] with this new error message:
[spawner-33] 
[spawner-33]   'rcl node's rmw handle is invalid, at ./src/rcl/node.c:433'
[spawner-33] 
[spawner-33] rcutils_reset_error() should be called after error handling to avoid this.
[spawner-33] <<<
[spawner-33] [ERROR] [1706000264.235045107] [rcl]: Failed to fini publisher for node: 1
[spawner-33] Traceback (most recent call last):
[spawner-33]   File "/opt/ros/iron/lib/controller_manager/spawner", line 33, in <module>
[spawner-33]     sys.exit(load_entry_point('controller-manager==3.21.2', 'console_scripts', 'spawner')())
[spawner-33]   File "/opt/ros/iron/lib/python3.10/site-packages/controller_manager/spawner.py", line 207, in main
[spawner-33]     node = Node("spawner_" + controller_names[0])
[spawner-33]   File "/opt/ros/iron/lib/python3.10/site-packages/rclpy/node.py", line 185, in __init__
[spawner-33]     self.__node = _rclpy.Node(
[spawner-33] rclpy._rclpy_pybind11.RCLError: error creating node: rcl node's rmw handle is invalid, at ./src/rcl/node.c:433

I submitted this in the rclpy repository because it only seems to happen for Python nodes (perhaps because there are so many of them?). The exact nodes that fail during startup change between two runs.

Required Info:

Operating System:
- Ubuntu 22.04 - Podman container
Installation type:
- Binary
Version or commit hash:
- rclpy : 4.1.4
DDS implementation:
- rmw_cyclonedds_cpp: 1.6.0
Client library (if applicable):
- rclpy

Steps to reproduce issue

Have a launch file that starts many Python nodes at once. In my case I have a lot of controller spawners from ros2_control:

from launch_ros.actions import Node

servo_status_spawner = Node(
    package="controller_manager",
    executable="spawner",
    arguments=["status_controller_servo",
               "--controller-manager", "/controller_manager"],
)

# And many, many more

Expected behavior

All nodes should start.

Actual behavior

At least 4-5 nodes fail with:

[spawner-33] 1706000264.234912 [5]    spawner: Failed to find a free participant index for domain 5
[spawner-33] [ERROR] [1706000264.234976868] [rmw_cyclonedds_cpp]: rmw_create_node: failed to create domain, error Error

(See above for full error log)
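For context on what a "participant index" is: as far as I understand it, when participant indices are in use, each DDS participant on a host claims one index, and that index determines the unicast ports it binds. The sketch below computes those ports for domain 5, assuming the default port-mapping constants from the DDSI-RTPS specification (PB=7400, DG=250, PG=2, d1=10, d3=11). The function name `unicast_ports` is made up for illustration, and the snippet does not model how Cyclone DDS actually searches for a free index.

```python
# Default port-mapping constants from the DDSI-RTPS specification
# (assumed here; a vendor configuration can override them).
PB = 7400  # port base
DG = 250   # domain gain
PG = 2     # participant gain
D1 = 10    # offset for unicast discovery traffic
D3 = 11    # offset for unicast user traffic


def unicast_ports(domain_id: int, participant_index: int) -> tuple[int, int]:
    """Return (discovery_port, user_port) for one participant index."""
    base = PB + DG * domain_id + PG * participant_index
    return base + D1, base + D3


# Hypothetical illustration: the first few indices on ROS_DOMAIN_ID=5.
for index in range(3):
    print(index, unicast_ports(5, index))
```

If the range of indices that may be tried is bounded, processes started at the same moment can run out of free slots, which would be consistent with the error above.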

Additional information

As said above the setup runs in a podman container. I will test it this week in a native installation.
Environment settings:

export ROS_DOMAIN_ID=5
source /opt/ros/iron/setup.bash
export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
export ROS_AUTOMATIC_DISCOVERY_RANGE=LOCALHOST


clalancette · 2024-01-23T18:06:42Z

This is likely another case of ros2/rmw_cyclonedds#458 .

firesurfer · 2024-01-24T08:04:14Z

@clalancette I can confirm this.

The solution presented in: ros2/rmw_cyclonedds#458 (comment)
worked for me.

To be precise, I had to use the first line:

export CYCLONEDDS_URI='<CycloneDDS><Domain><Discovery><ParticipantIndex>none</ParticipantIndex></Discovery></Domain></CycloneDDS>'

The second suggestion that also enables multicast didn't work for me as I then got the error message:

[spawner-32] 1706083268.665794 [5]    spawner: selected interface "lo" is not multicast-capable: disabling multicast
[spawner-32] 1706083268.667774 [5]    spawner: Failed to find a free participant index for domain 5
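In case it helps someone who prefers to set this from Python rather than a shell profile, here is a minimal sketch that builds the same `CYCLONEDDS_URI` value as the first-line workaround above and puts it into an environment dictionary. It only uses the standard library; the helper name `build_cyclonedds_uri` is hypothetical, and I have only tested the plain `export` form myself.

```python
import os
import xml.etree.ElementTree as ET


def build_cyclonedds_uri(participant_index: str = "none") -> str:
    """Build the inline XML configuration used in the workaround."""
    root = ET.Element("CycloneDDS")
    domain = ET.SubElement(root, "Domain")
    discovery = ET.SubElement(domain, "Discovery")
    ET.SubElement(discovery, "ParticipantIndex").text = participant_index
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


uri = build_cyclonedds_uri()

# Copy of the current environment with the override added; this does not
# modify the environment of the running process.
env = dict(os.environ, CYCLONEDDS_URI=uri)
print(env["CYCLONEDDS_URI"])
```

The resulting `env` could then be handed to whatever starts the nodes, for example `subprocess.run(..., env=env)`.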

clalancette · 2024-01-24T13:12:17Z

> @clalancette I can confirm this.

Thanks. I'm going to close this one in favor of that one.

firesurfer changed the title ~~rmw_create_node when spawning many python nodes at once from launch file with cyclonedds~~ failed to create domain error when spawning many python nodes at once from launch file with cyclonedds Jan 23, 2024

clalancette closed this as not planned Jan 24, 2024
