Service clients freeze on multi-client cases. #372
Comments
Additional information (thanks @wuisky): the issue is that responses are sent to all clients. @fujitatomoya An easy solution is to change the default QoS parameters from KEEP_LAST to KEEP_ALL. rmw/rmw/include/rmw/qos_profiles.h, line 67 in cf6b0dd
I tried your reproducible examples, both the service server and the client, and no process freeze is observed in my environment.
# service server
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 run prover_rclpy rmw_372_server --ros-args --log-level debug
... <keeps logging>
# service client (print once in a thousand)
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 run prover_rclpy rmw_372_client
[INFO] [1725490218.437069421] [minimal_client_async]: Result of add_two_ints: for 1 + 2 = 3
[INFO] [1725490244.581709735] [minimal_client_async]: Result of add_two_ints: for 1 + 2 = 3
[INFO] [1725490324.383530299] [minimal_client_async]: Result of add_two_ints: for 1 + 2 = 3
[INFO] [1725490461.426458776] [minimal_client_async]: Result of add_two_ints: for 1 + 2 = 3
[INFO] [1725490658.204459264] [minimal_client_async]: Result of add_two_ints: for 1 + 2 = 3
[INFO] [1725490919.450157012] [minimal_client_async]: Result of add_two_ints: for 1 + 2 = 3

The above data tells me that the response time on the service server increases significantly as time goes on.
Can you rephrase?
I do not think so, because your sample client calls
Could you tell us why you think ros2/rclcpp#2397 is the root cause of this? As mentioned above, the server deals with 3 requests at most. As far as I can see, this is an rclpy executor performance problem.
Ref: ros2/rmw#372 Signed-off-by: Tomoya Fujita <[email protected]>
@fujitatomoya I agree the current QoS depth is enough if the clients run their loops at equal rates. The increase in response time may be another issue.
I investigated the behavior of lost service responses. The increase in loop period may cause the loss of service responses. I will also look into the cause of the increase in loop period.
@kjjpc really appreciate your effort on this issue. I am now interested in the fact that some response messages are lost in the clients... If that is an RMW implementation problem, would you try https://github.com/ros2/rmw_cyclonedds (or https://github.com/ros2/rmw_connextdds) to see if they have the same problem as https://github.com/ros2/rmw_fastrtps? If cyclone and connextdds can keep the high frequency rate between clients and the service server, we could say this is an issue with https://github.com/ros2/rmw_fastrtps. That does not find the root cause yet, but I think it is worth taking a shot.
@fujitatomoya Thanks for your reply. I think the slowdown of the spin loop is an issue in rmw or rcl, and the loss of service responses is an issue of the current architecture and QoS parameters.
Bug report
Steps to reproduce issue
I ran 1 service server and 3 service clients that call the service at high frequency as a stress test.
Attached are Python scripts, but the same result occurs with C++ code.
Service Server
Service Client (run on 3 terminals)
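The attached scripts are not included in this capture. Below is a hypothetical reconstruction of what such a stress test might look like, assuming the standard example_interfaces AddTwoInts service; the node names and the once-per-thousand logging follow the client logs quoted earlier in the thread, but everything else is an illustrative sketch, not the original attachment.

```python
# Hypothetical reconstruction of the reproduction scripts (not the
# original attachment). Run `server` in one terminal and the client
# (no argument) in three others; multiple concurrent clients trigger
# the reported freeze in spin_until_future_complete.
import sys

import rclpy
from rclpy.node import Node
from example_interfaces.srv import AddTwoInts  # assumed service type


class MinimalServer(Node):
    def __init__(self):
        super().__init__('minimal_server')
        self.create_service(AddTwoInts, 'add_two_ints', self.on_request)

    def on_request(self, request, response):
        response.sum = request.a + request.b
        return response


class MinimalClientAsync(Node):
    def __init__(self):
        super().__init__('minimal_client_async')
        self.cli = self.create_client(AddTwoInts, 'add_two_ints')
        while not self.cli.wait_for_service(timeout_sec=1.0):
            self.get_logger().info('service not available, waiting again...')

    def send_request(self, a, b):
        req = AddTwoInts.Request()
        req.a, req.b = a, b
        future = self.cli.call_async(req)
        # The reported freeze happens here: the future never completes
        # when this client's response was dropped from its reader queue.
        rclpy.spin_until_future_complete(self, future)
        return future.result()


def main():
    rclpy.init()
    if len(sys.argv) > 1 and sys.argv[1] == 'server':
        rclpy.spin(MinimalServer())
    else:
        client = MinimalClientAsync()
        count = 0
        while rclpy.ok():
            result = client.send_request(1, 2)
            count += 1
            if count % 1000 == 0:  # print once in a thousand, as in the logs
                client.get_logger().info(
                    'Result of add_two_ints: for 1 + 2 = %d' % result.sum)
    rclpy.shutdown()


if __name__ == '__main__':
    main()
```

This sketch requires a ROS 2 environment with rclpy and example_interfaces installed.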
Expected behavior
Results from the service server continue to be displayed repeatedly.
Actual behavior
The service client sometimes freezes in spin_until_future_complete and does not progress at all once frozen. Even when 2 of the 3 clients are terminated, the frozen client does not resume its computation loop.
The freeze occurs not only in high-frequency cases but also in low-frequency multi-client cases.
Additional information
This issue is reported on ros2/rcl#1163.
The freeze is resolved by changing the QoS parameter as described below.
The default QoS parameter of services is KEEP_LAST with a depth of 10, which is not enough for multi-client cases.
In ROS 2, every service response from the service server is published to all clients. This means a client's buffer can be filled up by responses intended for other callers, so the response for the caller itself can be lost.
The default QoS parameters of service clients are not suitable for multi-client cases, which are common in ROS 2. The default QoS parameters should be changed from KEEP_LAST to KEEP_ALL.
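The failure mode described above can be modeled without ROS at all. The sketch below is a plain-Python model, not rclpy code; the `simulate` helper and the round-robin request schedule are illustrative assumptions. It delivers every response to every client's bounded reader queue, as the issue describes, and checks how many of client 0's own responses survive.

```python
from collections import deque


def simulate(num_clients, depth, requests_per_client):
    """Model service responses delivered to every client's reader queue.

    depth=None models KEEP_ALL (unbounded); an int models KEEP_LAST(depth).
    Returns how many of client 0's own responses are still in its queue.
    """
    queues = [deque(maxlen=depth) for _ in range(num_clients)]
    # The server answers requests round-robin, and every response is
    # pushed to every client's queue, since all clients read the same
    # response channel (per the issue description above).
    for seq in range(num_clients * requests_per_client):
        caller = seq % num_clients
        for q in queues:
            q.append((caller, seq))
    # Client 0 drains its queue only now (e.g. its loop slowed down):
    # responses meant for the other callers may have evicted its own.
    return sum(1 for caller, _ in queues[0] if caller == 0)


if __name__ == '__main__':
    sent = 20
    kept_last = simulate(num_clients=3, depth=10, requests_per_client=sent)
    keep_all = simulate(num_clients=3, depth=None, requests_per_client=sent)
    print(f'KEEP_LAST(10): {kept_last}/{sent} own responses survive')
    print(f'KEEP_ALL:      {keep_all}/{sent} own responses survive')
```

With 3 clients and a depth-10 queue, most of client 0's responses are evicted by the other callers' responses; with an unbounded (KEEP_ALL) queue, none are. A client blocked in spin_until_future_complete waiting for an evicted response is exactly the reported freeze.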
The parameter is defined in qos_profiles.h.
rmw/rmw/include/rmw/qos_profiles.h
Line 66 in cf6b0dd
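A client-side override along these lines is presumably what the workaround looks like in rclpy (a hedged sketch: `node` and the AddTwoInts service are placeholders, and whether KEEP_ALL is acceptable on memory-constrained systems is a separate question).

```python
# Sketch: create the service client with KEEP_ALL history instead of the
# default KEEP_LAST depth-10 profile (names here are illustrative).
from rclpy.qos import QoSProfile, HistoryPolicy, ReliabilityPolicy
from example_interfaces.srv import AddTwoInts

qos = QoSProfile(
    history=HistoryPolicy.KEEP_ALL,          # default is KEEP_LAST, depth 10
    reliability=ReliabilityPolicy.RELIABLE,  # services default to reliable
)
# `node` is an existing rclpy Node instance.
client = node.create_client(AddTwoInts, 'add_two_ints', qos_profile=qos)
```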