fix(newlib): usleep returning early (IDFGH-14342) #15132
base: master
Conversation
Aren't these changes the same as the described changes to C++'s |
The change I'm proposing with this PR is the same method Let me play out some examples to clarify what is going on. With a system tick period of 10ms, calling In IDF 5, calling So what options do we have? If we make So if we must modify
Option (1) is a non-starter for me. We need a supported way of sleeping accurately for small time periods. Plus, it would break a lot of code currently in the field.

Option (2) works, but will often sleep longer than we need to. With a 10ms system tick, a 15ms sleep sometimes needs only 2 ticks, sometimes 3. So why not use option (3) and just check to see if we really need the extra tick?

Option (4) is a non-starter because we don't have an arbitrary number of hardware timers, but we do have an arbitrary number of threads that might sleep.

Option (5) is interesting, and may be possible. However, it requires an API to be developed that will read the hardware timer registers being employed for the system tick, which I am sure differs across all the architectures supported. If we know exactly when the next system tick is about to occur, we can correctly calculate how many ticks we need to sleep without subsequently double-checking the monotonic clock. This would be awesome, but a lot more work. It might not even be possible for some reason I am unaware of.

The fact that is the same approach used in
As I already mentioned, calling EDIT: One additional caveat/issue which is not fixed by this PR is calling |
behavior varies for different sleep durations |
Thanks, that covers my addendum edit. The original issues remain: calling sleep_for() with a time larger than the system tick causes a portion of the time to be spent in a blocking wait in IDF 5, and usleep() almost always sleeps for less than the specified time when given a time longer than a tick period.
@stevenoonan but the guarantee is not broken with |
@safocl The main point is that the guarantee of |
Description
The current implementation of `usleep()` can return in less than the specified time, which breaks assumptions that callers of `usleep()` have. This proposed change ensures `usleep()` never returns in less than the specified time.

From `man 3 usleep`:

Returning before the specified time can cause great inefficiencies when using C++'s `std::this_thread::sleep_for()`.

I stumbled upon this when I upgraded a project from IDF 4.4 to IDF 5.3.1 (GCC got updated to 13.2). After updating, my project was running sluggishly. I profiled the code and found a thread was using far more processing time than before. It was a simple loop with a `std::this_thread::sleep_for(10ms)` call.

Specifically, the changes in commit gcc-mirror/gcc@cfef4c3 to libstdc++'s `__sleep_for()` are what cause the behavior to change. But the problem is in `usleep()`. In the older version of libstdc++'s `__sleep_for()`, `usleep()` is always called once. The newer version added a while loop and a check of the monotonic clock to verify that the sleep did not come back early; if it did, it keeps calling `usleep()`.

The only documented reason a sleep should return early is due to a signal, but of course that is not the case on the ESP32.
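The interaction between the retry loop and an early-returning `usleep()` can be reproduced on a host with a small simulation. This is only a sketch under stated assumptions: all `sim_*` names are hypothetical, the clock is a plain counter, `sim_task_delay()` models `vTaskDelay(n)` as "wake at the n-th tick boundary after now", and `sim_usleep_current()` models the current `usleep()` as described in this PR (round up to ticks, busy-wait for anything shorter than one tick). It is not the libstdc++ or ESP-IDF source.

```c
#include <assert.h>
#include <stdint.h>

#define US_PER_TICK 10000u  /* assumed 10 ms tick period */

/* Simulated monotonic clock and busy-wait accounting (host-side model only). */
static uint64_t sim_now_us;
static uint64_t sim_busy_wait_us;

/* Model of vTaskDelay(ticks): the task wakes at the ticks-th tick boundary
 * after the current time, so the first tick may arrive almost immediately. */
static void sim_task_delay(uint64_t ticks)
{
    sim_now_us = (sim_now_us / US_PER_TICK + ticks) * US_PER_TICK;
}

/* Model of the current usleep(): busy-wait below one tick period,
 * otherwise round the duration up to whole ticks. */
static void sim_usleep_current(uint64_t us)
{
    if (us < US_PER_TICK) {
        sim_now_us += us;           /* stands in for esp_rom_delay_us() */
        sim_busy_wait_us += us;
    } else {
        sim_task_delay((us + US_PER_TICK - 1) / US_PER_TICK);
    }
}

/* Retry loop in the spirit of the newer __sleep_for(): keep calling the
 * sleep primitive until the monotonic clock confirms the deadline passed.
 * Returns the number of usleep calls that were needed. */
static int sim_sleep_for(uint64_t us)
{
    assert(us > 0);
    const uint64_t deadline = sim_now_us + us;
    int calls = 0;
    while (sim_now_us < deadline) {
        sim_usleep_current(deadline - sim_now_us);
        calls++;
    }
    return calls;
}
```

In this model, starting 1ms before a tick boundary (`sim_now_us = 9000`) and calling `sim_sleep_for(10000)` wakes after the first tick at 10ms, then spends the remaining 9ms in the busy-wait branch. That matches the symptom of a 10ms `sleep_for()` loop consuming far more CPU time than expected.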
The reason `usleep()` returns early is that it does not compensate for worst-case systick ISR timing. Let's play out an example. If we wanted to sleep for 15ms where the tick period is 10ms, the calculations would give us `vTaskDelay(2)`:

- `portTICK_PERIOD_MS` is defined as `10`. So `us_per_tick` is `10,000`.
- `us` is `15000`
- `(15000 + 10000 - 1) / 10000` is `2.4999`, so `2` ticks

Two ticks have a minimum of 10ms, because the first tick ISR can happen immediately, so you can only count on n-1 tick periods of time. So of course 10ms is less than 15ms. The current `(us + us_per_tick - 1) / us_per_tick` compensation attempt is broken. If you do the math for a 10ms wait you'll get `1` tick, which has a minimum of 0 tick periods. But even if it does get half (or 90%) of a period of time slept, we still haven't slept the minimum time we should have.

This means `usleep()` will return to libstdc++'s `__sleep_for()`, the monotonic clock check fails, and `usleep()` is called again with the remainder of the time. This will always be less than a tick period, so `esp_rom_delay_us()` is called. I don't have the source code for `esp_rom_delay_us()`, but I assume it is a blocking busy wait.

Any time spent in `esp_rom_delay_us()` is quite inefficient for scheduling. If the thread blocking here is the highest-priority thread on the system, it will prevent FreeRTOS from time slicing to any other ready-state thread during that time. That was exactly what was happening in my project.

My fix is to double-check the monotonic clock in `usleep()` and call `vTaskDelay()` an additional time when needed. This guarantees `usleep()` will never return early, though it may sometimes return later. That is to spec, and it also means libstdc++ won't call `usleep()` in a way that causes `esp_rom_delay_us()` to be called unexpectedly.

For a 15ms sleep with a 10ms systick period, the absolute minimal tick count is `2`. My change does a simple division, the equivalent of `(15000/10000)+1`, which gives `2.5`, or int `2`. In the worst case another tick is needed because, as stated above, a tick count of 2 means a worst case of only 10ms waited, so the loop will check for that and call `vTaskDelay()` again.

Related
__sleep_for()
__sleep_for()
Testing
On my own project, as stated above.
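
For anyone who wants to check the arithmetic from the description without hardware, a host-side sketch along these lines may help. It is an illustration under assumptions, not the code in this PR: every `sim_*` name and both helper functions are hypothetical, the clock is a simulated counter, `sim_task_delay()` models `vTaskDelay(n)` as waking at the n-th tick boundary after now, and the re-check loop simply delays one more tick at a time.

```c
#include <assert.h>
#include <stdint.h>

#define US_PER_TICK 10000u  /* assumed 10 ms tick period */

/* Simulated monotonic clock and a counter of delay calls (model only). */
static uint64_t sim_now_us;
static unsigned sim_delay_calls;

/* Model of vTaskDelay(ticks): wake at the ticks-th tick boundary after now. */
static void sim_task_delay(uint64_t ticks)
{
    sim_now_us = (sim_now_us / US_PER_TICK + ticks) * US_PER_TICK;
    sim_delay_calls++;
}

/* Tick count produced by the current round-up formula. */
static uint64_t ticks_round_up(uint64_t us)
{
    return (us + US_PER_TICK - 1) / US_PER_TICK;
}

/* Worst-case time covered by a delay of `ticks`: the first tick ISR can
 * fire immediately, so only ticks - 1 full periods are guaranteed. */
static uint64_t min_sleep_us(uint64_t ticks)
{
    return ticks == 0 ? 0 : (ticks - 1) * US_PER_TICK;
}

/* Sketch of a checked sleep for durations of at least one tick period:
 * delay for us / US_PER_TICK + 1 ticks, then re-check the clock and
 * delay one more tick at a time until the target time has passed. */
static void sim_usleep_checked(uint64_t us)
{
    assert(us >= US_PER_TICK);
    const uint64_t target = sim_now_us + us;
    sim_task_delay(us / US_PER_TICK + 1);
    while (sim_now_us < target) {
        sim_task_delay(1);
    }
}
```

With these definitions, `ticks_round_up(15000)` is 2 and `min_sleep_us(2)` is 10000, which is the shortfall described above. Starting the simulated clock at 9000 and calling `sim_usleep_checked(15000)` takes one extra one-tick delay and returns at 30000, later than the 24000 target but never earlier, and without any busy-wait.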