Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Futex implementation is quite complex and still wrong at times #13443

Open
BlocksOnAChain opened this issue Dec 17, 2024 · 0 comments · May be fixed by #13754
Open

Futex implementation is quite complex and still wrong at times #13443

BlocksOnAChain opened this issue Dec 17, 2024 · 0 comments · May be fixed by #13754
Assignees
Labels
MT cannon - audit findings grouping for audit findings MT cannon - Mainnet relevant issues needed to complete the work for our Mainnet release

Comments

@BlocksOnAChain
Copy link

Description
Except from finding #7, the FUTEX_WAIT and FUTEX_WAKE mechanism is probably sound, considering how applications make use of futexes, but not the correct behavior.

To be consistent with Linux:

Changing the memory value must not automatically wake the thread;
Keeping the memory value must not prevent it from being woken. Another thread calling FUTEX_WAKE is all it takes;
Calling FUTEX_WAKE needs to wake as many threads as supplied in val, not just one.
That said, any sane application will be protected from spurious wakes, thus countering issue 1., and they will only rely on the futex state to decide whether to wake up or go back to sleep, countering issue 2. Issue 3. is countered by the fact that every thread is scheduled eventually, regardless if it is asleep or not, so threads not awaken by FUTEX_WAKE will be awaken by the scheduler, provided the futex value allows it.

So, the current behavior is sound if the application is not doing anything "creative", and just using futex as intended (i.e. using it as a semaphore/mutex).

That is a fair assumption that can be expected from normal applications.

The problem is: there is a much simpler implementation, with better performance, that only requires the application to be protected from spurious wakes (pretty much a requirement for using futex(), documented in the man page).

Recommendation
Ignore futexes almost entirely, remove state entries related to them, and treat FUTEX_WAIT as if (*uaddr == val) { sched_yield(); } (mind finding #7, as it still applies in this case), or even simpler, as a plain sched_yield(), as the only possibility of *uaddr != val is if the thread was preempted between the userspace variable check and the futex() call, which is a very rare occurrence, and there is no harm in yielding anyway.

Given the static nature of the scheduler, it may be a good idea that FUTEX_WAKE is treated as sched_yield(), too, so to prevent starvation in case there is a thread spending most of its time holding a lock of interest to some other thread (not the case in most well written programs).

In this proposal, every thread is waked every time. This makes sense in MTCannon because every thread is rotated every time through the scheduler, so there is very little gain in number of state steps in skipping those threads. This is unlike Linux, where the vast majority of tasks are asleep and not even considered for scheduling.

Of course, this is the simplest extreme. A middle ground would be to remove the state.wakeup and the wakeup procedure, as it adds nothing.

packages/contracts-bedrock/src/cannon/MIPS64.sol
// Search for the first thread blocked by the wakeup call, if wakeup is set
// Don't allow regular execution until we resolved if we have woken up any thread.
if (state.wakeup != sys.FUTEX_EMPTY_ADDR) {
if (state.wakeup == thread.futexAddr) {
// completed wake traversal
// resume execution on woken up thread
state.wakeup = sys.FUTEX_EMPTY_ADDR;
return outputState();
} else {
bool traversingRight = state.traverseRight;
bool changedDirections = preemptThread(state, thread);
if (traversingRight && changedDirections) {
// then we've completed wake traversal
// resume thread execution
state.wakeup = sys.FUTEX_EMPTY_ADDR;
}
return outputState();
}
}
packages/contracts-bedrock/src/cannon/MIPS64.sol
// check if thread is blocked on a futex
if (thread.futexAddr != sys.FUTEX_EMPTY_ADDR) {
// if set, then check futex
// check timeout first
if (state.step > thread.futexTimeoutStep) {
// timeout! Allow execution
return onWaitComplete(thread, true);
} else {
uint64 mem = MIPS64Memory.readMem(
state.memRoot,
thread.futexAddr & arch.ADDRESS_MASK,
MIPS64Memory.memoryProofOffset(MEM_PROOF_OFFSET, 1)
);
if (thread.futexVal == mem) {
// still got expected value, continue sleeping, try next thread.
preemptThread(state, thread);
return outputState();
} else {
// wake thread up, the value at its address changed!
// Userspace can turn thread back to sleep if it was too sporadic.
return onWaitComplete(thread, false);
}
}
}

@BlocksOnAChain BlocksOnAChain added MT cannon - Mainnet relevant issues needed to complete the work for our Mainnet release MT cannon - audit findings grouping for audit findings labels Dec 17, 2024
@Inphi Inphi self-assigned this Dec 19, 2024
@pauldowman pauldowman assigned mbaxter and unassigned Inphi Jan 7, 2025
@mbaxter mbaxter moved this from TO-DO to In progress in Proofs team - Project Delivery tracking board Jan 8, 2025
@mbaxter mbaxter linked a pull request Jan 14, 2025 that will close this issue
@pauldowman pauldowman moved this from In progress to In review in Proofs team - Project Delivery tracking board Jan 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
MT cannon - audit findings grouping for audit findings MT cannon - Mainnet relevant issues needed to complete the work for our Mainnet release
Projects
Development

Successfully merging a pull request may close this issue.

3 participants