-
Notifications
You must be signed in to change notification settings - Fork 525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow Error
and Result<()>
to be the same size as HRESULT
#3126
Allow Error
and Result<()>
to be the same size as HRESULT
#3126
Conversation
I'm not very familiar in this area. Is that a guaranteed and/or documented compiler niche optimization? Would be awkward if this changed and resulted in net zero or negative savings. |
It's a guaranteed optimization. It's the reason that the There are static assertions in the PR that verify this. If anything changed, we would see a build break. But these optimizations are guaranteed, so these assertions are more about documentation than anything else. |
Is there any benchmark data that demonstrates this is more than just a theoretical optimization? Not trying to be a blocker, just trying to prevent unnecessary downstream churn. |
Yes. This change alone removes 1.2% of the entire binary size for DWriteCore. The code is smaller, the exception handling tables are smaller. Text layout benchmarks show meaningful increases in speed as well, which makes sense because these components make extensive use of COM, including purely inside DWriteCore, not just between DWriteCore and external apps. Keep in mind that DWriteCore is not yet pure Rust; it's about 65% Rust. So that number will go even higher as I convert more of DWriteCore to Rust. Or, to put it a different way, I see this much regression when I move from |
The size optimization looks neat. As someone who spent a great deal of effort optimizing code gen for call sites in C++, I can see how this would make a meaningful improvement. Mainly I'm concerned about the following:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the change to crates/libs/windows/Cargo.toml
intentional?
Something seems wrong with the let error_from_win32 = windows_result::Error::from_win32();
|
Hmmm. I may have to re-introduce the It's strange, though, because the NatVis tests pass locally. |
I wouldn't spend too much time on it. I've spent a heavy amount of time with natvis and due to the split in expression engines (Visual Studio / windbg), it's very difficult to get right. Adding VS expression engine tests to start chipping away at this problem is on my TODO. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sweet - thanks!
Many codebases use
HRESULT
(whether they use COM or not) and yet never useIErrorInfo
. For those codebases, the size of theResult<()>
object can be a substantial cost. Currently on 64-bit platforms,size_of<Result<()>>
is 16 bytes, when ideally it would be basically a smartly-typed version ofHRESULT
. For example, Direct3D, DirectWrite, etc. never useIErrorInfo
(they don't set any error info objects on failures), so all of the code that uses these codebases will suffer in performance.This PR allows you to statically disable the error info within
Error
, by enabling theslim-errors
feature in thewindows-result
crate. (Thewindows
andwindows-core
crates will also propagate the feature towindows-result
, so that you don't need to add an extra crate dependency.)The performance costs come from the larger code and larger stack area needed. There is also a significant penalty because
IErrorInfo
requires that theError
object have aDrop
implementation. This requires the compiler to precisely track the lifetime ofError
values and even of theResult<()>
value. This adds a lot of extra code that almost never runs, to many function bodies.This PR also adds a new
NonZeroHRESULT
, which is equivalent toHRESULT
except that it contains aNonZeroI32
, not ani32
. This means that it cannot representS_OK
, which is fine. This gives the Rust compiler a "niche", which is a byte pattern that cannot be represented by one variant of an enum, which frees up that byte pattern for a different variant or for enum's "tag". In our case, the compiler will generate type layouts forOption<NonZeroHRESULT>
andResult<(), NonZeroHRESULT>
that are the same size and alignment as the underlyingHRESULT
. This is great because it allows them to be stored in a single register, and in many situations the compiler can totally avoid type layout conversions and can produce much better code.The
Error
type now usesNonZeroHRESULT
. When theslim-errors
feature is enabled,size_of::<Error>() == 4
andsize_of::<Result<()>>() == 4
. Before this change,size_of::<Error>() == 16
andsize_of::<Result<()>>() == 24
.This PR improves these types even when not using
slim-errors
. After this PR, withslim-errors
disabled,size_of::<Result<()>>() == 16
, 8 bytes less, because of the niche optimization relating toNonZeroHRESULT
.