http://www.alexeyfv.xyz//2024/01/15/class-vs-struct.html #22

Open
utterances-bot opened this issue Dec 1, 2024 · 1 comment

Comments

@utterances-bot

The 16-Byte Rule: Unraveling the Performance Mystery of C# Structures | alexeyfv

Many C# developers are familiar with the following statements. By default, when passing to or returning from a method, an instance of a value type is copied, while an instance of a reference type is passed by reference. This has led to a belief that utilizing structures may degrade the overall app performance, especially if the structure has a size greater than 16 bytes. The discussion about this is still ongoing. In this article, we will try to find the truth.

http://www.alexeyfv.xyz//2024/01/15/class-vs-struct.html

I was also interested in this topic and got the same results in my benchmark. Structures up to 64 bytes are copied in almost constant time. At 64 bytes, copying slows down by some tens of percent. From 64 to 128 bytes there is a sharp degradation of approximately 170%, and after that copying slows down linearly as the structure size increases by 2, 4, 8 times, until the structure can no longer fit into AVX registers.

The disassembly in my case (Zen 3 architecture) shows that 256-bit YMM registers are used. But why does the sharp degradation start at 64 bytes? The disassembly does not answer this. The article does not cover it either, but I assume that the CPU pipeline architecture, with its 8-way associative block (in the case of Zen 3), comes into play and becomes the bottleneck.

In the newer Zen 5, the associative block has been expanded to 12-way, and if the author happens to have such a CPU, it would be very interesting to see a new benchmark.
