How can I use mlc to stress UPI and drive UPI utilization up to 100%? #517
Replies: 9 comments
-
Could you please share the output of MLC when it is run with default (no) parameters? AFAIK, the UPI protocol has variable data packing efficiency (it depends on the traffic pattern). In some cases it can reach 35.8 GBytes/second.
-
I have posted the mlc output as text.
-
According to the MLC output, the max bandwidth on your system is 60 GByte/sec even for local accesses. UPI is not the limiter here; you are limited by the DRAM bandwidth of your system. The three UPI links can support more bandwidth (the incoming UPI data traffic in your screenshot was 60 GByte/sec as well). How many DIMMs are populated in each of the memory channels? If not all channels are populated, or the population is uneven, the bandwidth will be limited. To see which channels are active and how much bandwidth they are delivering, you can run "mlc --peak_injection_bandwidth -t180" and the pcm-memory tool in parallel. Please post the pcm-memory output here.
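The reasoning above can be checked with a little arithmetic. The following is a minimal sketch, not part of MLC or PCM. It assumes that the 35.8 GBytes/second figure from the original question is a per-link maximum and that all three links count toward the 100% figure; the function name is hypothetical.

```python
def upi_utilization(traffic_gbytes_per_sec, links, link_max_gbytes_per_sec):
    """Fraction of the aggregate UPI capacity used by the given traffic."""
    capacity = links * link_max_gbytes_per_sec
    return traffic_gbytes_per_sec / capacity


# Numbers quoted in this thread. Treating 35.8 GBytes/second as a
# per-link maximum is an assumption made for illustration.
dram_limited_traffic = 60.0  # GByte/sec, max bandwidth reported by MLC
util = upi_utilization(dram_limited_traffic, links=3, link_max_gbytes_per_sec=35.8)
print(f"UPI utilization when DRAM caps traffic at 60 GByte/sec: {util:.0%}")
```

Under these assumptions the result is roughly 56%, which is in the same range as the 54% reported in the original question and is consistent with DRAM, not UPI, being the limiter.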
-
My system has 2 DIMMs (64 GB × 2 = 128 GB). Thanks for your reply.
-
This confirms the issue. You have just one DIMM in each socket (in channel 0). You need more DIMMs to get more bandwidth, ideally one DIMM in each channel. There are 8 channels available on each socket.
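To illustrate why populating one channel out of eight caps the bandwidth, here is a rough sketch of the theoretical peak DRAM bandwidth per socket. It assumes DDR5-4800 DIMMs (the DIMM speed is not stated in this thread) and an 8-byte data path per channel; the function name is hypothetical, and measured bandwidth will be lower than these theoretical peaks.

```python
def peak_dram_bandwidth_gbytes(populated_channels,
                               mega_transfers_per_sec=4800,
                               bytes_per_transfer=8):
    """Theoretical peak bandwidth in GByte/sec for the populated channels.

    The defaults (DDR5-4800, 8 bytes per transfer) are assumptions,
    not values taken from the system discussed here.
    """
    return populated_channels * mega_transfers_per_sec * bytes_per_transfer / 1000.0


one_channel = peak_dram_bandwidth_gbytes(1)
all_channels = peak_dram_bandwidth_gbytes(8)
print(f"1 channel populated:  {one_channel:.1f} GByte/sec per socket")
print(f"8 channels populated: {all_channels:.1f} GByte/sec per socket")
```

The point of the sketch is only the scaling: peak bandwidth grows linearly with the number of populated channels, so a socket with a single DIMM can deliver at most one eighth of what a fully populated socket can.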
-
I see. I will add more DIMMs to my system and try again. Thank you very much.
-
I increased the memory size to 1 TB. The max bandwidth on my system increased to 530 GByte/sec. Finally, I got 100% utilization (MLC output section: "Measuring Peak Injection Memory Bandwidths for the system"). Thanks for your help.
-
I have another question. Why is the UPI outgoing bandwidth larger than the UPI incoming bandwidth? And why are the UPI incoming bandwidths of socket 0 and socket 1 different? Thanks.
-
The incoming traffic metric includes only the data component, but the outgoing traffic metric includes both data and non-data components (please see the description in your screenshot).
-
Hi Developers,
I'd like to use the PCM tools to monitor UPI traffic while running a UPI stress test on my server. The server has two processors (Intel Sapphire Rapids).
I created a test.cfg and used "./mlc --loaded_latency -otest.cfg -d0 -T -t180" to run the UPI stress test, but the UPI utilization reaches only 54%.
I have some questions that I need your help with.
Q1: How can I use mlc to run a stress test that brings the total UPI bandwidth utilization close to 100%?
Q2: Why is the max UPI link speed 35.8 GBytes/second instead of 32 GBytes/second (16 GT/s × 2 Bytes per channel)?
Thanks.
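The arithmetic behind Q2 can be written out as a small sketch. It only reproduces the numbers quoted in the question; the variable names are hypothetical, and the code does not explain where the difference comes from (the first reply in this thread attributes it to the variable data packing efficiency of the UPI protocol).

```python
# Figures quoted in the question above.
transfer_rate_gt_per_sec = 16.0   # GT/s per UPI link
bytes_per_transfer = 2.0          # bytes moved per transfer, as assumed in Q2
reported_max_gbytes_per_sec = 35.8

# Nominal bandwidth as computed in Q2.
nominal_gbytes_per_sec = transfer_rate_gt_per_sec * bytes_per_transfer

# How far the reported maximum sits above the nominal figure.
ratio = reported_max_gbytes_per_sec / nominal_gbytes_per_sec

print(f"nominal: {nominal_gbytes_per_sec:.1f} GBytes/second")
print(f"reported max is {ratio - 1:.1%} above nominal")
```

So the reported maximum is about 12% higher than the 32 GBytes/second obtained from the simple transfers-times-bytes estimate.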