Alpha 17 B5eek: Something weird about fan-speeds #164
Comments
For many devices, the RPM reading might point at the wrong address and be scaled incorrectly. Actually, the EC doesn't show RPM, but % of RPM in the range 0-150. Someone in the past tried to "normalize" the CPU %RPM to a 0-100% range, and now it returns wrong values. Fans are turned on and off according to the curve, with some hysteresis. Fan modes like silent/auto/advanced just limit the max available %RPM to some value, without any scaling. IDK what turbine mode is. |
RPM readings vs. percent readings isn't my point (I'm aware of that). (1881/5558)×100≈34%, but msi_ec shows 43(%). "Turbine mode" = fans at maximum, triggered by FN+Arrow up. |
Not all devices have turbine mode, but many do. Where did you get these from? MSI EC doesn't calculate percentages (except for the broken CPU %RPM meter, which needs to be removed). Don't look at the CPU RPM reported by the driver. |
Yes, it is the same thing. 5558 = the maximum cpu-fan-rpm (in "turbine mode") for my device, so 100%. I got the rpm values from my own prog's readings, as described in the initial post. |
You can assume that the boost speed isn't 100% but 150 or 200, plus the correlation may be non-linear. |
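For anyone who wants to redo that arithmetic, here is a minimal Python sketch. The helper name `percent_of_max` is made up for illustration, and it assumes the percentage is simply measured rpm divided by the maximum rpm seen in "turbine mode", which may not be how the EC or msi-ec derives its value.

```python
def percent_of_max(rpm: float, max_rpm: float) -> float:
    """Return rpm as a percentage of max_rpm (assumes a linear relation)."""
    if max_rpm <= 0:
        raise ValueError("max_rpm must be positive")
    return rpm / max_rpm * 100


# Values quoted in the comment above: 1881 rpm measured, 5558 rpm at maximum.
measured_pct = percent_of_max(1881, 5558)
print(f"{measured_pct:.1f}%")
```

The result is well below the 43 reported by msi_ec, which is the mismatch being pointed out.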
No, I won't. |
MSI EC does not control the fan curve, but I want to fix the realtime %rpm readings soon. |
In the meantime affected people can use my forked repo for this device. Reads rpm values from the correct addresses. |
yeah well, it's only logical that these addresses are messed up, i was so concentrated on getting now that i remember correctly, i used ec_sys module readings for fan speeds, and not the actual driver itself. by the way @Freihut your repo works kinda well, the realtime_fan_speed file in |
That's fine, as they both should read from the same source.
That's where my changes are, so it's not "well" at all. :c
If |
@Freihut before i continue testing the fans speed readings with you, i'd like to confirm a few things in advance:
please do all of these under linux, thanks. P.S: what you call turbine mode is actually turbo boost. |
just a bunch (less than 10) of
What is that question for? That's reported by amdgpu (which is just passing on firmware readouts) and more or less reasonable. ("More or less" because the values reported by the firmware are "meh").
Around 50°C, depending on room-temp.
According to amdgpu it is 65w. With Furmark and smartshift enabled I can push the dGPU to around 68w, but /sys/class/drm/card[X]/device/hwmon/hwmon7/power1_cap_max still reports 65w.
My device reports fan-rpm-speeds on 0xcb and 0xcd even with BIOS defaults. Settings I've changed and can remember: Smartshift, secure boot and modern standby off, UMA for iGPU set to 512 MiB. But like I wrote: I've used these addresses for about 1~2 years now, and they never changed and always report plausible speeds. At least for my device.
Ya, I know, but turbine mode sounds better. :) BTW, I just made a gui-tool to live view the ec. It highlights changes and does some math to help find fan-speed-addresses. But it's pretty alpha right now. |
the reason i asked you these questions is that i'm trying to see if the driver is functioning properly before re-checking other addresses, for example: disabling smartshift from bios will prevent the ec from doing any actual performance changes when you change disabling modern standby will reset all the power/performance changes after waking up from sleep, you'll have to re-apply them by re-selecting the performance mode ( i asked you for gpu usage because the vbios has an issue that makes it report 99% on almost any load.
seems like smartshift doesn't work on linux for some reason. users of the alpha 15 reported that it works fine, after further searching i found out that the RX6600M vbios is different from the one found on the alpha 17 ; i assume that flashing alpha 15 vbios might fix the issue, but it might brick your laptop.
just tried it out and it's really cool, hopefully it will make it easier for people to test if the driver is working correctly on their laptops or not, thanks for your work. |
Thanks for explaining.
I can remember that this occurred to me some days ago after standby. But I just tried to reproduce that and both gpus keep reporting sane utilization values. Weird. (No updates happened between these situations).
It kinda does, but in a weird way, and it keeps changing as the kernel progresses. 2 years ago smartshift shifted a lot to the gpu (if I remember correctly it ran at about ~85w and the cpu dropped to 2.5 GHz). With the current kernel it shifts about 3w, but very slowly (you can see the gpu's power draw increase over several minutes of load). Any value written to the somethingbiassomething-file had no effect. Smartshift also has some side effects on ryzenadj, but I couldn't figure out what exactly happens there.
Thanks for the feedback, I'm glad to help. |
I did my testing and @Freihut is right:
Values contained in these 2 addresses are percentages for the target speed, not actual speed in rpms; There seems to be a mismatch between the values reported by ec_sys and msi-ec: so it's only possible to load the file if the target is between 25% and 55%. |
let's fix things one at a time, correct addresses take priority, @Freihut do you want me to fix it or do you want to make a merge request yourself? |
Wait a minute, you can't just fix the addresses, because this needs a rather big overhaul in calculating the fan speeds. Look at the way I calculate the rpm in my forked code. But this works only for the Alpha 17 b5eek (and of course devices using the same fans). To fix this for all users you'll need to add the Fallback-rpm for each device currently supported or find the addresses to make msi-ec read that out by itself. |
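To make the per-device idea a bit more concrete, here is a rough Python sketch. It is not the code from the fork. The table `MAX_RPM` and the function `estimate_rpm` are hypothetical names, the only entry is the 5558 rpm CPU maximum reported earlier in this thread, and the linear percent-to-rpm mapping is an assumption that was questioned above.

```python
# Hypothetical per-device table of maximum fan speeds in rpm.
# Only the Alpha 17 B5EEK CPU value (5558) comes from this thread.
MAX_RPM = {
    "Alpha 17 B5EEK": {"cpu": 5558},
}


def estimate_rpm(model: str, fan: str, percent: float) -> float:
    """Estimate rpm from a percent reading, assuming a linear relation."""
    try:
        max_rpm = MAX_RPM[model][fan]
    except KeyError:
        raise KeyError(f"no maximum rpm known for {model!r} / {fan!r}") from None
    return percent / 100 * max_rpm


print(estimate_rpm("Alpha 17 B5EEK", "cpu", 50))
```

Every supported device would need its own entry, which is the maintenance cost mentioned above.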
@Freihut i think the only way to verify speeds is to use apps like HWMon on windows and compare the reading to the ones we have in linux.
sounds like an unnecessarily complicated solution to me. i've seen how you calculate rpm and honestly i don't have much to say, @glpnk has dealt with this more than me, so i'll leave him to decide. right now, @glpnk made a draft pull request #172. but until that's ready, let's make sure the files for real time temperatures are readable at the very least and don't cause text editors to lock up and crash (that's what happens to me actually). No matter how you calculate the speed, you need the correct addresses to get the right data to work with. so, would you like to open a pull request fixing the realtime fanspeed for both cpu and gpu? |
Once I finish the cleanup and have made a sysfs API for fan tuning and RPM readout (if a division operation is safe to do in kernel modules). |
Laptop model
Alpha 17 B5eek
EC firmware version
17LLEMS1.106
Description
Tl;dr
cpu-fan-speed: seems incorrect
gpu-fan-speed: plausible, but somehow not in "turbine mode"
I've got some weird readings here:
Situation 1:
Created some cpu-load while running:
watch --interval 1 cat /sys/devices/platform/msi-ec/cpu/realtime_fan_speed
(combined output of several seconds)
Pluma (a text editor) also throws the "Invalid argument" at the same time, so likely not a cat issue.
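A small Python sketch of the same check, in case it helps others reproduce this without watch. The function names `read_fan_speed` and `poll` are made up for illustration; the sketch only assumes that a failing read surfaces as an `OSError` with `errno.EINVAL`, which is what "Invalid argument" corresponds to.

```python
import errno
import time

CPU_FAN_PATH = "/sys/devices/platform/msi-ec/cpu/realtime_fan_speed"


def read_fan_speed(path: str) -> str:
    """Return the file content, or the string 'EINVAL' if the read is rejected."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError as e:
        if e.errno == errno.EINVAL:
            return "EINVAL"
        raise  # any other error (missing file, permissions) is not hidden


def poll(path: str, count: int, interval: float = 1.0) -> list[str]:
    """Read the file `count` times, waiting `interval` seconds between reads."""
    readings = []
    for i in range(count):
        readings.append(read_fan_speed(path))
        if i < count - 1:
            time.sleep(interval)
    return readings
```

Calling `poll(CPU_FAN_PATH, 10)` while generating cpu load should show whether the reads flip between a number and the error.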
Situation 2:
Idle + FN + Arrow up (which makes the fans go into "turbine mode") but msi-ec/cpu/realtime_fan_speed reports "43", while msi-ec/gpu/realtime_fan_speed reports "0".
Meanwhile I get the attached output while reading the ec (/sys/kernel/debug/ec/ec0/io) by a small pascal prog I used before.
Line 1 = the dump of the whole ec-line
Line 2 = the gpu-rpm-speed
Line 3 = the cpu-rpm-speed
Interval is 1000ms.
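For readers without Lazarus, a rough Python sketch of the raw readout. This is not a port of the pascal prog. It uses the offsets 0xcb and 0xcd mentioned elsewhere in this thread; which offset belongs to which fan and how a raw byte is converted to rpm are not stated here, so the sketch returns raw bytes only. All function names are made up for illustration.

```python
EC_IO_PATH = "/sys/kernel/debug/ec/ec0/io"  # needs root and the ec_sys module
FAN_OFFSETS = (0xCB, 0xCD)  # from this thread; CPU/GPU assignment not stated


def read_ec_dump(path: str = EC_IO_PATH) -> bytes:
    """Read the whole EC register dump as raw bytes."""
    with open(path, "rb") as f:
        return f.read()


def raw_fan_bytes(dump: bytes, offsets=FAN_OFFSETS) -> dict[int, int]:
    """Return the raw byte at each offset; no rpm conversion is attempted."""
    result = {}
    for off in offsets:
        if off >= len(dump):
            raise ValueError(f"offset {off:#04x} is outside the {len(dump)}-byte dump")
        result[off] = dump[off]
    return result


def format_dump_line(dump: bytes) -> str:
    """Format the dump as space-separated hex bytes on one line."""
    return " ".join(f"{b:02x}" for b in dump)
```

Run as root, `raw_fan_bytes(read_ec_dump())` gives the two raw values once per call; wrapping it in a loop with a one-second sleep mimics the 1000ms interval.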
output1.txt: idling laptop, just going into "turbine mode" and going back to normal after some seconds. Msi-ec reports "43" for cpu and "0" for gpu all along.
output2.txt: laptop under full cpu load. Cpu-fan is around 3900rpm, while gpu-fan is at 0 and gets turned on when the gpu reaches 55°C (as the case warms up, I guess).
Msi-ec reports "invalid" for cpu all the time and "0" for gpu in the beginning; later it went up to 43, which is kind of plausible.
I've been using the pascal prog constantly for around 1 year, so I'm fairly sure the readings are correct; at least they're plausible.
I'm using the latest BIOS E17LLAMS.10B from 2023-06-15 with Arch Linux on Kernel 6.11.0
(the pascal prog src can be compiled with Lazarus; needs to be run as root (to read /sys/kernel/debug/ec/ec0/io) while ec_sys module is running)
output1.txt
output2.txt
read_ec.tar.gz