Factorio crashes under Steam Linux Runtime 1.0 if uid not in /etc/passwd, e.g. systemd-homed #705
Comments
To clarify, is that running it by copying the installed game files and running it like an independent non-Steam game, without Steam being involved at all? Or do you have both Steam and Factorio installed in the VM? |
Since you've already discovered that developer tool: what happens if you switch the container runtime to I don't see any immediately obvious problems in Factorio with bundled libraries or anything like that. Presumably it's making some sort of assumption about the host system that isn't true any more when it runs in a container, but it's hard to say what that assumption would be. We've had people running Factorio successfully in Steam Linux Runtime 1.0 (scout) in the past (#262) but presumably it doesn't work in all system configurations. |
I'm on Arch and Factorio launches just fine in scout SLR fwiw. |
About the Arch Linux VM with Steam and Factorio: I installed Steam, installed Factorio from Steam, and then launched Factorio from Steam in that virtual machine. About the developer tool: No, changing the runtime does not solve this issue for me. |
I made a test user with
systemd-homed managed user directory is located at |
Interesting... Your SLR log says we're using I also notice this in your log:
so maybe this means there's a problem with X11 or Wayland? Do other Steam games launch successfully in the same runtime? Floating Point is a good one to test, because it's very small (and is free). |
It would be useful if you can get a new log with |
about the
$ file /home/zhaow
/home/zhaow: directory
$ realpath /home/zhaow
/home/zhaow
$ ls /home
zhaow zhaow.homedir zwydbg
$ mount | grep /home
/dev/nvme0n1p2 on /home type btrfs (rw,relatime,compress=zstd:3,ssd,discard=async,space_cache=v2,subvolid=257,subvol=/@home)
/dev/nvme0n1p2 on /home/zhaow type btrfs (rw,nosuid,nodev,relatime,idmapped,compress=zstd:3,ssd,discard=async,space_cache=v2,subvolid=264,subvol=/@home/zhaow.homedir)
I'll collect logs as soon as possible. Please wait for a moment. |
Yes, I agree. (The reason I asked is that symlinks sometimes break container frameworks, including ours - but your home directory isn't a symlink, so that should be OK) |
I collected 3 log files this round:
In the console log you can tell I launched factorio with developer tool |
I'm abandoning this issue and I'll move to a normal non-homed user. Seems |
I notice that you're using AMDVLK, which has been known to cause weird issues in the past. We generally recommend Mesa's Vulkan driver for AMD GPUs. I also notice that you're using a dual-GPU setup (discrete + integrated GPUs, both AMD) which can sometimes have weird effects. It's weird that it makes a difference whether you're using systemd-homed, though... I wouldn't have expected that to have an effect. When you tested without systemd-homed, was it with the same If someone can reproduce a similar issue, the next step would probably be to see whether this affects all games or just Factorio, and either reproduce the crash with something open-source that we can analyze (like maybe |
@smcv I can reproduce the issue. I'm running Ubuntu 24.10, up to date. How can I help? |
after some sleep, I'm back :)
this is because
my system is 7800X3D + 7900XTX, with monitor plugged into GPU directly, which means the integrated GPU is mostly idle.
it's slightly different - the zwydbg user is freshly created with
I'm working on this. I did obtain an strace log a while back when I reported this issue on the Factorio Forum. I'll try to get some fresh logs and reproduce with another native title. I should note that Floating Point works inside SLR; the next target would be Dota 2 for me (since it's effectively open source to you guys, right?) |
Some test results, launching each game from Steam:
|
I ran Civ 6 and Factorio with you can use |
I think Stellaris has the same issue as Civ 6 and Factorio. The Paradox launcher for Stellaris can start, but the game fails to launch. Here are the strace logs for Stellaris: |
One additional note, I tried to launch |
re-opening this since my setup for reproducing the issue is still valid, hope we can solve this mystery together :) |
It's probably best if you can open a separate issue for this: if your issue with Factorio does not affect most of your games, then it seems likely that Civ 6, Factorio and Stellaris have different things going wrong. And if I'm wrong about that and there is a common root cause, closing issues as duplicates is much easier than understanding an issue thread that has three separate conversations about three separate bugs :-)
Similarly this is probably best as its own issue. |
Civ 6 has had known compatibility problems in the past because it bundles a lot of libraries that it shouldn't, so definitely open a separate issue for that one instead of discussing Civ 6 further on this particular issue. |
OK, let’s focus on Factorio for now, as I’m not currently playing Civ 6 or Stellaris. Could you please guide me on how to properly set up gdbserver with Steam Linux Runtime enabled to obtain a stack trace for the SIGSEGV error when launching Factorio? I’ve attempted this before, but my last try resulted in an internal error from gdbserver—specifically, it mentioned an unknown register ymm0h or something similar. I suspect that the game is utilizing AVX2 registers, and the gdbserver bundled with the Steam Linux Runtime may be outdated. |
I downloaded the steam-runtime SDK (soldier, according to I got this stack trace with SIGSEGV (finally!)
I'll report this issue to the Factorio devs and try to dig a little deeper. |
I see you've figured out a way to get a backtrace while I was writing this, but for completeness...
To get a stack trace, it's often simpler if you can use a post-mortem crash analysis tool like Or, the next best thing is:
|
I don't currently have access to Factorio the full game, but for what it's worth, the demo is working fine for me on Arch Linux under SLR 1.0. However, I haven't yet tried it with a user that is managed by One thing I notice from your backtrace:
This seems like it indicates that Factorio was compiled with a third-party compiler, and not with one of the ones we provide in the Steam Runtime SDK. The demo shows signs of having been compiled with the same compiler. This hopefully shouldn't be a problem: the demo is statically linked with libstdc++, which probably means the full game is the same. Looking at the demo executable with I don't have any real evidence that this is the reason for your crash, though. |
This is just speculation, but one thing that occurs to me is that |
If the Factorio developers can tell us what's happening in that function (and, more specifically, around that line), that would probably be the most useful piece of information here. |
I can reproduce what appears to be the same crash by logging in as a user that is managed by It is not necessary to be using a btrfs subvolume or any other elaborate storage mechanisms, but either Steps to reproduce:
Backtrace:
|
Hello @raiguard, this issue report might interest you, at least to listen in on the ongoing discussion. |
Factorio demo version has these lines in
at first launch the game would generate lots of configuration files according to these two lines, including
If I change
Edit: I'd like to explain this behavior from my POV.
|
If a Factorio developer can look at this issue, a tl;dr is:
This could either be a bug in some library that is part of the container runtime, or a bug in Factorio, or an assumption that it makes about the system becoming untrue, or some subtle interaction between multiple components. The change that prompted this is that until recently, most native Linux Steam games on desktop were run in the legacy |
Thanks for the ping, this is on my radar. Unfortunately this is badly timed, because I am currently on a 3-week vacation in Japan so my work output is a bit limited. My ideal solution would be to not run the game against the steam Linux runtime at all - we provide a standalone version of the game and it works great. I haven't been able to reproduce the issue on my laptop (Framework 13 running Fedora 41) yet, but I will work on it more tomorrow. Thanks for the detailed reproduction steps! |
Perhaps the way this special token gets expanded in the Factorio code is relying on some assumption that is not true in the container environment. I can see one possible issue with this: If you are using At the moment, we copy the system Normally, this doesn't matter, because when application code wants to know about the current user, it usually only wants to know the home directory, which usually respects the However, if Factorio doesn't take I would recommend that Factorio should try We can also mitigate this from the Steam Runtime side, by programmatically generating an |
I think I got this minimized down to a non-game example. I searched around the Internet and landed on this post, which indicates that Factorio |
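The minimized example itself did not survive in this copy of the thread. As a rough sketch of what a non-game check could look like (not the poster's actual program), the following asks POSIX `getpwuid()` about the calling uid and reports the outcome instead of dereferencing the result. The helper names `classify_lookup` and `check_current_user` are made up for illustration.

```cpp
#include <cerrno>
#include <cstring>
#include <string>
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: classify the outcome of a getpwuid() call.
// `err` is the value errno had right after the call, with errno
// cleared to 0 beforehand.
std::string classify_lookup(const struct passwd* pw, int err) {
    if (pw != nullptr) {
        // The uid has a passwd entry, so pw may be dereferenced.
        return std::string("found: home=") + (pw->pw_dir ? pw->pw_dir : "(null)");
    }
    if (err != 0) {
        // The lookup itself failed (I/O error, resource limit, ...).
        return std::string("lookup error: ") + std::strerror(err);
    }
    // NULL with errno still 0: no entry for this uid.
    return "not found";
}

// Hypothetical helper: look up the calling process's uid and describe
// what happened, without ever dereferencing a NULL result.
std::string check_current_user() {
    errno = 0;
    const struct passwd* pw = getpwuid(getuid());
    const int err = errno;  // save before another call can overwrite it
    return classify_lookup(pw, err);
}
```

Printing `check_current_user()` from a small `main()` on the host and again inside the container would be one way to compare the two environments. If the theory in this thread holds, a systemd-homed user would be expected to get a "found" result on the host and a NULL result inside the container.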
I'm sure it works great today, but the goal of the Steam Linux Runtime is that it still works in 10 years' time, and that's hard to achieve in a standalone Linux binary - assumptions about the underlying system that seem completely reasonable today are not going to remain true forever. |
That's consistent with my theory in #705 (comment), and confirms that users do expect |
You are correct. Here is the entire contents of Paths::getSystemWriteData():
Filesystem::Path Paths::getSystemWriteData()
{
  struct passwd* pw = getpwuid(getuid());
  const char* homedir = pw->pw_dir;
  return Filesystem::Path(homedir + std::string("/.factorio"));
}
Ironically enough, I actually did catch this flaw a few months ago, but the fix didn't get merged because it was bundled with a few other changes that were rejected (the change being that we would use
Point. I tend to try not to think about the day when I inevitably stop working on Factorio. :) |
Yeah, that's the segfault I expected: if
const char* homedir = getenv("HOME");
if (!homedir) {
    uid_t uid = getuid();
    errno = 0;
    struct passwd* pw = getpwuid(uid);
    if (!pw) {
        errx(1, "Unable to find uid %lu: %s", (unsigned long) uid, errno ? strerror(errno) : "not found");
        /* or whatever way you prefer to handle fatal errors */
    }
    homedir = pw->pw_dir;
}
...
(The error behaviour of |
@zhaoweny or @kisak-valve, can we perhaps retitle this to something like I'll look at mitigating this from the SLR side. |
I edited the title as you suggested, but I'd like to add that it's the same behavior across different Steam Linux Runtime versions. |
That makes sense, it's a problem with SLR in general rather than that version specifically. (But SLR 1.0 is (currently) the only one that is available for running Factorio without using unsupported tweaks, because SLR 3.0 is only meant to be for games whose developers have specifically told us they want a newer runtime, like CS2 and Retroarch.) |
In the short term, a workaround for this is to append a record for the In some brief testing on Arch, the result of For example, on my test system,
but when I log in as
Obviously this workaround loses a few of the benefits of systemd-homed, so it would be better to make SLR mitigate this failure mode (in progress) or to teach Factorio to use |
I prototyped this and it seems to resolve the crash, at least for the demo. If you're comfortable with using unreleased software, you can try this out by replacing This change will hopefully be part of the next Steam Linux Runtime 2.0 beta when it has been through review and more testing. Because of the way the container runtime works internally, this would be a change to SLR 2.0, and not SLR 1.0 as you might expect. [note to self: this is !767 v4] |
@adomaskizogian, I don't have enough information about your system or your situation to guess whether you were experiencing the same bad interaction between If your issue was the same thing originally reported here, then the pressure-vessel build in #705 (comment) should hopefully resolve it. Or, if that isn't it, please open a separate issue with the info/logs that are requested by the issue template, and we can look into that separately. |
Looking at your strace logs, I think you might be correct to have thought that this is actually the same issue as Factorio, either in Civ 6 itself or in some library that it uses. The end of the log for process 134315 looks like the same order of operations I would expect from what Factorio does:
So it would be useful if you could retry Civ 6 with the pressure-vessel build from #705 (comment), or with the workaround from #705 (comment).
Stellaris shows a similar pattern, so it would be useful if you could retry Stellaris in a similar way. |
I use systemd-homed and am affected by the crash as well. I can verify this fixes the issue and Factorio starts. |
I was busy playing Factorio last night (It's a great game!). I will try this fix tonight when I get home. |
I tested Civ 6, Stellaris, and Factorio (full game, version 2.0.17). They all work with the pressure-vessel fix. Thank you for your hard work and excellent support! |
@zhaoweny: Would you be able to get a backtrace from Civ 6 and Stellaris, with a method similar to what you did for Factorio in #705 (comment) ? If we can find out where their similar pattern is happening (in the main executable, or in some library that they use), that would give us better information to report to those games' developers. You can use |
Sure, here's a backtrace (and a small section of disassembled code) for Stellaris:
Here's Civ 6 under the same Steam Linux Runtime, with a bit of backtrace and some disassembled code:
If disassembling code is not welcome here, please tell me :P |
Thanks! I was half expecting you to report two matching backtraces, indicating that Aspyr and Paradox were both linking to (or perhaps even bundling) the same utility library; but it seems that instead, they've each made the same mistake independently. Do I assume correctly that both of those are somewhere inside their respective games' main executables? |
Reported to Aspyr, for Civ 6 (ticket 233908) and to Paradox, for Stellaris (ticket 308296). I'm assuming we don't need a support ticket for Factorio since a developer is already in this conversation. For best robustness I'm hoping we can get this fixed from both sides, in SLR and in the affected games. |
I have merged the fix into Factorio - the game will now prefer However, I am unable to test this because, as I mentioned before, I am on vacation in Japan with limited resources. I would kindly ask those affected by this to test the next experimental release (2.0.20) when it is released and let me know if there are issues. |
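The merged Factorio change is not shown in this thread. Purely as an illustration of the "environment first, passwd database second" order discussed above, here is a C++ sketch that never dereferences a NULL `getpwuid()` result. The function names and the choice to treat an empty `HOME` as unset are assumptions of this sketch, not Factorio's code.

```cpp
#include <cstdlib>
#include <string>
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: pick a home directory from the two candidate
// sources, preferring the environment. Returns false if neither works.
bool choose_home(const char* env_home, const struct passwd* pw, std::string& out) {
    if (env_home != nullptr && env_home[0] != '\0') {
        out = env_home;
        return true;
    }
    if (pw != nullptr && pw->pw_dir != nullptr) {
        out = pw->pw_dir;
        return true;
    }
    return false;
}

// Hypothetical counterpart to the getSystemWriteData() function quoted
// earlier: reports failure to the caller instead of crashing.
bool system_write_data_dir(std::string& out) {
    const char* env_home = std::getenv("HOME");
    const struct passwd* pw = nullptr;
    if (env_home == nullptr || env_home[0] == '\0') {
        // Only consult the passwd database when $HOME is unusable.
        pw = getpwuid(getuid());
    }
    std::string home;
    if (!choose_home(env_home, pw, home)) return false;
    out = home + "/.factorio";
    return true;
}
```

Keeping the decision in `choose_home()` separate from the system calls makes the fallback order easy to exercise without depending on the contents of `/etc/passwd` on the machine running the test.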
@raiguard |
Your system information
steamapps/common/SteamLinuxRuntime/VERSIONS.txt: 0.20240806.0
steamapps/common/SteamLinuxRuntime_soldier/VERSIONS.txt: 0.20240917.101880
steamapps/common/SteamLinuxRuntime_sniper/VERSIONS.txt: 0.20240916.101795
Please describe your issue in as much detail as possible:
When launching Factorio on my physical Arch Linux machine, it crashes almost immediately. I tried reinstalling Steam, then reinstalling Arch Linux and reinstalling Steam again, but the issue persists.
Currently I have found 3 workarounds:
-compat-force-slr off
steam-runtime-launch-options -- %command%
and configure container runtime to None
FYI same issue on Factorio Forum
attaching slr log file as requested: slr-app427520-t20241108T233119.log
Steps for reproducing this issue:
expected behavior: game loads up loading screen, lands me on main menu
actual behavior: