this post was submitted on 05 Aug 2024

Linux Gaming

I'm on Kubuntu 24.04, rocking a build that was pretty darn high end in 2021 with an AMD 6800 XT, and of course, Wolfenstein: The New Order was already old news by then. Proton does miracles, but this game freezes my entire machine. The last time I saw something like this happen was with Monster Hunter World in 2018, on a much older version of Proton. I can reliably get the game to freeze my machine in the opening level of The New Order, even across multiple versions of Proton, even with the renderapi launch parameter that should switch it back to OpenGL.

Of course, even if I report this to Steam support, they'll tell me that they only support Steam Deck and not bespoke Linux desktops, and the game works fine on my Steam Deck, but would they be interested in some logs and a bug reported against the GitHub project? This is assuming no one here has an easy fix, of course.

But if not, how would I get the logs? I wouldn't know what I'm looking at in those logs, personally. I'm also not sure if they'll write out correctly. Because it freezes the entire machine, I end up having to hard shut down the computer by the power button, and once or twice during my experiments, it failed to mount my game SSD (a separate drive from where my OS is installed) at boot, and I had to set up the automatic mount in the partition manager again. So assuming that doesn't impact the ability to write out the logs, I can collect them with some instructions, if you kind strangers in the know wouldn't mind providing them, please. And if Valve is interested in looking at them.

[–] ampersandrew@lemmy.world 2 points 3 months ago (1 children)

Alright, from journalctl, I can for sure identify exactly where my computer hung. That last line that repeats itself? It repeats itself thousands of times until I shut the machine down. Does that mean anything to you?

Aug 03 12:58:26 Compy-5600X steam[4057]: 08/03 12:58:26 minidumps folder is set to /tmp/dumps
Aug 03 12:58:26 Compy-5600X steam[4057]: 08/03 12:58:26 Init: Installing breakpad exception handler for appid(gameoverlayui)/version(20240716232148)/tid(1798223)
Aug 03 12:58:26 Compy-5600X steam[4057]: 08/03 12:58:26 Init: Installing breakpad exception handler for appid(gameoverlayui)/version(1.0)/tid(1798223)
Aug 03 12:58:36 Compy-5600X kernel: input: Microsoft X-Box 360 pad 0 as /devices/virtual/input/input105
Aug 03 13:00:04 Compy-5600X systemd[1]: Starting sysstat-collect.service - system activity accounting tool...
Aug 03 13:00:04 Compy-5600X systemd[1]: sysstat-collect.service: Deactivated successfully.
Aug 03 13:00:04 Compy-5600X systemd[1]: Finished sysstat-collect.service - system activity accounting tool.
Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32798, for process WolfNewOrder_x6 pid 1798131 thread WolfNewOrd:cs0 pid 1798172)
Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu:   in page starting at address 0x0000e8674353a000 from client 0x1b (UTCL2)
Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00401430
Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu:          Faulty UTCL2 client ID: SQC (data) (0xa)
Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu:          MORE_FAULTS: 0x0
Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu:          WALKER_ERROR: 0x0
Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu:          MAPPING_ERROR: 0x0
Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu:          RW: 0x0
Aug 03 13:00:43 Compy-5600X kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=815617255, emitted seq=815617257
Aug 03 13:00:43 Compy-5600X kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process WolfNewOrder_x6 pid 1798131 thread WolfNewOrd:cs0 pid 1798172
Aug 03 13:00:43 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu: GPU reset begin!
Aug 03 13:00:44 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu: MODE1 reset
Aug 03 13:00:44 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu: GPU mode1 reset
Aug 03 13:00:44 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu: GPU smu mode1 reset
Aug 03 13:00:44 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu: GPU reset succeeded, trying to resume
Aug 03 13:00:44 Compy-5600X kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000F00000).
Aug 03 13:00:44 Compy-5600X kernel: [drm] VRAM is lost due to GPU reset!
Aug 03 13:00:44 Compy-5600X kernel: [drm] PSP is resuming...
Aug 03 13:00:44 Compy-5600X kernel: [drm] reserve 0xa00000 from 0x83fd000000 for PSP TMR
Aug 03 13:00:45 Compy-5600X plasmashell[1970]: amdgpu: amdgpu_cs_query_fence_status failed.
Aug 03 13:00:45 Compy-5600X steam[4057]: amdgpu: amdgpu_cs_query_fence_status failed.
Aug 03 13:00:45 Compy-5600X steam[4057]: amdgpu: amdgpu_cs_query_fence_status failed.
Aug 03 13:00:45 Compy-5600X steam[4057]: amdgpu: amdgpu_cs_query_fence_status failed.
Aug 03 13:00:45 Compy-5600X steam[4057]: amdgpu: amdgpu_cs_query_fence_status failed.
Aug 03 13:00:45 Compy-5600X steam[4057]: amdgpu: amdgpu_cs_query_fence_status failed.

I can also get the Proton logs if you still need them, but that will have to come later, since it's an ordeal to plan for a scenario where my desktop will crash.
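
For anyone reading a capture like the one above, here is a minimal sketch of collapsing runs of identical messages so the interesting part stands out from the thousands of repeats. It assumes the line layout shown in the excerpt (month, day, time, hostname, then the message); `message`, `collapse_repeats` and `sample` are illustrative names, not part of any existing tool.

```python
import re
from itertools import groupby

# Matches the "Mon DD HH:MM:SS hostname " prefix seen in the excerpt above.
PREFIX = re.compile(r"^[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d \S+ ")

def message(line):
    """Strip timestamp and hostname so identical messages compare equal."""
    return PREFIX.sub("", line.rstrip("\n"))

def collapse_repeats(lines):
    """Yield (count, message) for each run of consecutive identical messages."""
    for msg, run in groupby(message(line) for line in lines):
        yield sum(1 for _ in run), msg

# A few lines copied from the excerpt above.
sample = [
    "Aug 03 13:00:44 Compy-5600X kernel: [drm] VRAM is lost due to GPU reset!",
    "Aug 03 13:00:45 Compy-5600X steam[4057]: amdgpu: amdgpu_cs_query_fence_status failed.",
    "Aug 03 13:00:45 Compy-5600X steam[4057]: amdgpu: amdgpu_cs_query_fence_status failed.",
    "Aug 03 13:00:45 Compy-5600X steam[4057]: amdgpu: amdgpu_cs_query_fence_status failed.",
]

for count, msg in collapse_repeats(sample):
    print(f"{count:>5}x  {msg}")
```

In practice the lines would come from saved output of something like `journalctl -b -1 --no-pager` (the previous boot), which only works if the journal is stored persistently on that system.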

[–] zelifcam@lemmy.world 4 points 3 months ago (1 children)

This is just a guess, but I imagine Kubuntu probably has slightly dated kernel and Mesa versions. My understanding is that, as much as we wouldn’t like it to be the case, there are still issues with AMD. It’s not totally perfect, and improvements and fixes are being added to the kernel and Mesa packages all the time.

Maybe someone on AMD can confirm?

I own the game and did not experience a crash when playing through it on my nvidia system last year.

[–] ampersandrew@lemmy.world 1 points 3 months ago* (last edited 3 months ago) (2 children)

To be fair, I can't find evidence of anyone on the internet experiencing this same freeze, so it's probably more specific to me than AMD in general. When I saw freezes like this in Monster Hunter, years ago, I was on Nvidia.

[–] zelifcam@lemmy.world 2 points 3 months ago* (last edited 3 months ago) (1 children)

Have you ever tested your RAM? Like a proper memtest overnight?

If I ever suspect hardware, I also stress test the cpu with prime95.

[–] ampersandrew@lemmy.world 1 points 3 months ago (1 children)

I could give it a shot. Will it test VRAM too? Last I ran memtest was years ago. It needs a boot device like a USB, right?

[–] zelifcam@lemmy.world 2 points 3 months ago* (last edited 3 months ago) (1 children)

> vram

No. It only tests system RAM.

> USB

Yes. https://memtest.org

[–] ampersandrew@lemmy.world 1 points 3 months ago (1 children)

Thanks. I'll give it a go. I don't think I'm convinced it's a hardware issue, since that error says something about permissions and faulty IDs, but what do I know? Couldn't hurt to check.
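
As a rough aid for reading that error, here is a minimal sketch that pulls the `NAME: 0x...` fields out of the amdgpu fault lines quoted earlier in the thread. It only collects the values the kernel already printed; it makes no attempt to interpret what a given value means or to say whether the cause is hardware or software. `parse_fault_fields` and `fault_lines` are illustrative names.

```python
import re

# Matches the trailing "NAME: 0x..." or "NAME:0x..." part of an amdgpu kernel line.
FIELD = re.compile(r"amdgpu:\s+([A-Z0-9_]+):\s*(0x[0-9a-fA-F]+)\s*$")

def parse_fault_fields(lines):
    """Return {field_name: integer_value} for every line that carries a hex field."""
    fields = {}
    for line in lines:
        match = FIELD.search(line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields

# Lines copied from the journalctl excerpt earlier in the thread.
fault_lines = [
    "Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00401430",
    "Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu:          Faulty UTCL2 client ID: SQC (data) (0xa)",
    "Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu:          MORE_FAULTS: 0x0",
    "Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu:          WALKER_ERROR: 0x0",
    "Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu:          PERMISSION_FAULTS: 0x3",
    "Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu:          MAPPING_ERROR: 0x0",
    "Aug 03 13:00:33 Compy-5600X kernel: amdgpu 0000:0a:00.0: amdgpu:          RW: 0x0",
]

fields = parse_fault_fields(fault_lines)
for name, value in fields.items():
    if value:  # show only the fields that are set
        print(f"{name} = {value:#x}")
```

Having the fields in one place makes it easier to paste them into a bug report or to search for other reports with the same values.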

[–] zelifcam@lemmy.world 1 points 3 months ago* (last edited 3 months ago) (1 children)

I’m not so sure it is either, but since you’re on Kubuntu you don’t easily have access to a newer Mesa/kernel to test the software side.

Troubleshooting is not about finding the issue right away; it’s about working through the possibilities and eliminating them one by one to narrow it down.

Have you tried a distro with a bit more recent packages? If it also crashes then we know it’s more likely something with your hardware.

[–] ampersandrew@lemmy.world 1 points 3 months ago (1 children)

I'm not in a position to test on more than just these two machines/distros. Once upon a time, I tried switching to Fedora, but some of the behaviors were not to my liking, so I went back to Kubuntu.

[–] zelifcam@lemmy.world 1 points 3 months ago (2 children)

I’ll end it here.

If you are curious I’d check out the following link and see what version of mesa you are running vs the latest. Maybe you can find more info or other experiences from users on that specific version.

https://linuxcapable.com/how-to-upgrade-mesa-drivers-on-ubuntu-linux/

If you want to explore a newer kernel: https://ubuntuhandbook.org/index.php/2023/08/install-latest-kernel-new-repository/amp/

Finally, I’m curious whether you could boot from USB into a live distro like Nobara and test the game there. Or even install Nobara to a spare USB drive and test the game.

Maybe someone else can chime in, but that’s about all I can think of. If you want to find a root cause, you have to run through and find what’s not the issue.

[–] rotopenguin@infosec.pub 1 points 3 months ago

I would recommend against installing ppas in general.

I think that Kubuntu/ubuntu Noble is on a pretty recent stable kernel as is. There would be something even fresher in the HWE track, dunno if that exists for noble yet. The DXVK version is up to Proton (so Proton-GE would be slightly fresher).

The Mesa version, I'm not sure where that comes from. You have an OS installed copy, you have a flatpak/snap version, but aiui Steam Runtime and/or Proton also likes to bring its own version.

Better gpu crash handling is a todo on Linux.

https://ubuntu.com/kernel/lifecycle https://www.phoronix.com/news/AMDGPU-Per-Ring-Resets

[–] ampersandrew@lemmy.world 1 points 3 months ago* (last edited 3 months ago) (1 children)

I understand the nature of troubleshooting, but I don't think testing a 45 GB game is feasible off of a live distro, and any way to test it on my hardware outside of logs is a whole lot of work to get one game working; I don't have a spare drive around either. I just figured these errors would mean something to someone who could take action on it to make a better Proton for everyone. I haven't checked the kernel yet, but my version of mesa matches the latest stable version in your link.

EDIT: I did find this link that sounds like it's a mesa bug. I'm on the same major version but a different minor version.
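
A minimal sketch of the "same major, different minor" comparison, assuming the version is read from a string containing `Mesa X.Y.Z`, which is the kind of line `glxinfo -B` prints. The two version strings below are made-up placeholders, not the versions from this thread or from the linked report; `mesa_version` and `describe` are illustrative names.

```python
import re

def mesa_version(text):
    """Return (major, minor, patch) from text containing 'Mesa X.Y.Z', else None."""
    match = re.search(r"Mesa (\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None

def describe(a, b):
    """Say at which level two (major, minor, patch) tuples first differ."""
    for level, x, y in zip(("major", "minor", "patch"), a, b):
        if x != y:
            return f"differs at {level}: {x} vs {y}"
    return "identical"

# Hypothetical strings for illustration only.
installed = mesa_version("OpenGL version string: 4.6 (Compatibility Profile) Mesa 24.0.9-0ubuntu0.1")
reported = mesa_version("Mesa 24.1.2")

print(describe(installed, reported))
```

Comparing tuples of integers avoids the trap of comparing version strings as text, where "24.10" would sort before "24.9".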

[–] zelifcam@lemmy.world 2 points 3 months ago* (last edited 3 months ago) (1 children)

FYI

> I don't think testing a 45 GB game is feasible off of a live distro

Since AMD GFX works out of the box and Steam is preinstalled in some live distros, simply mounting the hard drive with the game and then adding it to Steam will work. That lets you add the game (Steam library) without having to download it. Just a thought.

[–] ampersandrew@lemmy.world 1 points 3 months ago* (last edited 2 months ago)

Thanks, I'll give it a try, probably over the weekend with Nobara.

EDIT: In case anyone finds this later, I just tested this with Nobara 40 and had a similar result. Nobara was on kernel 6.10 compared to my 6.8. The game ran fine from beginning to end on SteamOS with a custom kernel branched off of 6.1.52. I'm still operating under the assumption that this is a mesa bug. I don't have another machine to test this with and rule out hardware issues, but this is the information I collected, and besides, it's the only game exhibiting problems for me at this point in time.

EDIT to that EDIT: The New Order didn't run perfectly from beginning to end, I just remembered. It crashed back to the SteamOS menu once and hard froze much like my desktop once, but even that freeze was gracefully caught well enough that I could still use the Steam menu to force quit the program, unlike my experience on desktop.

[–] kugmo@sh.itjust.works 2 points 3 months ago

If it's an amdgpu bug, try reporting it to the AMD Linux GitLab.