I think I've stopped the kernel crashes. I didn't have it crash at all without Ray Tracing and Strand Hair enabled, but I wasn't enjoying it as much without those things. The ambience looks much better with the RT, and the hair looked like shit without Strand Hair. My character has a nice, dark gold, bouncy ponytail and it looked like a hunk of cheap plastic Halloween costume hair without it.
It was vkd3d-proton's pipeline cache. I knew it had something to do with vkd3d-proton because of the drastic change in behaviour (the crashes became much, much more frequent) between those two versions. It's pretty clever with shaders by default: it caches the DXIL to SPIR-V conversion primitives so shaders can be compiled offline (during load screens/at the same time as DirectX shaders) and linked into the pipeline at runtime.
From the README.md file in the build dir:
## Shader cache
By default, vkd3d-proton manages its own driver cache.
This cache is intended to cache DXBC/DXIL -> SPIR-V conversion.
This reduces stutter (when pipelines are created last minute and app relies on hot driver cache)
and load times (when applications do the right thing of loading PSOs up front).
Behavior is designed to be close to DXVK state cache.
#### Default behavior
`vkd3d-proton.cache` (and `vkd3d-proton.cache.write`) are placed in the current working directory.
Generally, this is the game install folder when running in Steam.
#### Custom directory
`VKD3D_SHADER_CACHE_PATH=/path/to/directory` overrides the directory where `vkd3d-proton.cache` is placed.
#### Disable cache
`VKD3D_SHADER_CACHE_PATH=0` disables the internal cache, and any caching would have to be explicitly managed
by application.
I used the variable at the bottom that I emboldened, `VKD3D_SHADER_CACHE_PATH=0`, to disable that tomfoolery. (I deleted the existing vkd3d-proton.cache files in the game directory first.)
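For anyone wanting to try the same thing, it amounts to roughly the following. This is only a sketch: `GAME_DIR` and its default path are made up for illustration, and with Lutris you would more likely put the variable in the game's environment settings than export it from a shell. The two cache file names and the variable itself come from the README quoted above.

```shell
# Hypothetical install path; substitute the real game directory.
GAME_DIR="${GAME_DIR:-$HOME/Games/example-game}"

# Remove the stale cache files vkd3d-proton left in the game's working directory.
# rm -f stays quiet if they are already gone.
rm -f "$GAME_DIR/vkd3d-proton.cache" "$GAME_DIR/vkd3d-proton.cache.write"

# Disable vkd3d-proton's internal cache for anything launched from this shell.
export VKD3D_SHADER_CACHE_PATH=0
```

Unsetting the variable again should bring back the default behaviour, with the cache files recreated in the working directory.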
This isn't about the Mesa Vulkan shaders, and all this does is revert to the old behaviour of back end shaders compiling on the pipeline and being cached. It's kind of worded funny. It calls it a driver cache (that's not what it is, it's cached pipeline state and shader IRs), but by "explicitly managed by application" he's only talking about the pipeline cache. The game does manage a pipeline cache, but it's for the DirectX 12 shaders, which would have to be translated for Vulkan.
It has not harmed performance; if anything, performance has improved. (It probably wasn't handling it very well in this case anyway.) I wonder if there are other games that would benefit from disabling that mechanism.
I played all night like that last night, but finally had the game freeze/crash tonight after about an hour or so, when some magic spewing enemies spawned. It was only a game crash, not the kernel and it didn't harm my X session. I did CTRL-ALT-F2 and surprise surprise, got the login prompt. By the time I logged in as grogan and did ps aux, the game was already gone. So I logged out and flipped back and the EA App and Lutris were still there. I rebooted, but only to see that it would and it did so without any systemd fuckery.
So I can live with that, I don't care if the game crashes occasionally if it just crashes and aborts. It's very good with the autosaves.
P.S. More interesting observations. I went back to vkd3d-proton 2.14 again (this is the latest stable, there haven't been any commits since) and the game crashed in less than a minute, as soon as I got to the first fight. Also, interestingly, I still do get Mesa shaders compiled at the same time; my mesa_shader_cache_db grew by about 125 MB. So that's good. The game knows the pipeline has changed, and triggers a shader recompile (it says "verifying shaders" and it takes a few minutes) on the initial load screens.
It was still just a game application crash. A DirectX error dialog. I didn't read it; it was out of focus and I just killed the game with the stop button on Lutris. Disabling the vkd3d-proton pipeline cache mechanism does indeed stop the kernel crashes and hard reboots.
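A quick way to watch the size of that cache before and after a session, as a sketch. It assumes Mesa's usual lookup order for the cache location (`MESA_SHADER_CACHE_DIR`, then `XDG_CACHE_HOME`, then `~/.cache`); `CACHE_ROOT` and `CACHE_DIR` are just names picked for this example, and the directory may live elsewhere on your system.

```shell
# Assumed cache root: MESA_SHADER_CACHE_DIR if set, else XDG_CACHE_HOME, else ~/.cache.
CACHE_ROOT="${MESA_SHADER_CACHE_DIR:-${XDG_CACHE_HOME:-$HOME/.cache}}"
CACHE_DIR="$CACHE_ROOT/mesa_shader_cache_db"

# Print the total size if the directory exists, otherwise say where we looked.
if [ -d "$CACHE_DIR" ]; then
    du -sh "$CACHE_DIR"
else
    echo "no Mesa cache found at $CACHE_DIR"
fi
```

Run it once before launching the game and once after the load screens, and compare the two numbers.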
So back to vkd3d-proton 2.13 (git master soon before it went to 2.14) for that game since it crashed right away with 2.14. I might even try going back one more build, I still have a 2.13-master_old archived.
P.P.S. I went back to my 2.13-master_old and it crashed pretty soon, so back to the build that was working for this game. The amdgpu driver reacted with a protection fault (page fault type error) and recovered seamlessly.
P.P.P.S. It seems I had a crashy spot in the game state, with this vkd3d configuration and no version would get past it. I temporarily turned off Ray Tracing and got through it. Then I saved and quit and changed the prefix to use vkd3d-proton 2.14 and it seems to be doing OK at this time. I'll play it like that and see how often it crashes, now that the crashes are just an inconvenience again.
I do actually intend to do a third playthrough. I'm playing as the same character in the second one, just a different faction. I could also play as a mage with a magic staff, or a dwarf with a bow for my ranged weapon (my character has a shield toss).