New Kernel

Post by **Zema Bus** » Tue Nov 12, 2024 9:08 am

I heard something about that, I'll read through this later when I have more time.

Post by **Grogan** » Tue Nov 12, 2024 9:56 pm

I'm giving up on NTSYNC, there's no benefit at this time. I built a proton with wine staging 9.21 last night with it and it works, but not better than my valve bleeding edge proton using fsync.

I built a system wine with NTSYNC today, though I ended up having to redo it using "plain" (non-staging) wine 9.21 because the staging build compiled but wouldn't run (some dlls weren't working properly). The experiment was to see if Bioshock Remastered, which doesn't work correctly with esync or fsync (texture loading that never resolves), would work with NTSYNC, and it does! (But it had no performance problem with plain wineserver sync anyway.)

However, I then tried my Metro Exodus Enhanced game (GoG edition) and it just freezes on the load screens. My guess would be that it needs staging patches rather than ntsync being the cause, but either way, we can't have that.

So I'm going to take that ntsync out of my kernel for now. I'll try building it as a module (I hate modules) but only if it doesn't load automatically, otherwise, I'll go back and disable it (I could also blacklist it and load it manually if desired but it's not worth the trouble if I'm not going to use it anyway).
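A minimal sketch of the "blacklist it and load it manually" option, assuming the driver was built as a module (CONFIG_NTSYNC=m) and that the module is named `ntsync`. The function name and the file name `ntsync-blacklist.conf` are made up for illustration.

```shell
#!/bin/sh
# Sketch: keep an ntsync module from being auto-loaded while still
# allowing it to be loaded by hand. Assumes the module name is "ntsync".

# Write a modprobe.d snippet into the given directory
# (default: /etc/modprobe.d, which needs root).
write_ntsync_blacklist() {
    dir="${1:-/etc/modprobe.d}"
    printf 'blacklist ntsync\n' > "$dir/ntsync-blacklist.conf"
}

# Usage (as root):
#   write_ntsync_blacklist
#   modprobe ntsync      # load it by hand when wanted
#   modprobe -r ntsync   # unload it again
```

A `blacklist` line only tells modprobe to ignore the module's aliases, so an explicit `modprobe ntsync` still works.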

As far as the kernel goes, I think these cachyos patches are good for gaming. They seem to have solved some performance regressions... I haven't been that happy lately, with hard-to-put-my-finger-on performance degradation. It kind of made me feel like maybe it was my imagination, but no. It wasn't wine/proton, it wasn't mesa... I think it was the kernel. It would start out OK, then after a while playing a big game it would get a bit sluggish and hitchy and I'd need to save and reload the game (e.g. Starfield, Assassin's Creed Odyssey, even Witcher 3 in Lutris. Fallout London seemed better last night too, but that could just be the areas I was in).

Post by **Grogan** » Thu Nov 14, 2024 7:31 pm

Linux 6.11.8
https://cdn.kernel.org/pub/linux/kernel ... Log-6.11.8

Some block layer fixes, among other things:

block: fix queue limits checks in blk_rq_map_user_bvec for real

[ Upstream commit be0e822bb3f5259c7f9424ba97e8175211288813 ]

blk_rq_map_user_bvec currently only has ad-hoc checks for queue limits,
and the last fix to it enabled valid NVMe I/O to pass, but also allowed
invalid one for drivers that set a max_segment_size or seg_boundary
limit.

Fix it once for all by using the bio_split_rw_at helper from the I/O
path that indicates if and where a bio would be have to be split to
adhere to the queue limits, and it returns a positive value, turn that
into -EREMOTEIO to retry using the copy path.

block: rework bio splitting

[ Upstream commit b35243a447b9fe6457fa8e1352152b818436ba5a ]

The current setup with bio_may_exceed_limit and __bio_split_to_limits
is a bit of a mess.

Change it so that __bio_split_to_limits does all the work and is just
a variant of bio_split_to_limits that returns nr_segs. This is done
by inlining it and instead have the various bio_split_* helpers directly
submit the potentially split bios.

To support btrfs, the rw version has a lower level helper split out
that just returns the offset to split. This turns out to nicely clean
up the btrfs flow as well.

Well, well... some Transparent Hugepage fixes. This could have been related to my regressions, and the reason those cachyos patches seemed to help so much (as I said I was primarily interested in the THP Shrinker that's included in the 0001-cachyos-base-all.patch). This is a very long writeup:

mm/thp: fix deferred split unqueue naming and locking

commit f8f931bba0f92052cf842b7e30917b1afcc77d5a upstream.

Recent changes are putting more pressure on THP deferred split queues:
under load revealing long-standing races, causing list_del corruptions,
"Bad page state"s and worse (I keep BUGs in both of those, so usually
don't get to see how badly they end up without). The relevant recent
changes being 6.8's mTHP, 6.10's mTHP swapout, and 6.12's mTHP swapin,
improved swap allocation, and underused THP splitting.

Before fixing locking: rename misleading folio_undo_large_rmappable(),
which does not undo large_rmappable, to folio_unqueue_deferred_split(),
which is what it does. But that and its out-of-line __callee are mm
internals of very limited usability: add comment and WARN_ON_ONCEs to
check usage; and return a bool to say if a deferred split was unqueued,
which can then be used in WARN_ON_ONCEs around safety checks (sparing
callers the arcane conditionals in __folio_unqueue_deferred_split()).

Just omit the folio_unqueue_deferred_split() from free_unref_folios(), all
of whose callers now call it beforehand (and if any forget then bad_page()
will tell) - except for its caller put_pages_list(), which itself no
longer has any callers (and will be deleted separately).

Swapout: mem_cgroup_swapout() has been resetting folio->memcg_data 0
without checking and unqueueing a THP folio from deferred split list;
which is unfortunate, since the split_queue_lock depends on the memcg
(when memcg is enabled); so swapout has been unqueueing such THPs later,
when freeing the folio, using the pgdat's lock instead: potentially
corrupting the memcg's list. __remove_mapping() has frozen refcount to 0
here, so no problem with calling folio_unqueue_deferred_split() before
resetting memcg_data.

That goes back to 5.4 commit 87eaceb3faa5 ("mm: thp: make deferred split
shrinker memcg aware"): which included a check on swapcache before adding
to deferred queue, but no check on deferred queue before adding THP to
swapcache. That worked fine with the usual sequence of events in reclaim
(though there were a couple of rare ways in which a THP on deferred queue
could have been swapped out), but 6.12 commit dafff3f4c850 ("mm: split
underused THPs") avoids splitting underused THPs in reclaim, which makes
swapcache THPs on deferred queue commonplace.

Keep the check on swapcache before adding to deferred queue? Yes: it is
no longer essential, but preserves the existing behaviour, and is likely
to be a worthwhile optimization (vmstat showed much more traffic on the
queue under swapping load if the check was removed); update its comment.

Memcg-v1 move (deprecated): mem_cgroup_move_account() has been changing
folio->memcg_data without checking and unqueueing a THP folio from the
deferred list, sometimes corrupting "from" memcg's list, like swapout.
Refcount is non-zero here, so folio_unqueue_deferred_split() can only be
used in a WARN_ON_ONCE to validate the fix, which must be done earlier:
mem_cgroup_move_charge_pte_range() first try to split the THP (splitting
of course unqueues), or skip it if that fails. Not ideal, but moving
charge has been requested, and khugepaged should repair the THP later:
nobody wants new custom unqueueing code just for this deprecated case.

The 87eaceb3faa5 commit did have the code to move from one deferred list
to another (but was not conscious of its unsafety while refcount non-0);
but that was removed by 5.6 commit fac0516b5534 ("mm: thp: don't need care
deferred split queue in memcg charge move path"), which argued that the
existence of a PMD mapping guarantees that the THP cannot be on a deferred
list. As above, false in rare cases, and now commonly false.

Backport to 6.11 should be straightforward. Earlier backports must take
care that other _deferred_list fixes and dependencies are included. There
is not a strong case for backports, but they can fix cornercases.

It looks like the cachyos kernels are updated to 6.11.8 so I should be able to grab appropriate patches.

P.S. Umm, nope... just because they have committed an update to their config to 6.11.8 doesn't mean they have finished the patches. They haven't changed (I diffed them as I was leery). Those changes to THP split had to affect the files being patched.

Hunk #10 FAILED at 3304.
Hunk #11 succeeded at 3337 with fuzz 2 (offset 13 lines).
Hunk #12 succeeded at 3357 (offset 17 lines).
Hunk #13 succeeded at 3395 (offset 17 lines).
Hunk #14 succeeded at 3435 (offset 17 lines).
Hunk #15 succeeded at 3451 (offset 17 lines).
Hunk #16 succeeded at 3463 (offset 17 lines).
1 out of 16 hunks FAILED -- saving rejects to file mm/huge_memory.c.rej

I guess I'll wait on this, I'm quite happy with my 6.11.7 with those patchsets.
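A patch set can be checked against a new point release without touching the tree by using GNU patch's `--dry-run` and counting the failed hunks. This is only a sketch; the helper name `count_failed_hunks` is made up, and it matches the "Hunk #N FAILED" wording shown in the output above.

```shell
#!/bin/sh
# Count "Hunk #N FAILED" lines in patch(1) output read from stdin.
# grep -c prints 0 (and exits non-zero) when nothing matches,
# so "|| true" keeps the function's exit status clean.
count_failed_hunks() {
    grep -c '^Hunk #[0-9]* FAILED' || true
}

# Usage from the top of a kernel tree (--dry-run modifies nothing):
#   patch -p1 --dry-run < 0001-cachyos-base-all.patch | count_failed_hunks
```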

Post by **Grogan** » Sun Nov 17, 2024 8:44 pm

They released a 6.11.9 today:
https://cdn.kernel.org/pub/linux/kernel ... Log-6.11.9

Nothing exciting... some intel xe DRM fixes, nvme fixes that don't look like they'd bother us, and bpf fixes (I have support for it, but I don't think I actually use it for anything now. Proton initializes it for some anticheat support, but I don't use that either. I also don't use the pluggable schedulers (sched-ext) that get patched in with the cachyos patches.)

I still don't see how CachyOS is building these kernels with their patches, that 0001-cachyos-base-all.patch with the THP Shrinker (that fails to patch due to the THP changes in 6.11.8... and it's more than a rebase) is still in the PKGBUILDs. As far as I know, makepkg aborts on failed patches, so somebody must not be eating their own dogfood.

Seeing as I'm happy with the patched 6.11.7 kernel, I think I'll just stay with it, for now.

Post by **Grogan** » Sun Nov 17, 2024 11:11 pm

I think I fixed their 0001-cachyos-base-all.patch for me, for 6.11.9... I found the appropriate function to insert the code in huge_memory.c, also removed some inappropriate hunks from elsewhere (one for amd/gpu/display that's already incorporated, and one for arm64 that's now a symlink to a file already patched) and generated a new one.

I'm not sure if I'm going to share it yet (I may post it to their github issues). I'm running it right now, but we'll have to see how it goes tonight running this kernel under load. The function is different, and defined differently but it has similar code and I think I'm OK to insert that, in context of the other changes that are still valid. It's adding functionality.

P.S. I did decide to post it. I did work on it, so might as well share it. I decided not to post my regenerated patch though, as it's now 6.11.9 specific (removed inappropriate hunks that have already been backported, as well as an arm64 patch that's now a symlink to a generic file that's already being patched).

https://github.com/CachyOS/linux-cachyos/issues/330

P.S. It looks like I had a stale file from web caching or something. The 0001-cachyos-base-all.patch was updated 11 hours ago. I downloaded and diffed it before I started today and it hadn't changed.

Actually nope, that's still not what's in the source array of the PKGBUILD, it's still pointing to the old 0001-cachyos-base-all.patch. Oh well, it doesn't matter to me. I have downloaded the correct patch now.

Btw... they did exactly what I did

Post by **Grogan** » Mon Nov 18, 2024 5:17 am

Linux 6.12 is out today, too. I thought it might be delayed since they released a 6.11.9 first.

I'll get a look at that (depends on the patches I intend to use) tomorrow.

Post by **Zema Bus** » Mon Nov 18, 2024 8:55 am

Looks like something that would happen to me for sticking my neck out lol!

I just did 6.12, I knew it was coming soon but I forgot all about with some recent distractions.

Post by **Grogan** » Mon Nov 18, 2024 9:09 pm

I got my 6.12.0 kernel done now. I decided to do something different. CachyOS builds theirs with Clang, so I did too this time, so I could enable building with LTO. I was going to try profiling, but that's going to need more research because it's not automated and you have to build with Clang AutoFDO and Clang Propeller enabled to generate profile data, then install and boot it, and run load test suites through a profiler (e.g. "perf" part of linux-tools). I'm good with that up to <load tests>. They don't tell you what load tests to run.

Anyway, to build a kernel with clang, LLVM=1 goes in every make command. This implies a set of options, including using lld as the linker.

make LLVM=1 oldconfig
make LLVM=1 menuconfig
make LLVM=1 -j24
make LLVM=1 modules_install

LTO is enabled under "architecture dependent options" in the kernel config. With the cachyos patches, I have enabled -O3 and -march=alderlake for the build.
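The LTO choice can also be flipped from the command line with the `scripts/config` helper that ships in the kernel tree. The sketch below only prints the commands rather than running them; the function name is made up, and the symbol names (LTO_NONE, LTO_CLANG_THIN, LTO_CLANG_FULL) are the ones I believe arch/Kconfig uses, so check them in menuconfig first.

```shell
#!/bin/sh
# Print the commands for a Clang LTO kernel build.
# Sketch only: run the printed lines by hand from the top of the kernel tree.
# $1 = THIN or FULL (LTO flavour), $2 = number of make jobs.
print_clang_lto_build() {
    mode="${1:-THIN}"
    jobs="${2:-4}"
    # scripts/config: -d disables a symbol, -e enables one.
    echo "scripts/config -d LTO_NONE -e LTO_CLANG_${mode}"
    echo "make LLVM=1 oldconfig"
    echo "make LLVM=1 -j${jobs}"
    echo "make LLVM=1 modules_install"
}

print_clang_lto_build FULL 24
```

Every make invocation carries LLVM=1, matching the commands above, so the configuration step and the build see the same toolchain.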

This took 4m3s instead of 1m30s to build and my bzImage is 6652928 bytes (~6.6 MB), probably only about 650 KB larger. I used full LTO, which should give a smaller result than thin LTO.
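Image sizes read differently depending on whether they are divided by 1000000 (MB) or 1048576 (MiB). A small sketch of the arithmetic, with a made-up helper name:

```shell
#!/bin/sh
# Convert a byte count to decimal megabytes (MB) and binary mebibytes (MiB).
bytes_to_mb() {
    awk -v b="$1" 'BEGIN { printf "%.2f MB / %.2f MiB\n", b / 1000000, b / 1048576 }'
}

bytes_to_mb 6652928    # prints "6.65 MB / 6.34 MiB"

# With GNU stat, for a freshly built image:
#   bytes_to_mb "$(stat -c %s arch/x86/boot/bzImage)"
```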

Unfortunately, the THP-shrinker is not included in the patches for 6.12... after all that yesterday, now I don't get to use that. I'm considering just disabling Transparent Hugepages or setting it to madvise (applications can turn it on).
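For reference, the current THP mode is the bracketed word in `/sys/kernel/mm/transparent_hugepage/enabled`. A minimal sketch for reading it (the helper name `thp_mode` is made up), with the runtime switch to madvise left as a comment because it needs root:

```shell
#!/bin/sh
# Print the active THP mode from a sysfs-style file whose content looks
# like "always [madvise] never" (the bracketed word is the active mode).
thp_mode() {
    file="${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$file"
}

# To switch at runtime (as root):
#   echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
# or boot with transparent_hugepage=madvise on the kernel command line.
```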

We'll have to see what happens now.

Post by **Grogan** » Fri Nov 22, 2024 7:20 pm

Linux 6.12.1 already, but it doesn't have much. It looks like an important fix precipitated it.

https://cdn.kernel.org/pub/linux/kernel ... Log-6.12.1

mm/mmap: fix __mmap_region() error handling in rare merge failure case

The mmap_region() function tries to install a new vma, which requires a
pre-allocation for the maple tree write due to the complex locking
scenarios involved.

Recent efforts to simplify the error recovery required the relocation of
the preallocation of the maple tree nodes (via vma_iter_prealloc()
calling mas_preallocate()) higher in the function.

The relocation of the preallocation meant that, if there was a file
associated with the vma and the driver call (mmap_file()) modified the
vma flags, then a new merge of the new vma with existing vmas is
attempted.

During the attempt to merge the existing vma with the new vma, the vma
iterator is used - the same iterator that would be used for the next
write attempt to the tree. In the event of needing a further allocation
and if the new allocations fails, the vma iterator (and contained maple
state) will cleaned up, including freeing all previous allocations and
will be reset internally.

Upon returning to the __mmap_region() function, the error is available
in the vma_merge_struct and can be used to detect the -ENOMEM status.

Hitting an -ENOMEM scenario after the driver callback leaves the system
in a state that undoing the mapping is worse than continuing by dipping
into the reserve.

A preallocation should be performed in the case of an -ENOMEM and the
allocations were lost during the failure scenario. The __GFP_NOFAIL
flag is used in the allocation to ensure the allocation succeeds after
implicitly telling the driver that the mapping was happening.

The range is already set in the vma_iter_store() call below, so it is
not necessary and is dropped.

I just woke up, and that's a pretty confusing write-up

Also:

media: uvcvideo: Skip parsing frames of type UVC_VS_UNDEFINED in uvc_parse_format

commit ecf2b43018da9579842c774b7f35dbe11b5c38dd upstream.

This can lead to out of bounds writes since frames of this type were not
taken into account when calculating the size of the frames buffer in
uvc_parse_streaming.

I don't use uvcvideo (no webcam or anything) but that's also an important looking fix, for that.

Lastly, a fix for Hyper-V sockets:

hv_sock: Initializing vsk->trans to NULL to prevent a dangling pointer

commit e629295bd60abf4da1db85b82819ca6a4f6c1e79 upstream.

When hvs is released, there is a possibility that vsk->trans may not
be initialized to NULL, which could lead to a dangling pointer.
This issue is resolved by initializing vsk->trans to NULL.

That's all that's in the point release.

Post by **Grogan** » Tue Dec 03, 2024 10:05 pm

So I'm still using the CachyOS patches, they are really on this. I'm glad that dev pointed out the real location for their kernel patches (the PKGBUILD had the wrong paths, to the old ones... something about eating their own dogfood as I often like to say. Obviously that's not what they use to build their kernel packages).

They have been updating the BORE scheduler ("Burst-Oriented Response Enhancer") in their sched-dev directory (now at version 5.8.3). It's not so much a scheduler in itself as an enhancement to the existing kernel scheduler (EEVDF, the default since Linux 6.6 and still so in 6.12) that tracks "burstiness": it has algorithms that assign a value to processes and give priority to loads with shorter burst times.

I think that's been good for gaming, because gaming is all about bursty CPU loads. You'd have your main game process but also other processes/threads that have to cut through, like audio, shader compiling etc. I'm pretty sure my gaming performance improvements (well, the absence of the regressions I was suffering from) are due to the kernel, whatever it is that has been fixed or enhanced.

I don't think that THP-shrinker is necessary anymore. I think that commit I pointed out (mm/thp: fix deferred split unqueue naming and locking) may have obviated it in 6.11.8 and 6.12. Or maybe transparent hugepages had nothing to do with my problem... though it fit the symptoms.

I never thought I'd see the day when I'm compiling my kernel with clang. I essentially dismissed that as silly ("why would you want to when the kernel is meant to compile with gcc"). I'm still not sure if -O3, -march=alderlake (especially since they disable all the vector instructions anyway) and LTO have merit, but my results are good, so I'm sticking with all of this.

Post by **Grogan** » Thu Dec 05, 2024 8:26 pm

Linux 6.12.2 now. It's been a while, and the previous .1 release didn't even have many changes. This one has more meat.

https://cdn.kernel.org/pub/linux/kernel ... Log-6.12.2

A lot of block layer related fixes today

block: always verify unfreeze lock on the owner task

commit 6a78699838a0ddeed3620ddf50c1521f1fe1e811 upstream.

commit f1be1788a32e ("block: model freeze & enter queue as lock for
supporting lockdep") tries to apply lockdep for verifying freeze &
unfreeze. However, the verification is only done the outmost freeze and
unfreeze. This way is actually not correct because q->mq_freeze_depth
still may drop to zero on other task instead of the freeze owner task.

Fix this issue by always verifying the last unfreeze lock on the owner
task context, and make sure both the outmost freeze & unfreeze are
verified in the current task.

block, bfq: fix bfqq uaf in bfq_limit_depth()

[ Upstream commit e8b8344de3980709080d86c157d24e7de07d70ad ]

Set new allocated bfqq to bic or remove freed bfqq from bic are both
protected by bfqd->lock, however bfq_limit_depth() is deferencing bfqq
from bic without the lock, this can lead to UAF if the io_context is
shared by multiple tasks.

A fix for the SuperH ("sh") architecture's interrupt controller code (the "sh:" prefix here is the CPU architecture, not a shell):

sh: intc: Fix use-after-free bug in register_intc_controller()

[ Upstream commit 63e72e551942642c48456a4134975136cdcb9b3c ]

In the error handling for this function, d is freed without ever
removing it from intc_list which would lead to a use after free.
To fix this, let's only add it to the list after everything has
succeeded.

NFS and SUNRPC, CIFS/SMB and more networking related stuff. A few drm/amd/display fixes as usual.

Fairly significant this time.

P.S. While configuring I noticed some new mm tunables. These are added by patchsets included by CachyOS, not mainline.

(15) Default value for vm.anon_min_ratio
(0) Default value for vm.clean_low_ratio
(15) Default value for vm.clean_min_ratio

They seem to be for better swap behaviour. I left them at their defaults.

CONFIG_ANON_MIN_RATIO:
│
│ This option sets the default value for vm.anon_min_ratio sysctl knob.
│
│ The vm.anon_min_ratio sysctl knob provides *hard* protection of
│ anonymous pages. The anonymous pages on the current node won't be
│ reclaimed under any conditions when their amount is below
│ vm.anon_min_ratio. This knob may be used to prevent excessive swap
│ thrashing when anonymous memory is low (for example, when memory is
│ going to be overfilled by compressed data of zram module).
│
│ Setting this value too high (close to MemTotal) can result in
│ inability to swap and can lead to early OOM under memory pressure.

CONFIG_CLEAN_LOW_RATIO:
│ This option sets the default value for vm.clean_low_ratio sysctl knob.
│
│ The vm.clean_low_ratio sysctl knob provides *best-effort*
│ protection of clean file pages. The file pages on the current node
│ won't be reclaimed under memory pressure when the amount of clean file
│ pages is below vm.clean_low_ratio *unless* we threaten to OOM.
│ Protection of clean file pages using this knob may be used when
│ swapping is still possible to
│ - prevent disk I/O thrashing under memory pressure;
│ - improve performance in disk cache-bound tasks under memory
│ pressure.
│
│ Setting it to a high value may result in a early eviction of anonymous
│ pages into the swap space by attempting to hold the protected amount
│ of clean file pages in memory.

CONFIG_CLEAN_MIN_RATIO:
│
│ This option sets the default value for vm.clean_min_ratio sysctl knob.
│
│ The vm.clean_min_ratio sysctl knob provides *hard* protection of
│ clean file pages. The file pages on the current node won't be
│ reclaimed under memory pressure when the amount of clean file pages is
│ below vm.clean_min_ratio. Hard protection of clean file pages using
│ this knob may be used to
│ - prevent disk I/O thrashing under memory pressure even with no free
│ swap space;
│ - improve performance in disk cache-bound tasks under memory
│ pressure;
│ - avoid high latency and prevent livelock in near-OOM conditions.
│
│ Setting it to a high value may result in a early out-of-memory condition
│ due to the inability to reclaim the protected amount of clean file pages
│ when other types of pages cannot be reclaimed.
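Since these knobs only exist on kernels carrying that patch set, a quick way to see whether the running kernel has them is to look under /proc/sys/vm. A sketch, with a made-up function name and an optional directory argument so it can be pointed at a test directory:

```shell
#!/bin/sh
# Report the three working-set protection sysctls if the running kernel
# exposes them. They come from an out-of-tree patch, so on a vanilla
# kernel every line will say "not present".
show_ratio_knobs() {
    base="${1:-/proc/sys/vm}"
    for knob in anon_min_ratio clean_low_ratio clean_min_ratio; do
        if [ -r "$base/$knob" ]; then
            printf 'vm.%s = %s\n' "$knob" "$(cat "$base/$knob")"
        else
            printf 'vm.%s: not present\n' "$knob"
        fi
    done
}

show_ratio_knobs
```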

Post by **Grogan** » Thu Dec 05, 2024 9:27 pm

Huh... my kernel won't compile like this, this time. I can't even remember the last time I had a kernel build failure. It's something in mm, hard to tell exactly what (linker error I think).

make[3]: *** [scripts/Makefile.build:229: mm/vmscan.o] Error 1
...
...
make[2]: *** [scripts/Makefile.build:478: mm] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [/storage/shit/build/archkern/linux-6.12.2/Makefile:1946: .] Error 2
make: *** [Makefile:224: __sub-make] Error 2

I've got some troubleshooting to do. The first thing I'm going to do is build it with gcc/ld instead of clang/lld. I'll have to drop the LTO (it's only for clang). I'm not sure if it's that, or something in their patches that makes changes to mm.

P.S. Nope, it's not the compiler, but I got a more intelligible result with gcc

mm/vmscan.c:6102:9: error: implicit declaration of function ‘prepare_workingset_protection’ [-Wimplicit-function-declaration]
 6102 |         prepare_workingset_protection(pgdat, sc);

With a single job build with clang, with no LTO to obscure things

mm/vmscan.c:6102:2: error: call to undeclared function 'prepare_workingset_protection'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]

That's work-aroundable with gcc (-Wno-error=implicit-function-declaration) but in this case I think that's just what caught the condition. It's likely a real problem.

Making it ignore the -Werror (in scripts/Makefile.extrawarn) does indeed just cause it to fail later, while linking the image.

It is quite likely something in the patches, as the kernel devs' build bots would catch things like this.

The last commit to the cachyos-patches was "6.12: Upstream is broken, revert offending commits". That obviously translates into "upstream breaks our patches, so hack upstream" so it figures

Definitely the patches. That function doesn't even exist like that in vmscan.c in the vanilla upstream source. Shit. I don't know what I'm going to do yet. Probably wait, I'd hate to drop them.

Post by **Grogan** » Fri Dec 06, 2024 5:15 am

Seeing as the 6.12 CachyOS patches are so hinky, I've gone back to 6.11.9 for now, the last 6.11 patches that work (and the last 6.11 kernel I had). I think I had better game performance with that, but it's hard to tell as the grass is always greener on the other side. I think maybe the updates to the BORE scheduler I was using from their sched-dev directory may have been a regression. This way I'll see.

All I know is that that 6.12.2 cachy patchset can't possibly compile as is. I downloaded the whole repo and grepped for anything that would define that function and there's nothing outside of the patches I'm already applying.
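The kind of search described here can be sketched as a word-match grep over a tree. The helper name is made up, and it only lists files that mention the symbol; telling a call from a definition still means reading the hits.

```shell
#!/bin/sh
# List the files under a source tree that mention a symbol as a whole word.
# $1 = symbol name, $2 = directory to search (default: current directory).
find_symbol() {
    sym="$1"
    tree="${2:-.}"
    grep -rlw -- "$sym" "$tree" | sort
}

# Usage from the top of a patched kernel tree or a patch repo checkout:
#   find_symbol prepare_workingset_protection .
```

If the only hit is the call site in mm/vmscan.c, nothing in the tree declares or defines the function, which matches the implicit-declaration error.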

I'm going to have to install CachyOS in a virtual machine so I can better see what they are doing in the distro, too.

Post by **Grogan** » Fri Dec 06, 2024 6:55 pm

Linux 6.12.3 already, but with just one change:

https://cdn.kernel.org/pub/linux/kernel ... Log-6.12.3

sched: Initialize idle tasks only once

commit b23decf8ac9102fc52c4de5196f4dc0a5f3eb80b upstream.

Idle tasks are initialized via __sched_fork() twice:

fork_idle()
copy_process()
sched_fork()
__sched_fork()
init_idle()
__sched_fork()

Instead of cleaning this up, sched_ext hacked around it. Even when analyis
and solution were provided in a discussion, nobody cared to clean this up.

init_idle() is also invoked from sched_init() to initialize the boot CPU's
idle task, which requires the __sched_fork() invocation. But this can be
trivially solved by invoking __sched_fork() before init_idle() in
sched_init() and removing the __sched_fork() invocation from init_idle().

Do so and clean up the comments explaining this historical leftover.

I see my CachyOS patches have been way updated since yesterday, hopefully they'll work this time. I got thinking about it as I woke in the night, and I'll bet yesterday's DID compile for him, because it would have been with LLVM 18 (which is what I'm "supposed" to have in Arch, but pffftt... )

P.S. No, it still doesn't compile. I quickly downgraded my LLVM packages and tried it and it doesn't compile with LLVM 18 either. I don't know what the fuck they are doing to compile this, but it's caused by the vm tunable code in their patches. I guess I'll have to remove those changes. They have only recently added them, they come from this patchset that's not in the main kernel tree.

From: Alexey Avramov
To: linux-mm-AT-kvack.org
Subject: [PATCH] mm/vmscan: add sysctl knobs for protecting the working set
Date: Tue, 30 Nov 2021 20:16:52 +0900

I guess I'll have to cherry pick individual patches instead of using the "all" + bore-cachy and see. Pretty hard to do with complex patch sets that have them bundled together.

I might just stick with 6.11.9 for now, my system headers are for 6.11 anyway (Arch is actually still on 6.10 for linux-api-headers for glibc etc. but I don't let them hold me back) and I have great performance with that CachyOS kernel build.

Post by **Grogan** » Fri Dec 06, 2024 10:56 pm

I'm finally running Linux 6.12.3!

I removed the offending hunks from 0001-cachyos-base-all.patch (for safety's sake, I yanked out everything related that touches anything in "mm" not just those changes to vmscan.c and mm.h). I wasn't going to be using those new tunables anyway, and I already set some (like vm.swappiness and vm.max_map_count in sysctl.d) that were being set as defaults by the patch.
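Pulling whole files out of a big git-style patch can be done mechanically instead of by hand. The awk sketch below is an illustration, not what was actually used here: the function name is made up, it handles a single path prefix (so include/linux/mm.h would need a second pass), and the result should still be checked with `patch --dry-run`.

```shell
#!/bin/sh
# Drop every per-file section of a git-style patch whose path starts with
# the given prefix (e.g. "mm/"). Reads the patch on stdin, writes to stdout.
drop_patch_sections() {
    awk -v prefix="$1" '
        # A new mail-style patch header ends any section being skipped.
        /^From [0-9a-f]+ / { skip = 0 }
        /^diff --git / {
            # $3 looks like "a/mm/vmscan.c"; strip the leading "a/".
            path = substr($3, 3)
            skip = (index(path, prefix) == 1)
        }
        !skip { print }
    '
}

# Usage:
#   drop_patch_sections mm/ < 0001-cachyos-base-all.patch > trimmed.patch
#   patch -p1 --dry-run < trimmed.patch
```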

This way I still get most of the tweaks vs. vanilla, and the BORE scheduler (0001-bore-cachy.patch), which depends on things in the base patch or it won't apply.

We'll see if it's worth it or not, later tonight.

Post by **Grogan** » Sat Dec 07, 2024 10:19 pm

Well... I started to think these CachyOS patches are bollocks for Linux 6.12. I think they sweetened Linux 6.11 for me, but like my 6.12.1 build with the patches, I didn't have great performance last night (noticeable performance issue in The Last of Us) and had a game crash where the driver didn't recover while playing Veilguard. That type of crash usually results in getting knocked down to console after driver recovery at worst, but last night I was halted (with corrupt audio). Not even able to ssh in, no route to host. Those patchsets do a lot of shit to amdgpu as well, stuff that isn't in the main kernel tree, as well as zero fan control tunables. It's quite an invasive patchset, that "cachy-all". The compiler optimizations and LTO might not be a great idea either.

So I've gone back to the main, vanilla Linux 6.12.3 built the normal way with gcc without any cleverness. Perhaps I'll revisit those patches at a later time.

Third party kernel patches always fuck you up. I don't care if it's Nvidia, VirtualBox or clever optimization patches from people that think they know better, they always hold me back and cause problems when I want to upgrade.

P.S. I'm going to try the plain kernel tonight, but tomorrow I may try patching in the BORE scheduler enhancement itself, with no tweaks from CachyOS. The actual BORE repo is here, and it has patchsets, e.g. "0001-linux6.12.y-bore5.7.10.patch"

https://github.com/firelzrd/bore-scheduler

P.P.S. The vanilla kernel was good, so I ended up doing that BORE patch from firelzrd git and recompiling. I can't say if it's better, but my games are still running well. I couldn't resist doing "something" lol

Post by **Grogan** » Sun Dec 08, 2024 8:49 pm

I think I'm going to drop the BORE scheduler enhancement, it really shouldn't be necessary anymore as Linux 6.12 now uses the EEVDF scheduler ("Earliest Eligible Virtual Deadline First") which itself favours short running CPU tasks and can preempt longer ones. I'm probably just introducing more overhead by adding that "burstiness" tracking metric. Especially since my use case is NOT running other tasks while gaming (e.g. I'm not about to do very much while a compile job is going. Maybe a bit of light web browsing, but it's my build job that I want to get the CPU time in that case anyway. If not I'd simply start fewer make jobs)

The worst thing here with these patches and "pointless" optimizations (in quotes because the kernel devs think so for kernel code... and quite likely they know what they are talking about) is the feeling that I'm missing out on something. The placebo effect is strong with such tweaks.

I'm trying another clang build though, with thin LTO as those are native options. Without any patches, I'll be able to better see if that's "good" or not.

P.S. For now, I have settled on building with clang, no patches, -O2, and ThinLTO. I did, however, pencil in -march=alderlake in arch/x86/Makefile. I might as well tune the execution for my processor (the kernel build system disables all the vector instructions with -mno-* flags such as -mno-sse and -mno-avx, so essentially it's only tuning... should be harmless at the very least, if useless). The processor options in the kernel haven't changed in ~20 years.

I think this is probably the best it's going to get, barring any noticeable problems or regressions, and I'm not relying on any third party stuff. My kernel binary is only about 100 KB larger than a vanilla gcc build.

Post by **Grogan** » Mon Dec 09, 2024 7:28 pm

Linux 6.12.4
https://cdn.kernel.org/pub/linux/kernel ... Log-6.12.4

I'm glad I got over my fad with patches... back to this being easy again. I think I've got a good method now.

Some DRM fixes and stuff. Just a few examples:

Revert "drm/xe/xe_guc_ads: save/restore OA registers and allowlist regs"

commit 0191fddf53748cf2b473d78faeabe6dcb47689d2 upstream.

This reverts commit 55858fa7eb2f163f7aa34339fd3399ba4ff564c6.

'55858fa7eb2f ("drm/xe/xe_guc_ads: save/restore OA registers and allowlist
regs")' was not properly reviewed and also causes dmesg asserts in
CI. Revert it.

drm/amd/display: Remove PIPE_DTO_SRC_SEL programming from set_dtbclk_dto

commit a3e6079bd93d5c66a43bf6a5f90e5b98465dc7b3 upstream.

There are cases where an OTG is remapped from driving a regular HDMI
display to a DP/eDP display. There are also cases where DTBCLK needs to
be enabled for HPO, but DTBCLK DTO programming may be done while OTG is
still enabled which is dangerous as the PIPE_DTO_SRC_SEL programming may
change the pixel clock generator source for a mapped and running OTG and
cause it to hang.

Remove the PIPE_DTO_SRC_SEL programming from this sequence since it is
already done in program_pixel_clk(). Additionally, make sure that
program_pixel_clk sets DTBCLK DTO as source for special HDMI cases.

drm/amd/display: update pipe selection policy to check head pipe

commit 8fef253c94a5312b9150b2ff8e633b331bac7e88 upstream.

[Why]
No check on head pipe during the dml to dc hw mapping will allow illegal
pipe usage. This will result in a wrong pipe topology to cause mpcc tree
totally mess up then cause a display hang.

[How]
Avoid to use the pipe is head in all check and avoid ODM slice during
preferred pipe check.

drm/amd/display: Fix handling of plane refcount

commit 27227a234c1487cb7a684615f0749c455218833a upstream.

[Why]
The mechanism to backup and restore plane states doesn't maintain
refcount, which can cause issues if the refcount of the plane changes
in between backup and restore operations, such as memory leaks if the
refcount was supposed to go down, or double frees / invalid memory
accesses if the refcount was supposed to go up.

[How]
Cache and re-apply current refcount when restoring plane states.

drm/amdgpu: fix usage slab after free

Post by **Grogan** » Mon Dec 09, 2024 8:55 pm

For the kernel, Full LTO has a huge binary size penalty vs. ThinLTO... my bzImage is 1.1 MB larger (6.3 MB vs. 5.2 MB) and vmlinux.o is about 4 MB larger (35 MB vs. 31 MB, not yet stripped, assembled or compressed). That will most likely be detrimental for kernel code. It also takes longer to build (4 minutes for full vs. 2.5 minutes for thin vs. 1.5 minutes for non-LTO) but I don't care about that. Also, while I don't care about this either (I don't compile any Rust written drivers), cross-language optimization is not possible with full LTO.
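To put those numbers in relative terms, here is a quick back-of-the-envelope calculation. It only uses the figures quoted in the post (sizes in MB, build times in minutes); the variable and function names are made up for illustration.

```python
# Figures quoted in the post above (sizes in MB, build times in minutes).
bzimage = {"thin": 5.2, "full": 6.3}
vmlinux_o = {"thin": 31.0, "full": 35.0}
build_minutes = {"none": 1.5, "thin": 2.5, "full": 4.0}


def pct_increase(base, new):
    """Return how much larger `new` is than `base`, as a percentage."""
    return (new - base) / base * 100.0


bzimage_pct = pct_increase(bzimage["thin"], bzimage["full"])
vmlinux_pct = pct_increase(vmlinux_o["thin"], vmlinux_o["full"])
full_vs_thin_time = build_minutes["full"] / build_minutes["thin"]
thin_vs_none_time = build_minutes["thin"] / build_minutes["none"]

print(f"bzImage, full vs. thin LTO:   +{bzimage_pct:.0f}%")
print(f"vmlinux.o, full vs. thin LTO: +{vmlinux_pct:.0f}%")
print(f"build time, full vs. thin:    {full_vs_thin_time:.2f}x")
print(f"build time, thin vs. non-LTO: {thin_vs_none_time:.2f}x")
```

That works out to roughly 21% more bzImage and 13% more vmlinux.o for full LTO on this particular config. Other configs and compiler versions will give different ratios.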

So Clang ThinLTO is the way to go.

Google says they use ThinLTO for everything.

Post by **Grogan** » Sun Dec 15, 2024 5:00 am

Oooh, new kernel while I was gone today, Linux 6.12.5

https://cdn.kernel.org/pub/linux/kernel ... Log-6.12.5

I guess there will still be some repercussions for mainlining Real Time scheduling:

softirq: Allow raising SCHED_SOFTIRQ from SMP-call-function on RT kernel

do_softirq_post_smp_call_flush() on PREEMPT_RT kernels carries a
WARN_ON_ONCE() for any SOFTIRQ being raised from an SMP-call-function.
Since do_softirq_post_smp_call_flush() is called with preempt disabled,
raising a SOFTIRQ during flush_smp_call_function_queue() can lead to
longer preempt disabled sections.

Since commit b2a02fc43a1f ("smp: Optimize
send_call_function_single_ipi()") IPIs to an idle CPU in
TIF_POLLING_NRFLAG mode can be optimized out by instead setting
TIF_NEED_RESCHED bit in idle task's thread_info and relying on the
flush_smp_call_function_queue() in the idle-exit path to run the
SMP-call-function.
...

A revert... it looks more like it's being reverted because it didn't solve a problem rather than because it caused one, though

Revert "drm/amd/display: parse umc_info or vram_info based on ASIC"

commit 3c2296b1eec55b50c64509ba15406142d4a958dc upstream.

This reverts commit 2551b4a321a68134360b860113dd460133e856e5.

This was not the root cause. Revert.

Always power management related stuff... it must be tricky

drm/amdgpu: rework resume handling for display (v2)

commit 73dae652dcac776296890da215ee7dec357a1032 upstream.

Split resume into a 3rd step to handle displays when DCC is
enabled on DCN 4.0.1. Move display after the buffer funcs
have been re-enabled so that the GPU will do the move and
properly set the DCC metadata for DCN.

v2: fix fence irq resume ordering

Adding that case-insensitive filesystem support is far more invasive than people realize (it adds language support and more complexity to interpreting Unicode chars). I refuse to enable that shit in my kernels. (I don't need it... wine handles that)

commit 0a5152f5fbe7640d7aa8795269099e03565770e7
Author: Linus Torvalds <>
Date: Wed Dec 11 14:11:23 2024 -0800

Revert "unicode: Don't special case ignorable code points"

[ Upstream commit 231825b2e1ff6ba799c5eaf396d3ab2354e37c6b ]

This reverts commit 5c26d2f1d3f5e4be3e196526bead29ecb139cf91.

It turns out that we can't do this, because while the old behavior of
ignoring ignorable code points was most definitely wrong, we have
case-folding filesystems with on-disk hash values with that wrong
behavior.

So now you can't look up those names, because they hash to something
different.

Of course, it's also entirely possible that in the meantime people have
created *new* files with the new ("more correct") case folding logic,
and reverting will just make other things break.

The correct solution is to not do case folding in filesystems, but
sadly, people seem to never really understand that. People still see it
as a feature, not a bug.
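The lookup failure described in that commit message can be shown with a toy model. This is only a sketch of the general idea, not the kernel's utf8 code: the two fold functions, the choice of U+200D (ZERO WIDTH JOINER) as the ignorable code point, and the use of SHA-256 as a stand-in for the on-disk directory hash are all assumptions made for the example.

```python
import hashlib

# U+200D ZERO WIDTH JOINER, used here as one example of an ignorable code point.
IGNORABLE = {"\u200d"}


def fold_old(name):
    """Toy 'old' folding: drop ignorable code points, then case-fold."""
    return "".join(ch for ch in name if ch not in IGNORABLE).casefold()


def fold_new(name):
    """Toy 'new' folding: case-fold but keep ignorable code points."""
    return name.casefold()


def name_hash(folded):
    """Stand-in for an on-disk directory hash (SHA-256, purely illustrative)."""
    return hashlib.sha256(folded.encode("utf-8")).hexdigest()


stored_name = "Save\u200dGame.dat"
on_disk_hash = name_hash(fold_old(stored_name))  # written under the old logic
lookup_hash = name_hash(fold_new(stored_name))   # computed under the new logic

plain_name = "SaveGame.dat"
plain_old = name_hash(fold_old(plain_name))
plain_new = name_hash(fold_new(plain_name))

print("name with ignorable code point still found:", on_disk_hash == lookup_hash)
print("plain name still found:", plain_old == plain_new)
```

Names without ignorable code points hash the same either way, which is why only some files would go missing. Once the hash is stored on disk, changing the folding rules in either direction strands whatever was written under the other rule.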

There's lots, lots more, see changelog if interested.

Post by **Grogan** » Tue Dec 17, 2024 1:16 am

I've been having graphics driver problems with this kernel. When Dragon Age Veilguard crashes, I get a fatal kernel crash. I had it happen once with 6.12.4, but it became frequent with 6.12.5. I just assumed the driver reset failed, but I found that I can VT switch if I press CTRL-ALT-F2 right away instead of waiting until things grind to a halt. I don't get a prompt, but I do see the kernel crash happening on console. I can't see the topmost line, but it looks like a null pointer dereference (the rest of the crash spaghetti is in line with that).

~~DPMS (in X11) seems to have stopped working too, the display never sleeps.~~

Edit: Actually DPMS is working, it seems something in the distro has stopped X11 from doing that. I used xset to turn it on and it works.

Also, this happened today, soon after starting up: I had Firefox open and I clicked the Get Mail button in Sylpheed on another virtual desktop. X seemed to freeze (no input) for a couple of seconds. It wasn't fatal, so I was able to get the text:

Code:

Dec 16 13:20:52 nicetry kernel: amdgpu 0000:03:00.0: amdgpu: Dumping IP State
Dec 16 13:20:52 nicetry kernel: amdgpu 0000:03:00.0: amdgpu: Dumping IP State Completed
Dec 16 13:20:52 nicetry kernel: amdgpu 0000:03:00.0: amdgpu: ring sdma0 timeout, signaled seq=1246, emitted seq=1246
Dec 16 13:20:52 nicetry kernel: ------------[ cut here ]------------
Dec 16 13:20:52 nicetry kernel: WARNING: CPU: 0 PID: 1354 at amdgpu_job_timedout.cold+0x1cb/0x2c5 [amdgpu]
Dec 16 13:20:52 nicetry kernel: Modules linked in: binfmt_misc input_leds joydev coretemp amdgpu kvm_intel kvm i2c_algo_bit drm_ttm_helper ttm ghash_clmulni_intel sha512_ssse3 drm_exec sha512_generic drm_suball>
Dec 16 13:20:52 nicetry kernel: CPU: 0 UID: 0 PID: 1354 Comm: kworker/u96:5 Not tainted 6.12.5 #2
Dec 16 13:20:52 nicetry kernel: Hardware name: Micro-Star International Co., Ltd. MS-7D36/PRO Z690-P DDR4 (MS-7D36), BIOS A.H0 01/17/2024
Dec 16 13:20:52 nicetry kernel: Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
Dec 16 13:20:52 nicetry kernel: RIP: 0010:amdgpu_job_timedout.cold+0x1cb/0x2c5 [amdgpu]
...
... backtrace etc.
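For anyone wanting to count how often this happens, a small script can pull the ring timeout lines out of a saved journal. This is a minimal sketch: the regular expression is based only on the one line in the excerpt above, so other kernel versions may word the message differently, and the function and field names are made up for the example.

```python
import re

# Matches lines such as:
#   amdgpu 0000:03:00.0: amdgpu: ring sdma0 timeout, signaled seq=1246, emitted seq=1246
RING_TIMEOUT = re.compile(
    r"amdgpu (?P<pci>[0-9a-f:.]+): amdgpu: ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeouts(log_text):
    """Return one dict per amdgpu ring timeout line found in log_text."""
    events = []
    for line in log_text.splitlines():
        match = RING_TIMEOUT.search(line)
        if match:
            events.append(
                {
                    "pci": match.group("pci"),
                    "ring": match.group("ring"),
                    "signaled": int(match.group("signaled")),
                    "emitted": int(match.group("emitted")),
                }
            )
    return events


sample = (
    "Dec 16 13:20:52 nicetry kernel: amdgpu 0000:03:00.0: amdgpu: Dumping IP State Completed\n"
    "Dec 16 13:20:52 nicetry kernel: amdgpu 0000:03:00.0: amdgpu: ring sdma0 timeout, "
    "signaled seq=1246, emitted seq=1246\n"
)

events = parse_ring_timeouts(sample)
for event in events:
    print(event["ring"], event["signaled"], event["emitted"])
```

Feeding it the saved output of the journal for the kernel (however you normally export it) gives a list that can be grouped by ring name to see whether it is always sdma0 or something else.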

This is all with absolutely vanilla Linux 6.12.5 (normal gcc build with no LTO or any other silliness)

I'm not too alarmed by the behaviour triggered by the game crash (games... all bets are off) but it's unusual to have this one occur on the desktop. I hope it's not a sign of graphics card failure (probably not, probably more like the driver failed to power it up when activity needed it)

Anyway, I'm back to the Linux 6.11.9 with the CachyOS patches that I was so happy with and we'll see what happens. I suspect that the things they have done to amdgpu in Linux 6.12.4 and 6.12.5 have caused this. Hopefully my game will crash and I can see if amdgpu recovers with this kernel.

Post by **Grogan** » Tue Dec 17, 2024 9:01 pm

It makes it really hard, when so many things in the environment change. There's Arch constantly changing, but I make it worse by always updating kernel, mesa, wine, vkd3d-proton etc. and using git master for things.

When the game crashes, the 6.11.9 kernel still doesn't recover. It doesn't show an oops on console like 6.12.x does if I VT switch while I still can, but it's likely the same kernel crash. It just shows a blinking cursor, which stops blinking when things grind to a halt in the same way, and then it needs a hard boot and the filesystems fsck'd. If I act quickly I can ctrl-alt-del or press the power button for ACPI shutdown, but it won't shut down or reboot as IRQs are disabled, and systemd just goes through an endless loop of "a stop job is running for..." for several units. If I don't act quickly, once it grinds to a halt no further input is possible (probably because IRQs get disabled).

It used to be that this game did crash occasionally, but it would recover. Either with a DirectX error dialog saying the graphics card has been "removed" (driver reset lol) or at worst, a driver reset causing me to lose my X session, dropping me to console. I don't mind that, as it doesn't cause a bad shutdown.

I noticed this problem started around Mesa 24.3.1 (I don't like that release... I had performance regressions in some games. I was blaming the CachyOS kernel patches etc. for 6.12). The problem got worse with Mesa 25.0.0-devel, though I had great performance with it across the board. I mean, the Veilguard crashes happened far more often with Mesa 25 git master. So I went back to Mesa 24.3.0 and played only Veilguard with it, to reproduce the crashes (so I can't say anything about other games). While less frequent, when Veilguard crashes, it's still a non-recoverable kernel crash.

So I went back to libdrm 2.4.123 which had also changed around the time that this problem started. It did crash once since doing that, and it was unrecoverable.

I also went back to last month's linux-firmware-241110.tar.xz, to recreate the environment as it was.

The last thing I'm trying is going back to my last build of vkd3d-proton 2.13. I had upgraded that to 2.14+ (git master) in the timeframe too. I haven't been able to trigger a crash since switching back to 2.13, but I'm going to keep going like this until it crashes and I'll see if driver recovery works before I put everything back to new again. It might take hours or even days.

What I'm starting to think is that it was the last game update that changed the behaviour, that's crashing my kernel when the game crashes. That was updated in the same timeframe too, a performance and reliability update.

I have not had trouble with any other games with the latest stuff. If they crash, it's either just a game crash or at worst, a driver reset and getting knocked down to console (e.g. Assassin's Creed Valhalla has some areas in England where this might occasionally occur). It's just Veilguard causing the kernel crash. I'm going to be finished with Veilguard for a while after this second playthrough so it will soon be moot. I might also be able to stop the game from crashing by disabling things like Ray Tracing or "Single strand hair" (like HairWorks) but that's not the point. I was trying to find what changed that affected driver recovery.

Post by **Grogan** » Wed Dec 18, 2024 6:31 am

This doesn't excuse the kernel from crashing, but it looks like the culprit triggering it was vkd3d-proton 2.14. I got tired of waiting for a crash (I was even playing the game in the daytime trying to get one). The game wasn't crashing with the wine prefix using my last vkd3d-proton 2.13 build (what it had back when this game was very rarely crashing), so I went back to all the new stuff except vkd3d-proton for that game: Linux 6.12.5 built with clang LTO (since that had nothing to do with the problem), the latest driver firmware, libdrm 2.4.124, and mesa 25.0.0-devel. I don't want to downgrade my shit just for one game anyway.

The game still isn't crashing, where the other day it was crashing every few minutes (or once, even within a few footsteps of starting it) with Linux 6.12.5 and all (and I had tried downgrading mesa the night before). Performance is good, too. Previously, I'd get back after a complete cold boot and fsck etc. and it would crash right away again. This greatly improved with Linux 6.11 and mesa 24.3.0 and the rest of the stuff I downgraded, but it was still crashing more than usual. It's Lutris, so I can point it to whatever runtimes I want, per prefix, and I can continue to use this older vkd3d-proton for this game without affecting anything else. It's the only DirectX 12 game I have in the EA App prefix at this time anyway. Veilguard was the only game I was having a problem with (I have vkd3d-proton 2.14 baked into my proton-tkg for Steam as well), so I want to keep up with vkd3d-proton everywhere else.

Oh, and the X11 power management stopped working because I forgot I had to delete xscreensaver.desktop from /etc/xdg/autostart after the package got upgraded. That starts the daemon, and even though I'm not using it, I guess it disables X11's screen blanking and DPMS. I really ought to remove that package so this doesn't happen again. I just have it installed because I intended to play with the screensavers manually. (I like seeing them, especially to see if they added new ones, but I don't actually ever want screensavers running... that's a horrible thing to do). I figured that out because adding an xset command to my .xinitrc wasn't working... it was being overridden later than that. I was actually fixing to put a .desktop file in autostart to run xset, and that's when I saw xscreensaver.desktop and said doh!
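Since a package upgrade can quietly put that file back, a small check for it is easy to script. This is only a sketch: the function name is made up, and the demo runs against a throwaway directory rather than the real /etc/xdg/autostart mentioned above.

```python
import tempfile
from pathlib import Path


def find_autostart_entries(autostart_dir, needle):
    """Return sorted names of .desktop files in autostart_dir containing needle."""
    directory = Path(autostart_dir)
    if not directory.is_dir():
        return []
    return sorted(
        path.name
        for path in directory.glob("*.desktop")
        if needle in path.name.lower()
    )


# Demo against a temporary directory so nothing on the real system is touched.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "xscreensaver.desktop").write_text("[Desktop Entry]\n")
    Path(tmp, "some-other-applet.desktop").write_text("[Desktop Entry]\n")
    demo_hits = find_autostart_entries(tmp, "xscreensaver")

print(demo_hits)
```

On a real system the call would be something like find_autostart_entries("/etc/xdg/autostart", "xscreensaver"), run after package upgrades. A non-empty result means the daemon will be started again at login.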

Post by **Zema Bus** » Wed Dec 18, 2024 7:57 am

At least you were able to nearly complete two playthroughs before that issue showed up.

I kind of miss the screensaver era

Post by **Grogan** » Wed Dec 18, 2024 9:18 am

Back in 1995 I had a big After Dark screensavers collection that had assloads of mesmerizing screen savers. That was really cool back then. It was nasty too (moreover, it was a 16 bit program that loaded drivers in system.ini and shit) but didn't cause me any problems. I played with those for hours. My favourite one, and the one I left on was "Mountains" and it had a Mars setting. It generated terrain.

Post by **Zema Bus** » Wed Dec 18, 2024 6:54 pm

I had After Dark around that same time; my older sister and brother-in-law gave me a copy. I couldn't see all of them because my 386-SX could only display 16 colors (onboard graphics). Eventually I had an AMD K5 with a video card, I think it was an early Nvidia card, then several years later someone in the Mandrake mailing list sent me his old ATI card since my old card wasn't playing well with Mandrake.

Post by **Grogan** » Wed Dec 18, 2024 9:27 pm

I got my first Windows computer in 1995 (I had an IBM XT clone that ran DOS 3 with WordPerfect and a dot matrix printer before that lol). It was a Pentium 120 with a Cirrus Logic PCI video card with 1 MB of vram (whatever they called it back then). That was a pretty advanced card for back then. I asked somebody what a PCI video card was and they told me "The good kind" lol

It ran Windows 95 original OEM, which was good and reliable for me.

After that, I upgraded to a Pentium II 300 (one of the horizontal mounting SECC cartridges) with an ATI Rage Pro with 4 MB of vram. With Windows 98, I had to ditch After Dark because it was causing blue screens. I used that hardware up until around 2002 I think... I used to compile KDE on it and it took about 8 hours back then (KDE 2.x). After that I got a Pentium 4 system.

Post by **Zema Bus** » Thu Dec 19, 2024 7:35 pm

6.12.6

Post by **Grogan** » Thu Dec 19, 2024 8:10 pm

I'm not seeing it yet, the mirrors must not be in sync. I'm sure I could get it (git tarball snapshot) but I want to see what's in it first.

The Veilguard game finally did crash last night and no, the driver recovery didn't work and the kernel crashed in the same way. It went a long time without crashing with vkd3d-proton 2.13 but ultimately, the problem is still there. I have exhausted all possibilities, so I can conclude it was the last game update that changed the crash behaviour. So now, I've resigned myself to turning off Ray Tracing and Strand Hair to see if that solves the problem. Some of the magic is gone though, there are some things that don't look quite as good. I especially notice the difference in the characters' hair (more matted).

Post by **Grogan** » Thu Dec 19, 2024 11:20 pm

It's live now... Linux 6.12.6 Changelog:

https://cdn.kernel.org/pub/linux/kernel ... Log-6.12.6

x86/static-call: fix 32-bit build

commit 349f0086ba8b2a169877d21ff15a4d9da3a60054 upstream.

In 32-bit x86 builds CONFIG_STATIC_CALL_INLINE isn't set, leading to
static_call_initialized not being available.

Define it as "0" in that case.

I'm glad they are trying not to break 32-bit builds of current kernels. There are still plenty of old devices out there.

Lots of updates pertaining to the Xen hypervisor.

Bluetooth fixes

Net fixes

Wifi fixes

ACPI fixes

XFS filesystem fixes

This looks significant:

block: Fix potential deadlock while freezing queue and acquiring sysfs_lock

[ Upstream commit be26ba96421ab0a8fa2055ccf7db7832a13c44d2 ]

For storing a value to a queue attribute, the queue_attr_store function
first freezes the queue (->q_usage_counter(io)) and then acquire
->sysfs_lock. This seems not correct as the usual ordering should be to
acquire ->sysfs_lock before freezing the queue. This incorrect ordering
causes the following lockdep splat which we are able to reproduce always
simply by accessing /sys/kernel/debug file using ls command...

This is the kind of stuff I'm watching for:

drm/amdkfd: hard-code MALL cacheline size for gfx11, gfx12

commit d50bf3f0fab636574c163ba8b5863e12b1ed19bd upstream.

This information is not available in ip discovery table.

drm/amdkfd: hard-code cacheline size for gfx11

commit 321048c4a3e375416b51b4093978f9ce2aa4d391 upstream.

This information is not available in ip discovery table.

drm/amdkfd: Dereference null return value

commit a592bb19abdc2072875c87da606461bfd7821b08 upstream.

In the function pqm_uninit there is a call-assignment of "pdd =
kfd_get_process_device_data" which could be null, and this value was
later dereferenced without checking.

drm/amdgpu: fix when the cleaner shader is emitted

commit f4df208177d02f1c90f3644da3a2453080b8c24f upstream.

Emitting the cleaner shader must come after the check if a VM switch is
necessary or not.

Otherwise we will emit the cleaner shader every time and not just when it is
necessary because we switched between applications.

This can otherwise crash on gang submit and probably decreases performance
quite a bit.

v2: squash in fix from Srini (Alex)

drm/amd/pm: Set SMU v13.0.7 default workload type

commit 3912a78cf72eb45f8153a395162b08fef9c5ec3d upstream.

Set the default workload type to bootup type on smu v13.0.7.
This is because of the constraint on smu v13.0.7.
Gfx activity has an even higher set point on 3D fullscreen
mode than the one on bootup mode. This causes the 3D fullscreen
mode's performance is worse than the bootup mode's performance
for the lightweighted/medium workload. For the high workload,
the performance is the same between 3D fullscreen mode and bootup
mode.

v2: set the default workload in ASIC specific file

drm/amdgpu: fix UVD contiguous CS mapping problem

commit 12f325bcd2411e571dbb500bf6862c812c479735 upstream.

When starting the mpv player, Radeon R9 users are observing
the below error in dmesg.

[drm:amdgpu_uvd_cs_pass2 [amdgpu]]
*ERROR* msg/fb buffer ff00f7c000-ff00f7e000 out of 256MB segment!

The patch tries to set the TTM_PL_FLAG_CONTIGUOUS for both user
flag(AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS) set and not set cases.

v2: Make the TTM_PL_FLAG_CONTIGUOUS mandatory for user BO's.
v3: revert back to v1, but fix the check instead (chk).
