Linux 6.12 was released
Summary: This release includes realtime support (PREEMPT_RT), a feature that has been in the works for 20 years. It also includes complete support for the EEVDF task scheduler; the ability to write task scheduling algorithms using BPF; support for printing a QR code on panic screens with debug information; support for zero-copy receive of TCP payloads to a DMABUF region of memory while packet headers land separately in normal kernel buffers; a new Linux security module that enforces that binaries must come from integrity-protected storage; support for Memory Protection Keys on ARM; and XFS support for block sizes larger than a memory page. As always, there are many other features, new drivers, improvements and fixes.
Real Time support
After 20 years of work, the real-time patchset has been merged
The idea behind realtime is to make as much kernel code fully preemptible as possible, which provides lower latencies (possibly at the expense of throughput).
During these two decades, the people working on the RT patchset had to write and rewrite a lot of code in order to support the realtime capabilities better. Most of that work turned out to be good for mainline as well, and many features that have been incorporated into Linux during all this time originated in the RT patchset, even if they didn't look like it. The final step, the rewrite of printk()
Recommended documentation:
Recommended LWN articles:
- A realtime preemption overview
- Revisiting the kernel's preemption models (part 1)
- A Q&A about the realtime patches
- The real realtime preemption end game
Complete the EEVDF task scheduler
In Linux 6.6
Recommended LWN article: Completing the EEVDF scheduler
Documentation: EEVDF Scheduler
BPF based task scheduling algorithms with sched_ext
Task scheduling algorithms are a complex topic, and experimentation or even personalization can provide great improvements. This release includes the first pieces of {{{sched_ext}}}, a feature that makes it possible to write task scheduling algorithms in BPF, which provides much faster development iteration and enables personalization of task scheduling.
Recommended LWN article: The extensible scheduler class
Documentation:
QR codes on panic screens
Panic information is often hard to turn into text. This release adds an optional new panic screen, with a QR code and the kernel buffer (dmesg) data embedded. The kmsg data is compressed with zlib, encoded as a numeric segment, and appended to the URL as a URL parameter. This saves space and fits roughly 7500 bytes of kmsg data in a V40 QR code. Linux distributions can customize the URL and provide a web frontend that opens a bug report directly with the kmsg data.
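The space saving comes from the QR numeric mode, which stores three decimal digits in 10 bits. The following is a minimal sketch of the idea, not the kernel's implementation. It assumes a packing of 13 input bits into 4 decimal digits (every 13-bit value is below 10000), which is an illustrative choice, and it uses the 7089-digit numeric capacity of a version 40 code at the lowest error-correction level while ignoring the URL prefix and segment headers. All function names are hypothetical.

```python
import zlib

V40_NUMERIC_DIGITS = 7089  # numeric capacity of a V40 QR code at ECC level L
BITS_PER_GROUP = 13        # assumption: 13 bits per group, since 2**13 = 8192 < 10**4
DIGITS_PER_GROUP = 4


def bytes_to_digits(data: bytes) -> str:
    """Pack bytes into decimal digits, 13 bits per 4-digit group."""
    bits = "".join(f"{b:08b}" for b in data)
    # Pad the tail with zero bits so the length is a multiple of 13.
    bits += "0" * (-len(bits) % BITS_PER_GROUP)
    groups = (bits[i:i + BITS_PER_GROUP] for i in range(0, len(bits), BITS_PER_GROUP))
    return "".join(f"{int(g, 2):04d}" for g in groups)


def digits_to_bytes(digits: str, length: int) -> bytes:
    """Inverse of bytes_to_digits; 'length' is the original byte count."""
    bits = "".join(
        f"{int(digits[i:i + DIGITS_PER_GROUP]):013b}"
        for i in range(0, len(digits), DIGITS_PER_GROUP)
    )
    return bytes(int(bits[i:i + 8], 2) for i in range(0, length * 8, 8))


def max_compressed_bytes() -> int:
    """Compressed bytes that fit if the whole code held only packed digits."""
    groups = V40_NUMERIC_DIGITS // DIGITS_PER_GROUP
    return groups * BITS_PER_GROUP // 8


if __name__ == "__main__":
    # Made-up, repetitive log text standing in for real kmsg data.
    kmsg = b"[    0.000000] example kernel log line\n" * 100
    packed = bytes_to_digits(zlib.compress(kmsg))
    print(len(kmsg), "raw bytes ->", len(packed), "decimal digits")
    print("compressed byte budget:", max_compressed_bytes())  # 2879
```

Under these assumptions the code holds about 2879 compressed bytes, so reaching roughly 7500 bytes of raw kmsg text requires zlib to shrink the log by a factor of about 2.6.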
Device Memory TCP for faster network device transfers
Device Memory TCP (devmem TCP) provides the ability to zero-copy receive TCP payloads to a DMABUF region of memory while packet headers land separately in normal kernel buffers.
Today, the majority of the Device-to-Device data transfers to the network are implemented as the following low level operations: Device-to-Host copy, Host-to-Host network transfer, and Host-to-Device copy. The implementation is suboptimal, especially for bulk data transfers, and can put significant strain on system resources, such as host memory bandwidth, PCIe bandwidth, etc. One important reason behind the current state is the kernel's lack of semantics to express device to network transfers.
Device Memory TCP (devmem TCP) attempts to optimize this use case by implementing socket APIs that enable the user to: 1. send device memory across the network directly, and 2. receive incoming network packets directly into device memory.
Recommended LWN article: Direct-to-device networking
Documentation: Device Memory TCP
Integrity Policy Enforcement to restrict execution to trusted binaries
Integrity Policy Enforcement is a new Linux security module that restricts execution to only those binaries that come from integrity-protected storage, e.g. fs-verity, dm-verity, or even initramfs.
Documentation: Integrity Policy Enforcement
perf ftrace profile, for better profiling
This release adds a 'perf ftrace profile' command to the perf tool. It collects function execution profiles using the function-graph tracer, so that users can easily see the total, average and maximum execution time as well as the number of invocations. The following is a profile for the perf_event_open syscall.
{{{
$ sudo perf ftrace profile -G __x64_sys_perf_event_open -- \
perf stat -e cycles -C1 true 2> /dev/null | head
# Total (us) Avg (us) Max (us) Count Function
65.611 65.611 65.611 1 __x64_sys_perf_event_open
30.527 30.527 30.527 1 anon_inode_getfile
30.260 30.260 30.260 1 __anon_inode_getfile
29.700 29.700 29.700 1 alloc_file_pseudo
17.578 17.578 17.578 1 d_alloc_pseudo
17.382 17.382 17.382 1 __d_alloc
16.738 16.738 16.738 1 kmem_cache_alloc_lru
15.686 15.686 15.686 1 perf_event_alloc
14.012 7.006 11.264 2 obj_cgroup_charge
}}}
ARM Permission Overlay Extension to support Memory Protection Keys
This release implements ARM support for the Permission Overlay Extension (POE), which makes it possible to constrain permissions on memory regions. This can be used from userspace (EL0) without a system call or TLB invalidation. POE is used to implement the Memory Protection Keys
XFS support for block sizes larger than page size
This release adds VFS support for block sizes larger than the page size, with XFS being the first filesystem that supports it.
Smaller struct file
{{{struct file}}}, the data structure used to keep information about an open file in Linux, has been reduced from 232 bytes to 184 bytes (3 cachelines).
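The cacheline count follows from rounding the structure size up to a whole number of lines. A quick check of the arithmetic, assuming 64-byte cachelines (common on current x86-64 and ARM64 systems, but an assumption here; the helper name is hypothetical):

```python
CACHELINE_BYTES = 64  # assumption: 64-byte cachelines


def cachelines(size_bytes: int, line: int = CACHELINE_BYTES) -> int:
    """Number of cachelines needed to hold size_bytes (ceiling division)."""
    return -(-size_bytes // line)


OLD_SIZE = 232  # struct file size before this release, from the text above
NEW_SIZE = 184  # struct file size in this release, from the text above

if __name__ == "__main__":
    print("before:", cachelines(OLD_SIZE), "cachelines")  # 4
    print("after: ", cachelines(NEW_SIZE), "cachelines")  # 3
    print("saved: ", OLD_SIZE - NEW_SIZE, "bytes per open file")  # 48
```

Under that assumption the 48-byte reduction takes the structure from four cachelines down to three.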