Ready for version 6.3

Linux 6.3 was released

Summary: This release includes Btrfs performance and fragmentation improvements, support for non-executable memfds, eBPF support for HID devices, support for IPv4 BIG TCP, new IDs in the rseq system call, support for new AMD QoS features, specifications for the netlink protocol, more secure NFS encryption, and multi-actuator support in the BFQ I/O scheduler. As always, there are many other features, new drivers, improvements and fixes.

Btrfs performance and fragmentation improvements

This release adds a number of heuristics to the block allocator that group files with similar fragmentation characteristics together and keep them separate from other types. This can help to avoid fragmentation in some cases, and in particular it may help during balance.

There are also a few notable performance improvements: the send functionality now caches directory utimes and only emits the command when necessary, which can speed up send by up to 10x. The fiemap ioctl can now be up to 3x faster when extents are shared, and there are some micro-optimizations that can speed up file creation in synthetic benchmarks by up to 10%.

eBPF support for HID devices, and other BPF enhancements

As usual, this release includes a number of BPF enhancements, such as an rbtree data structure that follows the recently added linked list. Because of previous limitations, BPF users had to use the BPF map structures (hash, array) even for data that does not fit well in them. With the introduction of kfuncs, kptrs, and the any-context BPF allocator, it is now possible to implement an rbtree data structure that exposes the kernel's red-black tree structures more naturally.

This release also includes a somewhat exotic use of BPF: using eBPF programs as a way to add small features and tweaks to existing input HID drivers. For example, as a joystick gets older, it is common to see it wobbling around its neutral point. This is usually filtered at the application level by adding a dead zone for the affected axis. With HID-BPF, it is possible to add the filter directly in the kernel, so userspace does not get woken up when nothing else is happening on the input controller.
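The dead-zone filtering described above can be sketched in a few lines. This is plain Python that only illustrates the logic; a real HID-BPF filter would be an eBPF program attached to the device. The names `filter_axis_events`, `center` and `dead_zone` are made up for this illustration.

```python
def filter_axis_events(samples, center=128, dead_zone=4):
    """Yield only the axis values that would need to reach userspace.

    Hypothetical helper: values within `dead_zone` of `center` are snapped
    to `center`, and a value is reported only when it differs from the
    last reported one.
    """
    last_reported = None
    for value in samples:
        # Treat small wobble around the neutral point as "no movement".
        if abs(value - center) <= dead_zone:
            value = center
        # Suppress events that carry no new information.
        if value != last_reported:
            last_reported = value
            yield value


if __name__ == "__main__":
    raw = [128, 129, 127, 130, 140, 141, 129, 128]
    print(list(filter_axis_events(raw)))
```

With the sample input, only the initial neutral position, the two real movements, and the return to neutral are reported; the wobble in between is dropped.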

Other uses include adding a feature that requires a new kernel API, morphing a device into something else and controlling that from userspace, preventing users from accessing a feature (e.g. a broken firmware feature), or simply tracing HID events.

Recommended LWN article: BPF for HID drivers (https://lwn.net/Articles/909109/)

Non-executable memfds

memfds are file descriptors that refer to an area of anonymous process memory instead of a file on a file system. This release adds flags that allow disabling the executability of these files, and it is also possible to seal them.

Recommended LWN article: Enabling non-executable memfds

Support for IPv4 BIG TCP (TSO frames larger than 64kB)

This is an IPv4 implementation of BIG TCP, which allows bigger TSO/GRO packet sizes for IPv4 traffic. Reducing the number of packets traversing the networking stack usually improves performance. This is similar to the IPv6 BIG TCP feature, but for the IPv4 family.
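As a rough illustration of why larger TSO/GRO sizes help, the sketch below counts how many frames the stack has to handle for a given transfer. The 256 KiB size is an arbitrary assumption chosen for the example, not a limit taken from this release, and `frames_needed` is a hypothetical helper.

```python
def frames_needed(total_bytes, max_frame_bytes):
    """Number of TSO/GRO frames needed to carry total_bytes (ceiling division)."""
    return -(-total_bytes // max_frame_bytes)


if __name__ == "__main__":
    transfer = 1 << 30  # 1 GiB of payload
    for size in (64 * 1024, 256 * 1024):  # classic limit vs. an assumed larger size
        print(f"{size // 1024} KiB frames: {frames_needed(transfer, size)}")
```

Every frame that does not have to be built, queued and processed saves per-packet overhead in the stack, which is where the improvement comes from.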

Recommended LWN article: Going big with TCP packets

Add new ids to the rseq system call for faster and more efficient memory allocators

The rseq(2) system call (short for "restartable sequences") was added in Linux 4.18.

This release extends the rseq(2) system call to also expose other identification numbers that do some of the heavy lifting needed by, for example, memory allocators, allowing them to use per-cpu data structures more efficiently:

* NUMA node id: This makes it possible to obtain the NUMA node id more efficiently than with getcpu(2), which allows memory allocators such as tcmalloc to take advantage of this fast access to perform NUMA-aware memory allocation. It can also be useful for implementing fast paths for NUMA-aware user-space mutexes, and even allows implementing getcpu(2) purely in user space.

* Per-memory map concurrency ID: This concurrency ID is within the range of possible cpus, and is temporarily (and uniquely) assigned to a memory map while threads are actively running within it. If a memory map has fewer threads than cores, or is limited to run on a few cores concurrently through sched affinity or cgroup cpusets, the concurrency IDs will be values close to 0, thus allowing efficient use of user-space memory for per-cpu data structures.

* NUMA-aware concurrency id: It is similar to the concurrency ID, except that it also provides the NUMA node id with which each concurrency id has been associated, and a concurrency id is guaranteed never to change NUMA node unless a kernel-level NUMA configuration change happens. This makes it possible to create per-cpu structures in environments where a process, or a set of processes belonging to a cpuset, is pinned to a set of cores which belong to a subset of the system's NUMA nodes. In those situations, it is possible to benefit from the compactness of concurrency IDs over CPU ids, while keeping NUMA locality, when indexing a per-cpu data structure.
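The compactness of concurrency IDs can be illustrated with a toy model. The sketch below is not the kernel's implementation; it only assumes that a running thread gets the lowest free ID and gives it back when it stops running. `ConcurrencyIdPool` and `slots_needed` are hypothetical names.

```python
import heapq


class ConcurrencyIdPool:
    """Toy model: each running thread holds the lowest free ID."""

    def __init__(self, nr_possible_cpus):
        # A sorted list already satisfies the heap invariant.
        self._free = list(range(nr_possible_cpus))

    def acquire(self):
        # Called when a thread starts running; at most nr_possible_cpus
        # threads can run at once, so the pool is never empty here.
        return heapq.heappop(self._free)

    def release(self, cid):
        # Called when the thread stops running.
        heapq.heappush(self._free, cid)


def slots_needed(ids_seen):
    """Size of a per-ID array that can hold every ID observed so far."""
    return max(ids_seen) + 1


if __name__ == "__main__":
    pool = ConcurrencyIdPool(nr_possible_cpus=64)
    seen = set()
    running = [pool.acquire() for _ in range(3)]  # three threads running
    seen.update(running)
    pool.release(running.pop(1))                  # one is scheduled out
    seen.add(pool.acquire())                      # another is scheduled in
    print("slots needed:", slots_needed(seen), "instead of 64")
```

In this model, a process that never runs more than three threads at once only ever sees IDs 0 to 2, so a per-ID data structure needs three slots even on a machine with 64 possible CPUs.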

Support for new AMD QoS features

This release adds support for new AMD QoS features: Slow Memory Bandwidth Allocation (SMBA) and Bandwidth Monitoring Event Configuration (BMEC). These extensions are intended to provide for the monitoring of the usage of certain system resources by one or more processors and for the separate allocation and enforcement of use limits:

1. Slow Memory Bandwidth Allocation (SMBA): With this feature, the QoS enforcement policies can be applied to the external slow memory connected to the host. Currently, CXL.memory is the only supported "slow" memory device.

2. Bandwidth Monitoring Event Configuration (BMEC): The bandwidth monitoring events mbm_total_event and mbm_local_event are set to count all the total and local reads/writes respectively.

Official site: AMD64 Technology Platform Quality of Service Extensions

Netlink protocol specifications

The netlink protocol is a networking protocol used for communication between user space programs and the kernel. For example, it is used to configure and gather information about wireless devices. Adding new communication endpoints requires manually adding them to userspace libraries. This release adds machine-readable netlink protocol descriptions in YAML. The expectation is that the spec can be used to dynamically translate between netlink messages and whatever types the high-level language likes. Currently only genetlink is supported.
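A minimal sketch of the kind of spec-driven translation described above, under simplifying assumptions. The `SPEC` dictionary is a hand-written, hypothetical stand-in for a parsed YAML description and does not follow the real schema; `encode_attrs` is likewise made up for this example. The only part taken from netlink itself is the attribute layout: a 16-bit length, a 16-bit type, the payload, and padding to a 4-byte boundary.

```python
import struct

# Hypothetical, heavily simplified stand-in for a parsed YAML spec.
SPEC = {
    "attributes": [
        {"name": "id", "value": 1, "type": "u32"},
        {"name": "name", "value": 2, "type": "string"},
    ]
}

_SCALAR_FORMATS = {"u8": "B", "u16": "H", "u32": "I", "u64": "Q"}


def encode_attrs(spec, values):
    """Encode a dict of Python values as a sequence of netlink attributes."""
    by_name = {attr["name"]: attr for attr in spec["attributes"]}
    out = b""
    for name, value in values.items():
        attr = by_name[name]
        if attr["type"] == "string":
            payload = value.encode() + b"\0"  # NUL-terminated string
        else:
            # "=" means host byte order with standard sizes, no alignment.
            payload = struct.pack("=" + _SCALAR_FORMATS[attr["type"]], value)
        length = 4 + len(payload)  # header plus payload, excluding padding
        out += struct.pack("=HH", length, attr["value"]) + payload
        out += b"\0" * (-length % 4)  # pad to a 4-byte boundary
    return out


if __name__ == "__main__":
    print(encode_attrs(SPEC, {"id": 7, "name": "ab"}).hex())
```

The point of the design is that supporting a new attribute only requires a new entry in the description; the encoding code itself does not change.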

More secure NFS encryption

This release hardens the security provided by the Linux kernel's RPCSEC GSS Kerberos 5 mechanism (used by NFS). It disables DES-based enctypes by default, provides a mechanism for disabling SHA1-based enctypes, and introduces two modern AES-SHA2-based enctypes that do not use deprecated crypto algorithms.

Multi-actuator support in the BFQ I/O scheduler

Some traditional hard drives have more than one arm. In order to optimize performance, the I/O scheduler must attempt to keep both arms busy. This release adds some support for such multi-actuator drives to the BFQ I/O scheduler.