Lista para version 5.7

Linux 5.7 was released

Summary: This release adds: support for the notion of Thermal Pressure, which lets the task scheduler make better scheduling decisions in the face of CPU frequency changes; support for frequency invariant scheduler accounting on x86 CPUs, which makes x86 perform better with the schedutil governor; a new and better exFAT file system implementation; support for an x86 feature that detects atomic operations that span cache lines; ARM Pointer Authentication support for kernel code, which helps to prevent security issues; support for spawning processes with clone3() into cgroups; write protection support in userfaultfd(), which is equivalent to (but faster than) using mprotect(2) and a SIGSEGV signal handler; and a BPF-based Linux Security Module which allows for more dynamic security auditing. As always, there are many other new drivers and improvements.

Thermal Pressure in the task scheduler

When a CPU is overheating, the thermal governor will usually cap the maximum CPU frequency. This, however, decreases the maximum available compute capacity of that CPU. If the task scheduler is not immediately aware of those frequency changes, it will make wrong scheduling decisions, assuming that the CPU has greater computing capacity than it actually has. This release introduces the notion of Thermal Pressure, which makes the task scheduler more aware of frequency capping and leads to better task placement among the available CPUs in the event of overheating, which in turn leads to better performance numbers.
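The idea can be illustrated with a small model. This is only a sketch, not the kernel's implementation: the function names, the dictionary layout and the example frequencies are made up for illustration, and the only value borrowed from the kernel is the 1024 capacity scale. Thermal pressure is modelled here as the capacity lost to the frequency cap. The real scheduler tracks this signal over time rather than reading a single instantaneous value, which is left out.

```python
SCHED_CAPACITY_SCALE = 1024  # kernel's fixed-point value for a full-capacity CPU


def thermal_pressure(max_capacity, max_freq_khz, capped_freq_khz):
    """Capacity lost to thermal capping (hypothetical helper, same units as max_capacity)."""
    capped_capacity = max_capacity * capped_freq_khz // max_freq_khz
    return max_capacity - capped_capacity


def available_capacity(cpu):
    """Capacity that is actually usable once the cap is taken into account."""
    pressure = thermal_pressure(
        cpu["capacity"], cpu["max_freq_khz"], cpu["capped_freq_khz"]
    )
    return cpu["capacity"] - pressure


def pick_cpu(cpus):
    """Return the name of the CPU with the most usable capacity."""
    return max(cpus, key=available_capacity)["name"]


# Illustrative values: cpu0 is the bigger CPU, but it is capped to half speed.
cpus = [
    {"name": "cpu0", "capacity": SCHED_CAPACITY_SCALE,
     "max_freq_khz": 2_000_000, "capped_freq_khz": 1_000_000},
    {"name": "cpu1", "capacity": 800,
     "max_freq_khz": 1_600_000, "capped_freq_khz": 1_600_000},
]

if __name__ == "__main__":
    for cpu in cpus:
        print(cpu["name"], "usable capacity:", available_capacity(cpu))
    print("preferred CPU:", pick_cpu(cpus))
```

A scheduler that ignored the cap would prefer cpu0 because of its nominal capacity of 1024. Once the lost capacity is subtracted, cpu1 is the better choice in this model.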

Recommended LWN article: Telling the scheduler about thermal pressure

Frequency invariant scheduler accounting on x86 CPUs

Suppose a CPU has two frequencies: 500 and 1000 MHz. A task that would consume 1/3 of the CPU at 1000 MHz appears to consume approximately 2/3 when running at 500 MHz, giving the false impression that this CPU is almost at capacity, even though it can go faster. Without frequency scale-invariance, tasks look larger just because the CPU is running slower. This makes the schedutil cpufreq governor, which uses scheduler-provided CPU utilization information as input for its decisions, make wrong decisions and perform worse.

This release implements frequency invariant scheduler accounting on (some) x86 CPUs. This makes capacity estimates more precise and keeps tasks on the same CPU better in the face of dynamic voltage and frequency scaling. Because of the improved behavior, the intel_pstate driver now defaults to the schedutil governor.
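The arithmetic of the 500/1000 MHz example above can be worked through in a few lines. This is a simplified model rather than the kernel's accounting code; the function names are hypothetical, and exact fractions are used only to keep the numbers easy to read.

```python
from fractions import Fraction


def apparent_utilization(busy_at_max, cur_freq_mhz, max_freq_mhz):
    """Fraction of time the CPU looks busy when running at cur_freq_mhz."""
    stretched = busy_at_max * Fraction(max_freq_mhz, cur_freq_mhz)
    return min(Fraction(1), stretched)  # a CPU cannot be more than 100% busy


def invariant_utilization(apparent, cur_freq_mhz, max_freq_mhz):
    """Scale the observed busy time by cur/max to undo the frequency effect."""
    return apparent * Fraction(cur_freq_mhz, max_freq_mhz)


task = Fraction(1, 3)                            # cost of the task at 1000 MHz
seen = apparent_utilization(task, 500, 1000)     # what a naive observer measures
scaled = invariant_utilization(seen, 500, 1000)  # frequency-invariant figure

if __name__ == "__main__":
    print("apparent utilization at 500 MHz:", seen)
    print("frequency-invariant utilization:", scaled)
```

With scaling applied, the task is reported as 1/3 of the CPU at either frequency, so a governor reading this figure sees that the CPU still has headroom.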

Recommended LWN article: Frequency-invariant utilization tracking for x86

New exFAT file system

Linux 5.4 added

Split lock detection

A split lock occurs when an atomic CPU instruction operates on data that spans two cache lines. This is much slower than an atomic operation within a single cache line, and it disrupts performance on other cores. This release adds support for an x86 feature that detects split locks. Using the {{{split_lock_detect}}} boot command line option, it is possible to warn about applications that make use of split locks, or even to send them SIGBUS.
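Whether an access would straddle a cache line boundary is a matter of simple address arithmetic. The sketch below assumes 64-byte cache lines, which is common on x86 but not guaranteed. The function name is hypothetical, and the code only illustrates the condition that the hardware detects, not the detection mechanism itself.

```python
CACHE_LINE_SIZE = 64  # assumption: typical x86 cache line size in bytes


def spans_cache_lines(address, size, line_size=CACHE_LINE_SIZE):
    """Return True if the bytes [address, address + size) touch more than one line."""
    if size <= 0:
        raise ValueError("size must be positive")
    first_line = address // line_size
    last_line = (address + size - 1) // line_size
    return first_line != last_line


if __name__ == "__main__":
    # An aligned 8-byte value stays inside one line.
    print(hex(0x1000), spans_cache_lines(0x1000, 8))
    # An 8-byte value starting 4 bytes before a line boundary is split.
    print(hex(0x103C), spans_cache_lines(0x103C, 8))
```

Keeping atomically accessed variables naturally aligned (for example, 8-byte values on 8-byte boundaries) is enough to avoid the split case.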

Recommended LWN article: Developers split over split-lock detection

ARM Kernel Pointer Authentication support

Linux 5.0 added support

Recommended LWN article: ARM pointer authentication

userfaultfd() write protection support

This release adds to userfaultfd(2) -a system call added in Linux 4.3

Recommended LWN article: Write-protect for userfaultfd()

bpf-lsm: A BPF-based Linux Security Module

The current kernel infrastructure for providing telemetry (Audit, Perf, etc.) is disjoint from access enforcement (i.e. LSMs). Augmenting the information provided by audit requires kernel changes to audit, its policy language and user-space components. Furthermore, building a MAC policy based on the newly added telemetry data requires changes to various LSMs and their respective policy languages. This release adds a new LSM that allows BPF programs to be attached to LSM hooks, which facilitates a unified and dynamic (not requiring re-compilation of the kernel) audit and MAC policy.

Recommended LWN article: KRSI — the other BPF security module

clone3(): Allow spawning processes into cgroups

This release adds support in clone3(2) for creating a process in a different cgroup than its parent, which means that callers can limit and account processes and threads right from the moment they are spawned. A service manager can directly spawn new services into dedicated cgroups; a process can be directly created in a frozen cgroup and will be frozen as well; the initial accounting jitter experienced by process supervisors and daemons is eliminated; threaded applications or even thread implementations can choose to create a specific cgroup layout where each thread is spawned directly into a dedicated cgroup.
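A rough sketch of how a caller might use this from user space follows. It mirrors {{{struct clone_args}}} with ctypes, sets the {{{CLONE_INTO_CGROUP}}} flag and passes a file descriptor for the target cgroup directory. Several things here are assumptions: the helper names are hypothetical, the syscall number 435 is the one used on x86-64 and arm64, the target is assumed to be a cgroup v2 directory the caller may write to, and invoking a raw fork-like syscall from Python is only reasonable because the child execs immediately. A real service manager would do this in C.

```python
import ctypes
import os
import signal

CLONE_INTO_CGROUP = 0x200000000  # flag value from <linux/sched.h>
SYS_CLONE3 = 435                 # assumption: x86-64 / arm64 syscall number


class CloneArgs(ctypes.Structure):
    """Mirror of struct clone_args as of Linux 5.7; every field is a u64."""
    _fields_ = [(name, ctypes.c_uint64) for name in (
        "flags", "pidfd", "child_tid", "parent_tid", "exit_signal",
        "stack", "stack_size", "tls", "set_tid", "set_tid_size", "cgroup",
    )]


def build_clone_args(cgroup_fd):
    """Fill in the arguments for a fork-like clone3() into the given cgroup."""
    args = CloneArgs()
    args.flags = CLONE_INTO_CGROUP
    args.exit_signal = int(signal.SIGCHLD)
    args.cgroup = cgroup_fd
    return args


def spawn_into_cgroup(cgroup_path, argv):
    """Start argv directly inside cgroup_path and return the child's PID."""
    cgroup_fd = os.open(cgroup_path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        args = build_clone_args(cgroup_fd)
        libc = ctypes.CDLL(None, use_errno=True)
        libc.syscall.restype = ctypes.c_long
        pid = libc.syscall(ctypes.c_long(SYS_CLONE3), ctypes.byref(args),
                           ctypes.c_size_t(ctypes.sizeof(args)))
        if pid < 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        if pid == 0:
            # Child: already a member of the target cgroup at this point.
            try:
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)  # only reached if exec failed
        return pid
    finally:
        os.close(cgroup_fd)
```

A caller could then run something like {{{spawn_into_cgroup("/sys/fs/cgroup/my-service", ["sleep", "60"])}}}, where the path is a placeholder for an existing cgroup. Because the child starts life inside the target cgroup, there is no window in which it runs unaccounted in the parent's cgroup.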

Recommended LWN article: Cloning into a control group

Improved perf cgroup profiling

In the past, perf could only profile tasks in a specific cgroup, and there was no way to know which cgroup the current sample belonged to. In this release, perf incorporates cgroup information into each sample, which makes it possible to profile more than one cgroup and to use a cgroup sort key in perf report.