Configuring Photon Real-Time Operating System for Real-Time Applications
Photon Real-Time (RT) Operating System (OS) (and the Linux kernel
PREEMPT_RT patchset that it is based on) is optimized to support low-latency real-time scheduling and minimize the OS jitter as observed by real-time applications. However, to get the most out of Photon RT OS, it is must to have a proper system configuration. To run low-latency real time applications effectively, the sources of jitter have to be identified and eliminated across all layers of the underlying system, spanning the BIOS / firmware, the hypervisor, and the guest operating system (Photon RT).
Tuning a system for real time operation starts from the lowest layers of the software stack, namely the System BIOS or Platform Firmware. The goal is to configure the settings for the following functions:
Maximize Performance Ex: Set CPU, memory and device power management modes to maximum performance, disable CPU idle states
Minimize Computational Jitter Ex: Disable Turbo Boost, disable Hyper-Threading
Minimize System Management Interrupts Ex: Disable options such as Processor Power and Utilization Monitoring, memory Pre-Failure Notification, and so on
Platform vendors often publish low-latency tuning guides for their BIOS/firmware. Refer documentation to learn about the recommended low-latency settings specific to your platform.
Deploying Real-Time Applications on Photon Real-Time Operating System
A general strategy to deploy real-time applications on Photon RT is described as follows:
Partition CPUs between the OS and the RT workload: Among the available CPUs in the system, isolate a subset of CPUs, designated to run the RT workload. By default, the Linux scheduler will only run tasks on non-isolated CPUs, leaving the isolated CPUs to those tasks that are explicitly bound to them. Thus, all the housekeeping tasks of the OS will execute on non-isolated CPUs (with a few exceptions, such as per-CPU kernel threads). Then bind the RT workload to the isolated CPUs.
Steer unrelated interrupts away from the CPUs running the RT workload: Linux supports the ability to affine most interrupts to specific CPUs in the system. By using this mechanism, interrupts that are not relevant to the real-time workload can be affined to non-isolated CPUs, thus avoiding the jitter caused by interrupt handling latency on the isolated CPUs.
This strategy provides two important benefits:
It limits OS interference with the RT workload.
It protects the OS services from getting starved by the CPU-intensive RT tasks.
This configuration can be achieved using a combination of kernel command-line options, and user space packages, as discussed in the following sections.
Kernel Command-Line Parameters
isolcpus=X,Y-Z (Ex: isolcpus=2,4-5)
irqaffinity=X,Y-Z (Ex: irqaffinity=0-1,3)[ Usually it is the inverse of
rcu_nocbs=X,Y-Z [ Usually it is same as
NOHZ (Eliminating the periodic timer)
nohz_full=X,Y-Z[ Usually it is same as
idle=halt or idle=poll
nosoftlockup nowatchdog nmi_watchdog=0
Timer skew detection
skew_tick=1 clocksource=tsc tsc=reliable
The full list of kernel command-line parameters and their descriptions are available at https://www.kernel.org/doc/html/v6.1/admin-guide/kernel-parameters.html
Tuned is a system tuning daemon that offers several profiles to tailor the OS to various usecases, including a ‘realtime’ profile for low-latency workloads.
The realtime tuned profile can be applied as shown below:
tdnf install tuned
systemctl enable tuned
systemctl start tuned
/etc/tuned/realtime-variables.conf (by uncommenting the isolated_cores= parameter):
$ cat /etc/tuned/realtime-variables.conf
Note: The cores configured as isolated in tuned should be consistent with isolcpus in the kernel command-line.
tuned-adm profile realtime
The stalld daemon monitors the system for starved tasks and revives them by giving them a temporary boost using the
stalld offers fine-grained controls to give starved tasks a user-specified amount of CPU time.
The stalld configuration file is
The key parameters are Starving Threshold (THRESH), Boost Period (BP), Boost Runtime (BR), and Boost Duration (BD).
The mode of operation is as follows:
If a task is starved for at least
THRESH seconds, it is scheduled using
SCHED_DEADLINE scheduling policy, so that it will run at least
BR nanoseconds in every
BP nanoseconds time period, and this repeats up to
BD seconds, after which the task gets back its original scheduler policy/priority settings.
Real Time Scheduling Policies
The Linux kernel offers several scheduling policies to support various applications, among which the real time policies are highlighted below:
SCHED_OTHER (default policy), SCHED_BATCH, SCHED_IDLE (non real-time policies)
SCHED_FIFO (First-In First-Out Real Time Scheduling)
Priority Range: 1 to 99 (highest)
Algorithm: The scheduler runs the highest-priority runnable task in the SCHED_FIFO scheduling class, until it yields (blocks/waits) the CPU voluntarily.
- SCHED_RR (Round-Robin Real Time Scheduling)
Priority Range: 1 to 99 (highest)
Algorithm: The scheduler runs the highest-priority SCHED_RR task, and time-slices between equal-priority SCHED_RR tasks in configurable intervals.
- SCHED_DEADLINE ( Earliest Deadline First Real Time Scheduling)
Key parameters: Runtime, Period and Deadline, which can be configured on a per-task basis.
Algorithm: The scheduler gives a SCHED_DEADLINE task at least
Runtimeamount of time on the CPU in every
Periodtime period, before
Deadlinetime is up.
Real Time Throttling
The Linux kernel offers proc file system (procfs) controls to influence real-time task scheduling and throttling.
The RT throttling algorithm is as follows:
All real-time tasks are throttled to run up to
runtimemicroseconds, in every
periodmicroseconds. The remaining time in
periodmicroseconds is used to run non-RT tasks in the system.
periodvalues can be configured by writing to the files listed as follows:
/proc/sys/kernel/sched_rt_runtime_usDefault: 95% (950000) Range: -1 to (INT_MAX -1) [ -1 implies no limit, i.e., no throttling ]
/proc/sys/kernel/sched_rt_period_usDefault: 1s (1000000) Range: 1 to INT_MAX
Note: See Command Line Reference for the commands for manipulating real-time properties of processes.