Understanding real-time programming

背景

Real-time computing is a key feature of many robotics systems, particularly safety- and mission-critical applications such as autonomous vehicles, spacecrafts, and industrial manufacturing. We are designing and prototyping ROS 2 with real-time performance constraints in mind, since this is a requirement that was not considered in the early stages of ROS 1 and it is now intractable to refactor ROS 1 to be real-time friendly.

This document outlines the requirements of real-time computing and best practices for software engineers. In short:

To make a real-time computer system, our real-time loop must update periodically to meet deadlines. We can only tolerate a small margin of error on these deadlines (our maximum allowable jitter). To do this, we must avoid nondeterministic operations in the execution path, things like: pagefault events, dynamic memory allocation/deallocation, and synchronization primitives that block indefinitely.

A classic example of a controls problem commonly solved by real-time computing is balancing an inverted pendulum. If the controller blocked for an unexpectedly long amount of time, the pendulum would fall down or go unstable. But if the controller reliably updates at a rate faster than the motor controlling the pendulum can operate, the pendulum will successfully adapt react to sensor data to balance the pendulum.

Now that you know everything about real-time computing, let’s try a demo!

Install and run the demo

The real-time demo was written with Linux operating systems in mind, since many members of the ROS community doing real-time computing use Xenomai or RT_PREEMPT as their real-time solutions. Since many of the operations done in the demo to optimize performance are OS-specific, the demo only builds and runs on Linux systems. So, if you are an OSX or Windows user, don’t try this part!

Also this must be built from source using a static DDS API. Currently the only supported implementation is Connext.

First, follow the instructions to build ROS 2 from source using Connext DDS as the middleware.

Run the tests

Before you run make sure you have at least 8Gb of RAM free. With the memory locking, swap will not work anymore.

Source your ROS 2 setup.bash.

Run the demo binary, and redirect the output. You may want to use sudo in case you get permission error:

pendulum_demo > output.txt

What the heck just happened?

First, even though you redirected stdout, you will see some output to the console (from stderr):

mlockall failed: Cannot allocate memory
Couldn't lock all cached virtual memory.
Pagefaults from reading pages not yet mapped into RAM will be recorded.

After the initialization stage of the demo program, it will attempt to lock all cached memory into RAM and prevent future dynamic memory allocations using mlockall. This is to prevent pagefaults from loading lots of new memory into RAM. (See the realtime design article for more information.)

The demo will continue on as usual when this occurs. At the bottom of the output.txt file generated by the demo, you’ll see the number of pagefaults encountered during execution:

rttest statistics:
  - Minor pagefaults: 20
  - Major pagefaults: 0

If we want those pagefaults to go away, we’ll have to…

Adjust permissions for memory locking

Add to /etc/security/limits.conf (as sudo):

<your username>    -   memlock   <limit in kB>

A limit of -1 is unlimited. If you choose this, you may need to accompany it with ulimit -l unlimited after editing the file.

After saving the file, log out and log back in. Then rerun the pendulum_demo invocation.

You’ll either see zero pagefaults in your output file, or an error saying that a bad_alloc exception was caught. If this happened, you didn’t have enough free memory available to lock the memory allocated for the process into RAM. You’ll need to install more RAM in your computer to see zero pagefaults!

Output overview

To see more output, we have to run the pendulum_logger node.

In one shell with your install/setup.bash sourced, invoke:

pendulum_logger

You should see the output message:

Logger node initialized.

In another shell with setup.bash sourced, invoke pendulum_demo again.

As soon as this executable starts, you should see the other shell constantly printing output:

Commanded motor angle: 1.570796
Actual motor angle: 1.570796
Mean latency: 210144.000000 ns
Min latency: 4805 ns
Max latency: 578137 ns
Minor pagefaults during execution: 0
Major pagefaults during execution: 0

The demo is controlling a very simple inverted pendulum simulation. The pendulum simulation calculates its position in its own thread. A ROS node simulates a motor encoder sensor for the pendulum and publishes its position. Another ROS node acts as a simple PID controller and calculates the next command message.

The logger node periodically prints out the pendulum’s state and the runtime performance statistics of the demo during its execution phase.

After the pendulum_demo is finished, you’ll have to CTRL-C out of the logger node to exit.

Latency

At the pendulum_demo execution, you’ll see the final statistics collected for the demo:

rttest statistics:
  - Minor pagefaults: 0
  - Major pagefaults: 0
  Latency (time after deadline was missed):
    - Min: 3354 ns
    - Max: 2752187 ns
    - Mean: 19871.8 ns
    - Standard deviation: 1.35819e+08

PendulumMotor received 985 messages
PendulumController received 987 messages

The latency fields show you the minimum, maximum, and average latency of the update loop in nanoseconds. Here, latency means the amount of time after the update was expected to occur.

The requirements of a real-time system depend on the application, but let’s say in this demo we have a 1kHz (1 millisecond) update loop, and we’re aiming for a maximum allowable latency of 5% of our update period.

So, our average latency was really good in this run, but the maximum latency was unacceptable because it actually exceeded our update loop! What happened?

We may be suffering from a non-deterministic scheduler. If you’re running a vanilla Linux system and you don’t have the RT_PREEMPT kernel installed, you probably won’t be able to meet the real-time goal we set for ourselves, because the Linux scheduler won’t allow you to arbitrarily pre-empt threads at the user level.

See the realtime design article for more information.

The demo attempts to set the scheduler and thread priority of the demo to be suitable for real-time performance. If this operation failed, you’ll see an error message: “Couldn’t set scheduling priority and policy: Operation not permitted”. You can get slightly better performance by following the instructions in the next section:

Setting permissions for the scheduler

Add to /etc/security/limits.conf (as sudo):

<your username>    -   rtprio   98

The range of the rtprio (real-time priority) field is 0-99. However, do NOT set the limit to 99 because then your processes could interfere with important system processes that run at the top priority (e.g. watchdog). This demo will attempt to run the control loop at priority 98.

Plotting results

You can plot the latency and pagefault statistics that are collected in this demo after the demo runs.

Because the code has been instrumented with rttest, there are useful command line arguments available:

Command

Description

Default value

-i

Specify how many iterations to run the real-time loop

1000

-u

Specify the update period with the default unit being microseconds.

Use the suffix “s” for seconds, “ms” for milliseconds,

“us” for microseconds, and “ns” for nanoseconds.

1ms

-f

Specify the name of the file for writing the collected data.

Run the demo again with a filename to save results:

pendulum_demo -f pendulum_demo_results

Then run the rttest_plot script on the resulting file:

rttest_plot pendulum_demo_results

This script will produce three files:

pendulum_demo_results_plot_latency.svg
pendulum_demo_results_plot_majflts.svg
pendulum_demo_results_plot_minflts.svg

You can view these plots in an image viewer of your choice.