Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PeriodicThread accumulates error over time #2488

Closed
PeterBowman opened this issue Feb 5, 2021 · 12 comments · Fixed by #2515
Closed

PeriodicThread accumulates error over time #2488

PeterBowman opened this issue Feb 5, 2021 · 12 comments · Fixed by #2515
Labels
Affects: YARP v3.4.2 This is a known issue affecting YARP v3.4.2 Component: Library - YARP_os Type: Breaking/Behaviour Change Involves breaking user code or behaviour

Comments

@PeterBowman
Copy link
Member

PeterBowman commented Feb 5, 2021

Describe the bug
yarp::os::PeriodicThread wraps a callback method (run()) executed repeatedly in a separate thread given a fixed period. It accounts for the time elapsed while processing this callback so that the thread is slept until the next iteration without exceeding said period, i.e. sleepTime = fixedPeriod - elapsedTime. However, I have noticed a drift is introduced due to inaccuracies in estimating the sleep time.

To Reproduce
The following program instantiates a PeriodicThread with a fixed period of 50 ms. On each step, it prints the current iteration, current real time, and expected time.

cmake_minimum_required(VERSION 3.12 FATAL_ERROR)
project(yarp-timer LANGUAGES CXX)
find_package(YARP 3.4 REQUIRED COMPONENTS os)
add_executable(${PROJECT_NAME} main.cpp)
target_link_libraries(${PROJECT_NAME} YARP::YARP_os YARP::YARP_init)
#include <yarp/os/Log.h>
#include <yarp/os/Network.h>
#include <yarp/os/PeriodicThread.h>
#include <yarp/os/SystemClock.h>

struct Worker : public yarp::os::PeriodicThread
{
    Worker() : yarp::os::PeriodicThread(0.05, yarp::os::ShouldUseSystemClock::Yes) {}

    bool threadInit() override
    {
        initial = yarp::os::SystemClock::nowSystem();
        return true;
    }

    void run() override
    {
        auto real = yarp::os::SystemClock::nowSystem();
        auto expected = initial + getPeriod() * getIterations();
        auto diff = real - expected;
        yInfo("[%d] real: %.6f, expected: %.6f, diff: %.6f", getIterations(), real, expected, diff);
    }

    double initial;
};

int main()
{
    yarp::os::Network yarp(yarp::os::YARP_CLOCK_SYSTEM);

    Worker worker;
    worker.start();

    while (worker.isRunning()) yarp::os::SystemClock::delaySystem(0.1);
    return 0;
}

Output showing first few iterations, and ~1000:

[INFO] [0] real: 1612547702.942900, expected: 1612547702.942871, diff: 0.000029
[INFO] [1] real: 1612547702.993062, expected: 1612547702.992871, diff: 0.000190
[INFO] [2] real: 1612547703.043190, expected: 1612547703.042871, diff: 0.000319
...
[INFO] [999] real: 1612548315.224370, expected: 1612548315.077567, diff: 0.146802
[INFO] [1000] real: 1612548315.274484, expected: 1612548315.127567, diff: 0.146917
[INFO] [1001] real: 1612548315.324602, expected: 1612548315.177567, diff: 0.147035

As you can see, the timer has drifted away ~0.15 seconds, i.e. 3 times the period, over 1000 iterations (50 seconds). (edit: initial timestamps correspond to a previous run, sorry, in any case the results were consistent)

Expected behavior
Certain applications require strict timing using a fixed period, e.g. xPDO cyclic synchronous communication in CANopen standard. I expect that yarp::os::PeriodicThread (also used by yarp::os::Timer under the hood) calls run at exactly period*N_iter seconds from start, no matter how long it has been running. We've noticed this problem because a timer implemented by hand as shown below does the trick (for simple cases):

import yarp

yarp.Network.init()

start = yarp.now()
i = 1

while True:
    now = yarp.now()
    sinceStart = now - start
    print(i, now)
    yarp.delay(0.05 * i - sinceStart)
    i = i + 1

I am therefore proposing to set an initial static time reference from which to substract on each iteration.

Screenshots
In this plot two sequences are depicted: red is for a correct fixed period as in the above Python snippet, and blue for a faulty PeriodicThread. Y-values should come in blue-red pairs at constant offset (initial instants differ), but a drift in X-values (timestamps) is noticeable. Zoom in/out (double click) and pan the graph to check how blue points are shifted along the entire time domain.

Screenshot_2021-02-05 Online Graph Maker · Plotly Chart Studio

Configuration

  • OS: Ubuntu 20.04.2 (focal)
  • yarp version: 3.4.2+20-20210204.2+gitb966f9f29
  • compiler: gcc 9.3.0
@traversaro
Copy link
Member

cc @isorrentino this could explain the drift we saw today in velocity data

@traversaro
Copy link
Member

cc @GiulioRomualdi @prashanthr05 @S-Dafarra

@PeterBowman
Copy link
Member Author

Oops, the console output I pasted in the description must be mixing two different runs, if you compare start and end timestamps. In any case the total drift was indeed 0.15 s. I have just tested this out of curiosity on WSL 2 / Ubuntu focal / YARP 3.4.2+2-20210119.16+git444e855e9 and got a drift of 0.67 seconds in 1000 iterations (50 seconds total):

[INFO] [0] real: 1612560409.487951, expected: 1612560409.487911, diff: 0.000040
[INFO] [1] real: 1612560409.538794, expected: 1612560409.537911, diff: 0.000883
[INFO] [2] real: 1612560409.589008, expected: 1612560409.587911, diff: 0.001097
...
[INFO] [999] real: 1612560460.103411, expected: 1612560459.437911, diff: 0.665500
[INFO] [1000] real: 1612560460.154335, expected: 1612560459.487911, diff: 0.666425
[INFO] [1001] real: 1612560460.204613, expected: 1612560459.537911, diff: 0.666703

@PeterBowman
Copy link
Member Author

PeterBowman commented Feb 9, 2021

PeriodicThread can use either a system or a network clock for measuring elapsed time per step:

nowFunc(useSystemClock == ShouldUseSystemClock::Yes ? SystemClock::nowSystem : yarp::os::Time::now),

void step()
{
lock();
currentRun = nowFunc();
if (scheduleReset) {
_resetStat();
}
if (count > 0) {
double dT = (currentRun - previousRun);
sumTSq += dT * dT;
totalT += dT;
if (adaptedPeriod < 0) {
adaptedPeriod = 0;
}
estPIt++;
}
previousRun = currentRun;
unlock();
if (!suspended) {
owner->run();
}
// At the end of each run of updateModule function, the thread is supposed
// to be suspended and release CPU to other threads.
// Calling a yield here will help the threads to alternate in the execution.
// Note: call yield BEFORE computing elapsed time, so that any time spent due to
// yield is took into account and the sleep time is correct.
yield();
lock();
count++;
double elapsed = nowFunc() - currentRun;
//save last
totalUsed += elapsed;
sumUsedSq += elapsed * elapsed;
unlock();
sleepPeriod = adaptedPeriod - elapsed; // everything is in [seconds] except period, for it is used in the interface as [ms]
delayFunc(sleepPeriod);
}

Question: shouldn't nowFunc always use system clock? I'm dealing with a network clock which sends timestamps each 5 milliseconds for thread synchronization, therefore there is a PeriodicThread somewhere expecting to match this period. However, almost every second step is missed (i.e. I'm effectively seeing a period of 10 ms) since the delay nearly matches sync period: currentStamp - lastStamp == 0.005. I'd expect the internal delay to be adjusted according to the reality of this PeriodicThread, instead of depending on my external custom clock which is an artifice for synchronizing some other threads out in the wild.

@drdanz
Copy link
Member

drdanz commented Feb 10, 2021

@PeterBowman Thanks for the analysis and the graphs.
This is somehow expected from the fact that sleepTime = fixedPeriod - elapsedTime is used... I

I remember discussing this with someone a long time ago, but I've been told that this is the way it is supposed to be, because the PeriodicThread (probably RateThread at the time) should ensure that at least period seconds should have passed since the last execution.
Probably for some applications this makes sense, for other it does not, I honestly don't know that. What does people who use it think?

If there are actually 2 different use cases, it shouldn't be difficult to add an extra parameter to the constructor to switch between relative and absolute sleep time.

@pattacini
Copy link
Member

What does people who use it think?

I would very much like to use the other implementation, honestly.
In real-time control, the current implementation doesn't make sense.

@traversaro
Copy link
Member

A small clarification @PeterBowman : in original post #2488 (comment) were you using a NetworkClock or not? If not, how is #2488 (comment) is related to the original problem?

@drdanz
Copy link
Member

drdanz commented Feb 10, 2021

Question: shouldn't nowFunc always use system clock?

Sorry, I'm not sure to understand the reason why you would want this...
You can already choose to use the system clock (ShouldUseSystemClock::Yes) or to use the clock configured in the Network initialization (ShouldUseSystemClock::No), i.e. the system clock or the network (or custom) clock depending on the configuration
The first case is useful in case you want a thread that run at a real-world-time speed, the second is depending on whether you are using a real robot, or a simulation.

Using the system clock in a simulation, means that if the system clock is faster than the network clock, the thread would be resumed, it would check if enough time has passed (network time, not real time, which is not passed yet), and therefore go back to sleep for several times.
It also means that if the network clock is faster than the real one, it will be resumed too late.

Instead, the network clock is used, and (if I remember correctly) the network clock implementation checks which threads should be resumed whenever a new clock signal is received. Therefore this will adjust the actual sleep time properly, according to the simulation.

I'd expect the internal delay to be adjusted according to the reality of this PeriodicThread, instead of depending on my external custom clock which is an artifice for synchronizing some other threads out in the wild.

I'm not sure to understand this, are you using the network clock or a custom class that derives from yarp::os::Clock?

@drdanz drdanz added Affects: YARP v3.4.2 This is a known issue affecting YARP v3.4.2 Component: Library - YARP_os Type: Breaking/Behaviour Change Involves breaking user code or behaviour labels Feb 10, 2021
@PeterBowman
Copy link
Member Author

PeterBowman commented Feb 10, 2021

(...) the PeriodicThread (probably RateThread at the time) should ensure that at least period seconds should have passed since the last execution.

If there are actually 2 different use cases, it shouldn't be difficult to add an extra parameter to the constructor to switch between relative and absolute sleep time.

To my best understanding, the fact that the current implementation attempts to account for elapsed time tells me it was probably designed to ensure that exactly period seconds have passed, and that it stays so even after a longer while. Perhaps most applications are not affected by this drift, but the time-critical ones will definitely suffer inaccuracies. If PeriodicThread is able to comply with my real-time use case, it should also be valid for applications that don't care about drifts, therefore I think that having explicit support (via constructor) for those two scenarios would be overkill.

Real-time control aside, I find the PeriodicThread/Timer mechanisms especially appealing because they are a nicer alternative to a simple loop with a hardcoded delay inside, even if I don't actually need to place my repeatable code in a separate thread. It would be a pity if a hand-crafted for-loop is more accurate and a better fit than these two classes.

A small clarification @PeterBowman : in original post #2488 (comment) were you using a NetworkClock or not? If not, how is #2488 (comment) is related to the original problem?

@traversaro I'm sorry, my previous comment (network clock + elapsed time) is not directly related to the original description (drifts, using system clock). I have posted it in case their respective solutions conflict with each other and thus they should be considered as a whole.

@drdanz let me prepare a sample app to illustrate/double-check my point. Shortly, I believe the behavior you describe in this sentence is actually the opposite to my experience:

It also means that if the network clock is faster than the real one, it will be resumed too late.

@PeterBowman
Copy link
Member Author

PeterBowman commented Feb 10, 2021

In the following example I build two executables: the clock app which starts a network clock with a refresh rate of 5 milliseconds, and the reader app which iterates over a loop with the exact same period. Real use case and rationale: clock publishes a sync event over the network and an external controller (reader) catches up, acts accordingly and sends a setpoint back to the robot, awaiting the next synchronization (roboticslab-uc3m/yarp-devices#254).

CMakeLists.txt
cmake_minimum_required(VERSION 3.12 FATAL_ERROR)
project(yarp-clock LANGUAGES CXX)
find_package(YARP 3.4 REQUIRED COMPONENTS os)

add_executable(clock main-clock.cpp)
target_link_libraries(clock YARP::YARP_os YARP::YARP_init)

add_executable(reader main-reader.cpp)
target_link_libraries(reader YARP::YARP_os YARP::YARP_init)

set_target_properties(clock reader PROPERTIES COMPILE_DEFINITIONS CLOCK_PERIOD=0.005)
main-clock.cpp
#include <cmath>

#include <yarp/os/Bottle.h>
#include <yarp/os/BufferedPort.h>
#include <yarp/os/Network.h>
#include <yarp/os/SystemClock.h>

int main()
{
    yarp::os::Network yarp(yarp::os::YARP_CLOCK_SYSTEM);

    yarp::os::BufferedPort<yarp::os::Bottle> port;
    port.open("/myclock");

    auto count = 0;
    const auto initial = yarp::os::SystemClock::nowSystem();

    while (true)
    {
        // Publish current system time (network time for listeners).
        auto now = yarp::os::SystemClock::nowSystem();
        int secs = std::trunc(now);
        int nsecs = std::trunc((now - std::trunc(now)) * 1e9);
        port.prepare() = {yarp::os::Value(secs), yarp::os::Value(nsecs)};
        port.write();

        // Compute delay for current step. Use initial time as reference
        // and get current time again to account for elapsed time in this step.
        auto delay = initial + CLOCK_PERIOD * ++count - yarp::os::SystemClock::nowSystem();
        yarp::os::SystemClock::delaySystem(delay);
    }

    return 0;
}
main-reader.cpp
#include <yarp/os/LogStream.h>
#include <yarp/os/Network.h>
#include <yarp/os/SystemClock.h>
#include <yarp/os/Time.h>

int main()
{
    yarp::os::Network yarp;
    yarp::os::Time::useNetworkClock("/myclock");

    int count = 0;

    while (count++ < 2000)
    {
        auto now = yarp::os::SystemClock::nowSystem();
        yInfo("[%d] Network time: %f", count, yarp::os::Time::now());
        auto elapsed = yarp::os::SystemClock::nowSystem() - now;
        yarp::os::Time::delay(CLOCK_PERIOD - elapsed);
    }

    return 0;
}

The network clock caches the last timestamp received via "/myclock", so any call to Time::now() within the reader loop will give me the exact same value as long as I don't perform any CPU-intensive operation which takes more than 5 milliseconds. Regarding Time::delay, the thread in reader is blocked until the next timestamp reported by the network clock is currTime > prevTime + delay (ref). It doesn't actually matter when the delay function is called nor how great it is as long as it's less than the period of the network clock. The thread will be unblocked at the same instant regardless of using Time::delay(0.0049), Time::delay(0.004) or Time::delay(0.001). Likewise, I could place a SystemClock::delaySystem(0.001) or SystemClock::delaySystem(0.004) before the delay with the same result.

Now, please note I am accounting for real (system clock-driven) elapsed time in the delay, thus substracting a few microseconds from the clock period. Had I used Time::delay(CLOCK_PERIOD) instead (exactly 0.005 ms), this loop would have skipped every second step because of the currTime > prevTime + delay rendering false on non-consecutive checks. You need to substract a tiny bit from the delay, and here elapsed time plays a big role.

Since for-loops are ugly, I would mostly like to rewrite reader using a Timer or PeriodicThread. See:

main-reader.cpp (now with a PeriodicThread)
#include <yarp/os/LogStream.h>
#include <yarp/os/Network.h>
#include <yarp/os/PeriodicThread.h>
#include <yarp/os/SystemClock.h>
#include <yarp/os/Time.h>

class Worker : public yarp::os::PeriodicThread
{
public:
    Worker() : yarp::os::PeriodicThread(CLOCK_PERIOD) {}

protected:
    void run() override
    {
        yInfo("[%d] Network time: %f, period: %f, used: %f",
            getIterations() + 1, yarp::os::Time::now(), getEstimatedPeriod(), getEstimatedUsed());
    }
};

int main()
{
    yarp::os::Network yarp;
    yarp::os::Time::useNetworkClock("/myclock");

    Worker worker;
    worker.start();

    while (worker.getIterations() < 2000) yarp::os::SystemClock::delaySystem(0.1);

    worker.stop();
    return 0;
}

Output:

[INFO] [1] Network time: 1612991675.067073, period: 0.000000, used: 0.000000
[INFO] [2] Network time: 1612991675.077071, period: 0.009999, used: 0.000000
[INFO] [3] Network time: 1612991675.087079, period: 0.010003, used: 0.000000
[INFO] [4] Network time: 1612991675.097047, period: 0.009992, used: 0.000000
[INFO] [5] Network time: 1612991675.107070, period: 0.009999, used: 0.000000
[INFO] [6] Network time: 1612991675.112074, period: 0.009000, used: 0.000000
[INFO] [7] Network time: 1612991675.121995, period: 0.009154, used: 0.000000
[INFO] [8] Network time: 1612991675.127070, period: 0.008571, used: 0.000000
[INFO] [9] Network time: 1612991675.137071, period: 0.008750, used: 0.000000
...

Almost every second step is missed because of not accounting for real elapsed time: see how reported period is nearly 0.01 milliseconds instead. Used (elapsed) time is zero since internal calls to Time::now return the same timestamp, as explained earlier. Actual 5 ms steps (see [6] and [8]) are very uncommon and mostly due to casual rounding.

After applying the fix proposed in PeterBowman@77f41d3:

[INFO] [1] Network time: 1612992417.691294, period: 0.000000, used: 0.000000
[INFO] [2] Network time: 1612992417.696331, period: 0.001006, used: 0.000083
[INFO] [3] Network time: 1612992417.701334, period: 0.003012, used: 0.000077
[INFO] [4] Network time: 1612992417.706298, period: 0.003670, used: 0.000079
[INFO] [5] Network time: 1612992417.711333, period: 0.003998, used: 0.000076
[INFO] [6] Network time: 1612992417.716314, period: 0.004197, used: 0.000076
[INFO] [7] Network time: 1612992417.721334, period: 0.004340, used: 0.000075
[INFO] [8] Network time: 1612992417.726284, period: 0.004417, used: 0.000074
[INFO] [9] Network time: 1612992417.731332, period: 0.004498, used: 0.000072
...

Misses still may occur rarely (my real use case involves using a real timer instead of a loop over the network clock, and doing a bit more work in the reader thread so that elapsed time will be noticeable), but my point here is that elapsed time is making a difference. Besides, stats work again (getEstimatedUsed).

@drdanz
Copy link
Member

drdanz commented Feb 11, 2021

I'm not sure about this. The catch here is that the thread is not actually performing a delay, but it is blocked, waiting for a signal from the network clock thread. Whenever it receives that signal, the thread restarts without checking if the time is actually passed.

This is the theoretic working principles.
Let's assume that we have a network clock with period 1:

  • The local network clock thread (C) receives time X
  • PeriodicThread (P) starts and requests the time, the clock returns X
  • P ends its task and requests the time, the clock returns X [*]
  • P requests a delay of period (1) - (start (X) - now (X)) i.e. requests a delay of 1.
  • C receives the delay, checks current time (still X [*]), and stores that he should send a signal to the thread at X + 1
  • C receives time X + 1. It checks which threads are sleeping and signals them if necessary (i.e. if T - now < 1e-12). This means P is signaled, since (X - 1) - (X - 1) = 0 < 1e-12
  • P performs another step

[*] assuming that the clock did not receive a new time in the meanwhile which for a task that is shorter than the period should happen most of the times

If this does not happen, then there is probably a bug in the implementation, but the system time does not have any role here.

I think you are assuming that the system time and the network time can be replaced because your clock is running exactly at the system time, but you should consider that this is not the only use case. The network time can be faster or slower than the system time (which is totally expected in a simulation). Actually, this is a very specific use case since you also want the period to be exactly the same as the network clock, and you have a task that takes a very small time.

You should make some easy experiment, changing the period in the thread, adding a delay in your run() and adding a multiply factor to your clock, i.e. something like

        // Publish current system time (network time for listeners).
        static constexpr double realtimefactor = 0.5; // Or 2 for a clock that is faster than real time
        auto now = yarp::os::SystemClock::nowSystem() * realtimefactor;
        int secs = std::trunc(now);
        int nsecs = std::trunc((now - std::trunc(now)) * 1e9);
        port.prepare() = {yarp::os::Value(secs), yarp::os::Value(nsecs)};
        port.write();

For example, assuming realtimefactor = 0.5 and that the real world time is sampled at times 1, 2, 3, 4, 5, the clock will publish respectively 0.5, 1, 1.5, 2, 2.5.
If you request a period of "1" (2 times the network clock frequency), and assuming that the thread is running at system clock 2+t1 (= network clock 1), the thread will save

Network Clock System Clock
start 1 2+t1
end 1 2+t1+t2

Therefore if you are using a network clock you will request a delay of 1 - (1 - 1) = 1, i.e. the thread should be signalled by the network clock when "2" is received (as expected).
with the system clock you are requesting a delay of 1 - (2+t1 - 2+t1+t2) = 1 - t2. This is ok only if t2 <= 0.5, since a bigger value will make the thread request for a delay < 0.5, and the clock will wake the process when it receives 1.5 (and then at 2 at the next run, 2.5,...), causing the thread period to be 2 times the period requested.


TL;DR: I think that there might be be a bug in current implementation, but using the system clock here is wrong.

@PeterBowman
Copy link
Member Author

PeterBowman commented Feb 11, 2021

Thank you very much for taking your time to review this, @drdanz, I agree with you. I have realized my assumptions were incomplete in two fundamental ways:

  • The period of my local periodic thread is equal to the period (a.k.a. publication rate since network and clock times move at the same pace; see next point for a different scenario) of a remote network clock. This is an edge case. A resolution (if this term is applicable here) of 5 ms is insufficient to account for a meaningful elapsed time (negligible in comparison) unless it takes more than this established period to complete the task, which normally would never be the case. I was distracted by the fact that getEstimatedUsed was returning a zero value, thus considering it had been rendered pointless. That doesn't hold true if the network clock is sampled at much higher rates, thus updating the last timestamp queried by now() several times per step, then actually accounting for the elapsed time (with a tiny error due to resolution). In a similar vein, I was abusing of the fact that NetworkClock::delay() involves blocking and releasing a thread, which is the behavior I was looking for in my use case, but its actual purpose of delaying the execution for an expected amount of time was being neglected in my example because of the slow sampling rate.
  • Remote network clock and local system clock advance at the same pace. Now I see it that using a system clock doesn't make sense because of mixing two potentially different time units, especially in the simulated scenario. In this operation, now() and seconds must share the same magnitude. If seconds is expressed in a different scale, the addition is plainly wrong.

I am pursuing a specific functionality which the NetworkClock seemed to offer (blocking a thread until a remote signal arrives), but I have actually misused it as explained above. The easiest way for me to tackle this is through subscription+callback (e.g. via PortReader) to the remote port, which would no longer be treated as a network clock, but a synchronization signal instead, optionally associated to a timestamp. However, I feel this should be discussed elsewhere.

If this does not happen, then there is probably a bug in the implementation, but the system time does not have any role here.

The (X - 1) - (X - 1) = 0 < 1e-12 operation is subject to how precisely new timestamps (X) are acquired by successive iterations in the clock loop. I'm not surprised that the condition is false if network clock period and loop delay are exactly the same. My expectation was that elapsed time, if accounted for and substracted from the delay, would have made the operation return less than zero and therefore released the semaphore. This is an edge case, though, and I refrain now from doing it this way.

I brought up this question because I feel something around elapsed time should be done while fixing the original problem (drifts). In fact, I didn't need to account for it while estimating the delay that drives the loop in main-clock.cpp. So perhaps something like this might work (ref):

delayFunc(initialTime + period * count - nowFunc());

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Affects: YARP v3.4.2 This is a known issue affecting YARP v3.4.2 Component: Library - YARP_os Type: Breaking/Behaviour Change Involves breaking user code or behaviour
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants