-
Notifications
You must be signed in to change notification settings - Fork 4.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure flush callback gets called in move-assign operator #3232
Conversation
When using async with discard policy, slots get overrun using move assignments when the queue is full rather than destructed.. So, like I mentioned, the dtor won’t be called, and the loop may be stuck forever. |
That's not how move assignment/constructors work. To be able to move at all, one needs two objects: the moved-from and moved-to. The move assignment/constructor then (usually) moves variables between members, but doesn't change the fact that there are two objects and thus two destructor calls. Please take a look at this godbolt example, which showcases how many destructor calls happen with move assignments. Perhaps you're confusing moves with placement new? Placement new does indeed not call destructors, but as far as I can see, spdlog does not use placement new. |
I meant that the destruction of the items in the queue aren't called when overrun happens, so I guess this is the purpose of the swap in this PR - to call the callback of the overrun item in the destructor inside Anyway, seems to get stuck. See the following example. After few runs it get stuck forever as one of the threads won't join: #include <chrono>
#include "spdlog/spdlog.h"
#include "spdlog/async.h"
#include "spdlog/../../tests/test_sink.h"
int main(int, char *[]) {
const size_t queue_size = 100;
const int n_threads = 8;
auto thread_pool = std::make_shared<spdlog::details::thread_pool>(queue_size, 1);
auto test_sink = std::make_shared<spdlog::sinks::test_sink_mt>();
test_sink->set_delay(std::chrono::milliseconds(2));
test_sink->set_level(spdlog::level::trace);
auto logger = std::make_shared<spdlog::async_logger>(
"as", test_sink, thread_pool, spdlog::async_overflow_policy::overrun_oldest);
logger->set_level(spdlog::level::trace);
std::vector<std::thread> threads;
std::atomic<int> started_threads(0);;
std::atomic<int> done_threads(0);;
for (int i = 0; i < n_threads; i++) {
threads.emplace_back([&] {
started_threads++;
fmt::print("Thread {}/{} started\n", started_threads.load(), n_threads);
for(int iter=0; iter < 10; iter++) {
for (int j = 1; j < queue_size ; ++j) {
logger->info("Async message #{}", j);
}
logger->flush();
}
printf("Thread %d/%d done\n", ++done_threads, n_threads);
});
};
// join the threads
for (auto &t : threads) {
t.join();
}
printf("Done all threads\n");
} |
Ah gotcha. Yes, that is why the swap is needed.
I can reproduce the getting stuck locally. I also get the stuck behaviour on commit ee16895, the one before my changes were merged. I'm not surprised, as As for why, I'm not entirely sure. When running the thread sanitizer, I get data races on the
I notice that
|
I merged since it get stuck even the previous commits. I think I will change the flush behavior to be async again (as it was versions of spdlog prior to 1.14.0). This is getting too complicated and unstable. Perhaps just wait for queue to be empty before and after sending the flush request. Most users are just interested in that anyway - that all items in the the queue got processed after flush returns |
Yeah, I was thinking about proposing just removing the synchronous behaviour, but I don't know the use cases. Another option is to provide message id's that can be polled by users. Just a monotonically increasing number that almost never wraps and the highest processed id can be requested from the logger or thread pool somehow, |
Reverted everything (#3235) to make flush fully async again. |
While this change makes the condition_variable more reliable, without the 100 ms loop results in spdlog-utests getting stuck sometimes. With the
wait_for
loop, the multithreaded flush test, which inserts 10240 messages of which most get overridden, continues to work.