Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use mmap on MARF connections #2900

Merged
merged 25 commits into from
Oct 27, 2021
Merged

Use mmap on MARF connections #2900

merged 25 commits into from
Oct 27, 2021

Conversation

jcnelson
Copy link
Member

@jcnelson jcnelson commented Oct 26, 2021

This PR against develop activates the mmap pragma on all sqlite connections. In support of this, this PR also vendors the
ctrlc crate we had been using, and extends it to handle a SIGBUS signal.

Sqlite may trigger SIGBUS signals if the underlying database file is mmap'ed and becomes unavailable at runtime (e.g. suppose it's on a network drive and the network goes down). SIGBUS is only triggered on an attempt to read the unavailable file; an attempt to write to an invalid address that was mapped will (correctly) trigger a SIGSEGV and lead to a crash. This PR makes it so that the node treats SIGBUS like SIGTERM, SIGINT, and SIGHUP -- it triggers a graceful shutdown.

I'm open to making the node simply crash with a panic as well. In fact, I think that it would be preferable if the process synchronously terminated on SIGBUS. But, I'd like confirmation that this is desired before making this happen (since it can lead to chainstate corruption).

I chose to vendor ctrlc because (a) it's a pretty stable crate at this point -- it's only been receiving PRs to update dependency versions -- and (b) it's very simple, especially compared to alternative signal handler crates, and it already does 99% of what we need.

@gregorycoppola
Copy link
Contributor

Hey Jude.. Thanks for a fast change.

How are you testing this?

src/util/db.rs Outdated Show resolved Hide resolved
@kantai kantai linked an issue Oct 26, 2021 that may be closed by this pull request
@jcnelson
Copy link
Member Author

jcnelson commented Oct 26, 2021

So, there is one thing that warrants further investigation here -- a thread that triggers SIGBUS needs to be terminated immediately. Not at a rendezvous or cancellation point, but at the CPU instruction which caused SIGBUS. This is because SIGBUS is triggered by an unaligned memory access, or in mmap(2)'s case, an attempt to read from a file-backed page whose underlying file data is absent. This can arise if you truncate the file while it's mmap(2)ed, or if the file is deleted, or rendered unavailable through a network partition. In all cases, it's Bad News Bears -- something is irrevocably corrupt. The only point of trying to exit gracefully is to avoid corrupting the other databases, so we can do a postmortem.

The reason we have to be this strict is because of how SIGBUS works. When the hardware raises the page fault to the kernel, the kernel will suspend the executing task at the offending CPU instruction, set up and run the signal handler right then and there (i.e. SIGBUS is handled synchronously in the program execution), and on signal handler return, attempt to re-run the offending CPU instruction. So if the thread's execution isn't terminated immediately, we'd be setting ourselves up for an infinite loop -- the offending thread will re-attempt to load data from an unbacked page, triggering another SIGBUS, causing the signal handler to run and exit, causing the same offending instruction to be re-run, over and over forever. The offending thread will never reach a cancellation point, nor will an attempt to join with it work. So, while the other threads will gracefully terminate, the offending thread never will.

There are a couple ways we can address this:

  • We can represent the halting state of the threads in some kind of async-safe global that the signal handler can inspect, so that once the other non-SIGBUS-triggering threads have all died, the signal handler can abort(2).

  • We can keep track of the file handles to all sqlite databases, and on SIGBUS, we execve(2) the same process image but with an environment variable set that causes main() to gracefully close the sqlite databases before halting.

Let me know if not crashing-and-burning is still something we want to do here.

@kantai
Copy link
Member

kantai commented Oct 26, 2021

Let me know if not crashing-and-burning is still something we want to do here.

I think crash-and-burn is the preferred behavior here. If possible, though, the node operator should be able to figure out that there was a system I/O error that led to the crash.

In terms of testing, I think this PR needs two things:

  1. Some amount of testing of the signal handlers -- the vendoring of ctrlc is fine, but we need to test it, both the original tests, and the modifications.
  2. It looks like the vendoring of ctrlc broke the libclarity build. That needs to be fixed.

@gregorycoppola
Copy link
Contributor

Hey @jcnelson , I see I was re-added for review, but my question last time was about the testing plan.

Are you able to recreate these problem cases by interacting with the server? Or in tests?

@gregorycoppola
Copy link
Contributor

gregorycoppola commented Oct 26, 2021

Oh whoops.. I thought I was re-added but I guess GitHub just leaves me with the "yellow dot" status if I just add comments at the bottom. :/ Never mind my last comment.

Copy link
Contributor

@gregorycoppola gregorycoppola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to try this again (sorry for the spam)..

I'm wondering how we are going to test this.

@jcnelson
Copy link
Member Author

Some amount of testing of the signal handlers -- the vendoring of ctrlc is fine, but we need to test it, both the original tests, and the modifications. @kantai

I'm unaware of a safe way to test signal handlers that doesn't also break the test runner, but I'll try. I could make it so that there's test-specific paths in the signal handler that causes it to just set a global somewhere instead of crashing the process, but that doesn't really test the "crash-and-burn" property.

@jcnelson
Copy link
Member Author

Okay, I updated the tests to run the original ctrlc tests. But, we can only set the signal handler once in the test runner's execution lifetime, so I think we should just leave it at that. I've manually tested that the node will print out what kind of signal caused it to die.

Copy link
Member

@kantai kantai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Before merging, can you add an entry to the CHANGELOG.md?

@jcnelson
Copy link
Member Author

Thanks; added.

Copy link
Contributor

@gregorycoppola gregorycoppola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the change!

@jcnelson jcnelson merged commit 75f6328 into develop Oct 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use MMAP for SQLite MARF connections
3 participants