Add support for SQLite (was: Are there plans to support sqlite?) #28

liangyongrui · 2020-01-04T13:14:32Z

No description provided.

mehcode · 2020-01-07T08:19:26Z

Definitely. When we say the "Rust SQL Toolkit" our vision is to support all of the SQL databases at some point in the future.

I would say SQLite is next on our list to add.

abonander · 2020-01-07T08:43:12Z

This should stay open until SQLite support is merged.

abonander · 2020-01-07T08:51:39Z

So our idea for making SQLite async is to operate it in WAL (write-ahead log) mode and run the blocking fsync() behind task::blocking().

emilazy · 2020-02-12T17:41:09Z

I think that running SQLite on another thread would probably be a better idea than futzing with the internals of its storage layer to get it to run asynchronously on the main thread; the overhead of one OS thread and the necessary communication should be very small compared to the network communication involved with the other backends, and messing with fsync calls runs the risk of breaking SQLite's carefully-honed data integrity guarantees.

mehcode · 2020-02-12T17:54:15Z

@emilazy It's more an experiment. If it doesn't pan out, we can totally do that. Keep in mind that this is exactly what SQLite recommends you to do though.

NOTE: WAL mode with PRAGMA synchronous set to NORMAL avoids calls to fsync() during transaction commit and only invokes fsync() during a checkpoint operation. The use of WAL mode largely obviates the need for this asynchronous I/O module. Hence, this module is no longer supported. The source code continues to exist in the SQLite source tree, but it is not a part of any standard build and is no longer maintained. This documentation is retained for historical reference.

https://www.sqlite.org/asyncvfs.html

emilazy · 2020-02-12T18:04:23Z

That same document also says:

You lose the Durable property. With the default I/O backend of SQLite, once a write completes, you know that the information you wrote is safely on disk. With the asynchronous I/O, this is not the case. If your program crashes or if a power loss occurs after the database write but before the asynchronous write thread has completed, then the database change might never make it to disk and the next user of the database might not see your change.

I'm not sure if that also extends to using WAL mode in this way – if it does, then I think at the very least there should be a warning about potentially compromised data integrity, since it might only observably fail to pan out after people lose data ^^;

mehcode · 2020-02-12T19:46:33Z

I think it's pretty clear that the rest of the document beyond the heading is defunct.

This documentation is retained for historical reference.

To assuage any concerns, the first iteration of support will most likely "just" use spawn_blocking or equivalent around potentially blocking calls. Next, we'll attempt to mimic PRAGMA synchronous=FULL which places an fsync after each commit. It wouldn't be too hard to support both the blocking and non-blocking styles as there is a very small amount of code we're talking about here.

Keep in mind, this may not even be worth it. I don't have any hard numbers yet to say "yes, doing this is a large enough performance increase".

cljoly · 2020-02-27T12:08:44Z

So our idea for making SQLite async is to operate it in WAL (write-ahead log) mode and run the blocking fsync() behind task::blocking().

If I understand correctly, your main concern is with syncing.

According to the WAL documentation:

Note that with PRAGMA synchronous set to NORMAL, the checkpoint is the only operation to issue an I/O barrier or sync operation

Then, why not operate in WAL mode and:

set PRAGMA synchronous to NORMAL
disable auto-checkpoint
run checkpoint from the application, for instance in the task::blocking() you mention?

This way, no fsync appens outside of sqlx control. And it looks like there is good support for this kind of modification. From the same page:

Applications using WAL do not have to do anything in order to for these checkpoints to occur. But if they want to, […] they can turn off the automatic checkpoints and run checkpoints during idle moments or in a separate thread or process

To implement the run checkpoint from the application, this section recommends to use PASSIVE checkpoint mode. As outlined, this would be as “easy” as registering a hook triggered after each commit, so that sqlx decide whether checkpointing is required and if so, manually call, in task::blocking(), the function sqlite uses automatically when auto-checkpointing is enabled.

I hope my explanations are clear.

mamcx · 2020-03-03T20:56:59Z

Nope, this is treat sqlite as nosql. Data integrity is paramount in rdbms. As a user, I must NOT need to find that the sqlite driver do any kind of hack.

cljoly · 2020-03-03T21:34:13Z

If I understand the documentation correctly, the only thing that threatens data integrity would be setting PRAGMA synchronous to NORMAL. This would not even corrupt the database, but it is true that transaction could be reversed.
Do you @mamcx think the rest of what I outline threatens integrity?

abonander · 2020-03-03T21:54:16Z

For our initial implementation we were going to checkpoint after every query execution, so data can only be lost from the current operation before control is returned to the application, which is already a normal concern with SQLite.

Honestly, if you want perfect durability you should not be using SQLite anyway.

gortiz · 2020-03-04T07:13:57Z

Why don't you let the user to decide when to checkout and vacuum, which mode he/she wants to use, etc?

abonander · 2020-03-04T07:32:13Z

I fully agree, but that's something that can be discussed after the initial implementation is done.

mehcode · 2020-03-15T02:56:34Z

Support for SQLite is here:

#129

After experimenting with various methods, I've decided to execute all blocking operations on a worker thread. spawn_blocking or similar APIs are unacceptable for our situation because while a SQLite connection can be sent between threads, a single SQL program must be iterated/stepped continuously on the same thread. The iterative nature of SQLx means that this is impossible to guarantee without having our own worker thread.

Most of the SQLite API can be ran on the "current" thread. In effect, it's just the sqlite3_step API that gets pushed to the worker thread ( this is called when iterating rows on a query, for example ).

Initial testing seems to hold up fine even with high concurrency. I'd love assistance with additional testing. It's definitely possible I'm missing something.

@abonander I'm going to need your help for making sure the macro support for SQLite is done correctly. I added what I think should make it work.

We probably should extend test_type! to also generate tests that use the macros.

* Add Docker build CI to pull requests * Add SQLite document persistence through SQLx * Update README to describe configuration variables * Minor changes to README wording * Update image size estimate listed in README * Update frontend dependencies * Add direct database tests and restructure code * Clarify use of `SQLITE_URI` in Docker contexts

mehcode added enhancement New feature or request question Further information is requested and removed question Further information is requested labels Jan 7, 2020

liangyongrui closed this as completed Jan 7, 2020

abonander reopened this Jan 7, 2020

mehcode changed the title ~~Are there plans to support sqlite?~~ Add support for SQLite (was: Are there plans to support sqlite?) Jan 7, 2020

mehcode mentioned this issue Mar 15, 2020

Add support for SQLite #129

Merged

mehcode closed this as completed in #129 Mar 17, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for SQLite (was: Are there plans to support sqlite?) #28

Add support for SQLite (was: Are there plans to support sqlite?) #28

liangyongrui commented Jan 4, 2020

mehcode commented Jan 7, 2020

abonander commented Jan 7, 2020

abonander commented Jan 7, 2020

emilazy commented Feb 12, 2020

mehcode commented Feb 12, 2020

emilazy commented Feb 12, 2020

mehcode commented Feb 12, 2020

cljoly commented Feb 27, 2020 •

edited

Loading

mamcx commented Mar 3, 2020

cljoly commented Mar 3, 2020

abonander commented Mar 3, 2020 •

edited

Loading

gortiz commented Mar 4, 2020

abonander commented Mar 4, 2020

mehcode commented Mar 15, 2020

Add support for SQLite (was: Are there plans to support sqlite?) #28

Add support for SQLite (was: Are there plans to support sqlite?) #28

Comments

liangyongrui commented Jan 4, 2020

mehcode commented Jan 7, 2020

abonander commented Jan 7, 2020

abonander commented Jan 7, 2020

emilazy commented Feb 12, 2020

mehcode commented Feb 12, 2020

emilazy commented Feb 12, 2020

mehcode commented Feb 12, 2020

cljoly commented Feb 27, 2020 • edited Loading

mamcx commented Mar 3, 2020

cljoly commented Mar 3, 2020

abonander commented Mar 3, 2020 • edited Loading

gortiz commented Mar 4, 2020

abonander commented Mar 4, 2020

mehcode commented Mar 15, 2020

cljoly commented Feb 27, 2020 •

edited

Loading

abonander commented Mar 3, 2020 •

edited

Loading