Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for SQLite (was: Are there plans to support sqlite?) #28

Closed
liangyongrui opened this issue Jan 4, 2020 · 14 comments · Fixed by #129
Closed

Add support for SQLite (was: Are there plans to support sqlite?) #28

liangyongrui opened this issue Jan 4, 2020 · 14 comments · Fixed by #129
Labels
enhancement New feature or request

Comments

@liangyongrui
Copy link

No description provided.

@mehcode mehcode added enhancement New feature or request question Further information is requested and removed question Further information is requested labels Jan 7, 2020
@mehcode
Copy link
Member

mehcode commented Jan 7, 2020

Definitely. When we say the "Rust SQL Toolkit" our vision is to support all of the SQL databases at some point in the future.


I would say SQLite is next on our list to add.

@abonander abonander reopened this Jan 7, 2020
@abonander
Copy link
Collaborator

This should stay open until SQLite support is merged.

@mehcode mehcode changed the title Are there plans to support sqlite? Add support for SQLite (was: Are there plans to support sqlite?) Jan 7, 2020
@abonander
Copy link
Collaborator

So our idea for making SQLite async is to operate it in WAL (write-ahead log) mode and run the blocking fsync() behind task::blocking().

@emilazy
Copy link

emilazy commented Feb 12, 2020

I think that running SQLite on another thread would probably be a better idea than futzing with the internals of its storage layer to get it to run asynchronously on the main thread; the overhead of one OS thread and the necessary communication should be very small compared to the network communication involved with the other backends, and messing with fsync calls runs the risk of breaking SQLite's carefully-honed data integrity guarantees.

@mehcode
Copy link
Member

mehcode commented Feb 12, 2020

@emilazy It's more an experiment. If it doesn't pan out, we can totally do that. Keep in mind that this is exactly what SQLite recommends you to do though.

NOTE: WAL mode with PRAGMA synchronous set to NORMAL avoids calls to fsync() during transaction commit and only invokes fsync() during a checkpoint operation. The use of WAL mode largely obviates the need for this asynchronous I/O module. Hence, this module is no longer supported. The source code continues to exist in the SQLite source tree, but it is not a part of any standard build and is no longer maintained. This documentation is retained for historical reference.

https://www.sqlite.org/asyncvfs.html

@emilazy
Copy link

emilazy commented Feb 12, 2020

That same document also says:

You lose the Durable property. With the default I/O backend of SQLite, once a write completes, you know that the information you wrote is safely on disk. With the asynchronous I/O, this is not the case. If your program crashes or if a power loss occurs after the database write but before the asynchronous write thread has completed, then the database change might never make it to disk and the next user of the database might not see your change.

I'm not sure if that also extends to using WAL mode in this way – if it does, then I think at the very least there should be a warning about potentially compromised data integrity, since it might only observably fail to pan out after people lose data ^^;

@mehcode
Copy link
Member

mehcode commented Feb 12, 2020

I think it's pretty clear that the rest of the document beyond the heading is defunct.

This documentation is retained for historical reference.


To assuage any concerns, the first iteration of support will most likely "just" use spawn_blocking or equivalent around potentially blocking calls. Next, we'll attempt to mimic PRAGMA synchronous=FULL which places an fsync after each commit. It wouldn't be too hard to support both the blocking and non-blocking styles as there is a very small amount of code we're talking about here.

Keep in mind, this may not even be worth it. I don't have any hard numbers yet to say "yes, doing this is a large enough performance increase".

@cljoly
Copy link

cljoly commented Feb 27, 2020

So our idea for making SQLite async is to operate it in WAL (write-ahead log) mode and run the blocking fsync() behind task::blocking().

If I understand correctly, your main concern is with syncing.

According to the WAL documentation:

Note that with PRAGMA synchronous set to NORMAL, the checkpoint is the only operation to issue an I/O barrier or sync operation

Then, why not operate in WAL mode and:

  1. set PRAGMA synchronous to NORMAL
  2. disable auto-checkpoint
  3. run checkpoint from the application, for instance in the task::blocking() you mention?

This way, no fsync appens outside of sqlx control. And it looks like there is good support for this kind of modification. From the same page:

Applications using WAL do not have to do anything in order to for these checkpoints to occur. But if they want to, […] they can turn off the automatic checkpoints and run checkpoints during idle moments or in a separate thread or process

To implement the run checkpoint from the application, this section recommends to use PASSIVE checkpoint mode. As outlined, this would be as “easy” as registering a hook triggered after each commit, so that sqlx decide whether checkpointing is required and if so, manually call, in task::blocking(), the function sqlite uses automatically when auto-checkpointing is enabled.

I hope my explanations are clear.

@mamcx
Copy link

mamcx commented Mar 3, 2020

Nope, this is treat sqlite as nosql. Data integrity is paramount in rdbms. As a user, I must NOT need to find that the sqlite driver do any kind of hack.

@cljoly
Copy link

cljoly commented Mar 3, 2020

If I understand the documentation correctly, the only thing that threatens data integrity would be setting PRAGMA synchronous to NORMAL. This would not even corrupt the database, but it is true that transaction could be reversed.
Do you @mamcx think the rest of what I outline threatens integrity?

@abonander
Copy link
Collaborator

abonander commented Mar 3, 2020

For our initial implementation we were going to checkpoint after every query execution, so data can only be lost from the current operation before control is returned to the application, which is already a normal concern with SQLite.

Honestly, if you want perfect durability you should not be using SQLite anyway.

@gortiz
Copy link

gortiz commented Mar 4, 2020

Why don't you let the user to decide when to checkout and vacuum, which mode he/she wants to use, etc?

@abonander
Copy link
Collaborator

I fully agree, but that's something that can be discussed after the initial implementation is done.

@mehcode
Copy link
Member

mehcode commented Mar 15, 2020

Support for SQLite is here:

#129


After experimenting with various methods, I've decided to execute all blocking operations on a worker thread. spawn_blocking or similar APIs are unacceptable for our situation because while a SQLite connection can be sent between threads, a single SQL program must be iterated/stepped continuously on the same thread. The iterative nature of SQLx means that this is impossible to guarantee without having our own worker thread.

Most of the SQLite API can be ran on the "current" thread. In effect, it's just the sqlite3_step API that gets pushed to the worker thread ( this is called when iterating rows on a query, for example ).


Initial testing seems to hold up fine even with high concurrency. I'd love assistance with additional testing. It's definitely possible I'm missing something.


@abonander I'm going to need your help for making sure the macro support for SQLite is done correctly. I added what I think should make it work.

We probably should extend test_type! to also generate tests that use the macros.

losfair pushed a commit to losfair/sqlx that referenced this issue Oct 1, 2022
* Add Docker build CI to pull requests

* Add SQLite document persistence through SQLx

* Update README to describe configuration variables

* Minor changes to README wording

* Update image size estimate listed in README

* Update frontend dependencies

* Add direct database tests and restructure code

* Clarify use of `SQLITE_URI` in Docker contexts
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants