Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize IFNULL with indexes #1552

Open
rentalhost opened this issue Jul 12, 2024 · 0 comments
Open

Optimize IFNULL with indexes #1552

rentalhost opened this issue Jul 12, 2024 · 0 comments

Comments

@rentalhost
Copy link

Hello,

I've been studying SQLite using the Turso platform. There, every read or write operation is counted. So, if I run a query that doesn't match an existing index, a table scan is performed, consuming as many read credits as there are records in the table. This can be quite costly.

Let me share a real example from my experience:

CREATE TABLE "example_entries" (
	"id"		INTEGER NOT NULL UNIQUE,
	"name"		TEXT,
	"updatedAt"	INTEGER,
	"createdAt"	INTEGER NOT NULL,

	PRIMARY KEY("id" AUTOINCREMENT)
);

Whenever a record is created, only the createdAt field is populated. For simplicity, let's assume updatedAt remains NULL during creation.

If I run a query based on either updatedAt or createdAt, it results in a table scan because neither column is indexed. To address this, we can create indexes:

CREATE INDEX "example_entries.updatedAt" ON "example_entries" ("updatedAt" ASC);
CREATE INDEX "example_entries.createdAt" ON "example_entries" ("createdAt" ASC);

Great! Now, let's say I want to return the id, name, and the update date, or the creation date if the update date is NULL. This can be done with:

SELECT `id`, `name`, IFNULL(`updatedAt`, `createdAt`) AS `updatedAt` FROM `example_entries`;

This works well, though we still have the table scan issue, but without a WHERE clause, there's not much that can be done.

Now, let's limit our query to find records updated or created after a certain timestamp:

SELECT `id`, `name`, IFNULL(`updatedAt`, `createdAt`) AS `updatedAt` FROM `example_entries` WHERE IFNULL(`updatedAt`, `createdAt`) >= @timestamp;

Although this filter works, it still results in a table scan. From what I understand, SQLite can't optimize IFNULL with two columns to use indexes efficiently. To solve this, I need to "expand" the IFNULL like this:

SELECT `id`, `name`, IFNULL(`updatedAt`, `createdAt`) AS `updatedAt` FROM `example_entries` WHERE `updatedAt` >= @timestamp OR (`updatedAt` IS NULL AND `createdAt` >= @timestamp);

This adds cognitive complexity for something that might be optimized internally by SQLite.

I have reported this issue to the SQLite team for their evaluation as well. You can find the post here: SQLite Forum Post.

I would love to hear the libsql team's perspective on this matter and if there's a possibility for such an optimization in libsql.

Thank you for your attention.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant