
feat: Adding the ability to prioritize tasks #519

Open · wants to merge 27 commits into master

Conversation

simeq

@simeq simeq commented Jul 27, 2024

This change introduces an opt-in task prioritization mechanism. It is disabled by default and can be enabled using the enablePrioritization() method.
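
For illustration, a minimal sketch of what opting in might look like (method names beyond the existing Scheduler.create(..)/threads(..)/build()/schedule(..) API follow this PR's description and should be treated as assumptions):

    import java.time.Instant;
    import javax.sql.DataSource;
    import com.github.kagkarlsson.scheduler.Scheduler;
    import com.github.kagkarlsson.scheduler.task.helper.OneTimeTask;
    import com.github.kagkarlsson.scheduler.task.helper.Tasks;

    class PrioritizationSketch {
      static void start(DataSource dataSource) {
        OneTimeTask<Void> myTask =
            Tasks.oneTime("my-task").execute((taskInstance, executionContext) -> {
              // do the work
            });

        Scheduler scheduler =
            Scheduler.create(dataSource, myTask)
                .enablePrioritization() // the opt-in toggle added by this PR; off by default
                .threads(10)
                .build();
        scheduler.start();

        // Priority itself is set per task instance via the new instance builder
        // (the exact builder method is an assumption at this point).
        scheduler.schedule(myTask.instance("instance-1"), Instant.now());
      }
    }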

Reminders

  • Added/ran automated tests
  • Update README and/or examples
  • Ran mvn spotless:apply

cc @kagkarlsson open to any suggestions and comments regarding tests that would be worth adding

@kagkarlsson
Owner

Thanks for updating the PR 👍. Will try and find some time to have a look at it soon!

@kagkarlsson
Owner

Did a quick look through. I can see most lines come from the opt-in toggle 😅. Good job realigning the PR with master. I need to go through it more thoroughly, but a couple of reactions:

  • Ideally the compatibility test should test both variants of the priority-toggle
  • I like the instance-builder. Probably prefer dropping the set-prefix for the method-names
  • Might need to drop "not null" from the schema definitions, considering people upgrading will have existing data?

Another thing: it would be great to see numbers for how polling performs, specifically for postgres, i.e. how many buffers are read to satisfy the query, with and without priority ordering.

A test like:

  • 10M executions not due, random priority
  • 10M executions due, random priority

Run the due-query in postgres with explain (analyze on, buffers on), possibly with index variations as well: (priority,execution_time) or (execution_time,priority).
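
For concreteness, the comparison might look roughly like this (index names are illustrative; the select mirrors the lock-and-fetch due-query):

CREATE INDEX priority_execution_time_idx ON scheduled_tasks (priority DESC, execution_time ASC);
-- or, alternatively:
CREATE INDEX execution_time_priority_idx ON scheduled_tasks (execution_time ASC, priority DESC);

EXPLAIN (ANALYZE, BUFFERS)
SELECT task_name, task_instance
FROM scheduled_tasks
WHERE picked = false AND execution_time <= now()
ORDER BY priority DESC, execution_time ASC
LIMIT 100
FOR UPDATE SKIP LOCKED;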

Not sure if this is something you or someone else is up for; otherwise I'll need to run it myself.

@simeq
Author

simeq commented Aug 8, 2024

Thanks for looking into it!

In the coming days I will update the PR according to your suggestions, but I have a question regarding the not null part.
Nowhere in the schema is priority set to not null, and I added:

Upgrading to 15.x

  • Add column priority and the priority_execution_time_idx index to the database schema. See table definitions for postgresql, oracle or mysql. Note that when enablePrioritization() is used, null priority values are ordered differently depending on the database used.

So the schema allows null values, but maybe I should make it clearer that this upgrade note is about existing rows?
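
As an example of the database-specific behaviour (PostgreSQL only, and illustrative rather than the exact query generated): PostgreSQL treats NULL as larger than any non-null value, so with DESC ordering, NULL priorities would sort first unless NULLS LAST is added:

SELECT task_name, task_instance
FROM scheduled_tasks
WHERE picked = false AND execution_time <= now()
ORDER BY priority DESC NULLS LAST, execution_time ASC
LIMIT 100;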

About testing the changes, I didn't do it earlier, but I'm happy to try. If I run into any problems, I'll let you know.

Overall, diving into this code has been a good exercise; it recently helped me understand exactly how to improve performance 😄

@simeq simeq changed the title Adding the ability to prioritize tasks feat: Adding the ability to prioritize tasks Aug 8, 2024
@kagkarlsson
Owner

In the coming days I will update the PR according to your suggestions, but I have a question regarding the not null part.
Nowhere in the schema is priority set to not null, and I added:

Ah, I didn't check all the schemas, I just assumed after seeing one. I've checked now, and it looks like mssql has not null, but as you say, none of the others does.

@GeorgEchterling
Contributor

I think I read a discussion about the index usage with prioritization somewhere on this repo, but I can't find it. In case it's still relevant:

Have you considered splitting the "due task detection" from the picking step? I.e. something like this:

UPDATE scheduled_tasks
SET due = TRUE
WHERE NOT due
AND NOT picked
AND execution_time < NOW();

SELECT * FROM scheduled_tasks
WHERE due
AND NOT picked
ORDER BY priority, execution_time;

Both queries could be optimized (even for arbitrary priority cardinality) using indices over (due, picked, execution_time) and (due, picked, priority, execution_time).
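
As a sketch of what that could look like (the due column and the index names are hypothetical, taken from the proposal above):

CREATE INDEX due_detection_idx ON scheduled_tasks (due, picked, execution_time);
CREATE INDEX due_pick_idx ON scheduled_tasks (due, picked, priority, execution_time);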

Also, this PR uses descending priorities. Older versions of MySQL/MariaDB don't support direction on index columns, which would prevent them from using the index when sorting by priority DESC, execution_time ASC. I'm not sure if that affects any other DBs.

@simeq
Author

simeq commented Aug 16, 2024

I ran some tests @kagkarlsson for the 10M due executions. I wasn't certain what you meant by "not due" executions, so I'm happy to add them later 😄

TL;DR: priority desc, execution_time asc is the correct index; enabling prioritization causes a performance reduction of about 15 percent.

I conducted tests on:

  • GCP PostgreSQL 12 (4 vCPUs, 25 GB memory, SSD storage)
  • 4x GCP VMs (4 vCPUs, 15 GB memory, SSD storage)

Scheduler was configured with lock-and-fetch (see the sketch after this list):

  • lowerLimitFractionOfThreads: 0.5
  • upperLimitFractionOfThreads: 4.0
  • threads: 50
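
Roughly, that corresponds to a builder configuration like the following (a sketch; the exact pollUsingLockAndFetch(lower, upper) signature is assumed):

    Scheduler scheduler =
        Scheduler.create(dataSource, knownTasks)
            .pollUsingLockAndFetch(0.5, 4.0) // lowerLimitFractionOfThreads, upperLimitFractionOfThreads
            .threads(50)
            .build();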

I have scheduled 10M executions that were due, with random priority

And tested two types of indexes:

  • priority desc, execution_time asc
  • execution_time asc, priority desc

Results

Results for scheduler without prioritization

|       | count    | mean    | 1m rate | 5m rate | 15m rate |
|-------|----------|---------|---------|---------|----------|
| vm1   | 2521123  | 2194.22 | 2341.86 | 2267.74 | 2153.32  |
| vm2   | 2483663  | 2161.62 | 2289.32 | 2228.68 | 2101.48  |
| vm3   | 2500572  | 2176.34 | 2316.60 | 2246.62 | 2121.06  |
| vm2   | 2494642  | 2171.17 | 2307.60 | 2242.47 | 2128.78  |
| total | 10000000 | 8703.35 | 9255.38 | 8985.51 | 8504.64  |

Results for prioritization with index priority desc, execution_time asc

|       | count    | mean    | 1m rate | 5m rate | 15m rate |
|-------|----------|---------|---------|---------|----------|
| vm1   | 2516239  | 1890.51 | 1972.18 | 1950.99 | 1923.13  |
| vm2   | 2481189  | 1864.18 | 1951.05 | 1924.37 | 1898.35  |
| vm3   | 2495400  | 1874.86 | 1951.23 | 1934.60 | 1896.35  |
| vm2   | 2507172  | 1883.70 | 1968.52 | 1946.73 | 1921.82  |
| total | 10000000 | 7513.25 | 7842.98 | 7756.69 | 7639.65  |

Results for prioritization with index execution_time asc, priority desc
I just gave up, it was too slow...

|       | count | mean  | 1m rate | 5m rate | 15m rate |
|-------|-------|-------|---------|---------|----------|
| total | 1600  | 19.76 | 13.23   | 4.62    | 1.68     |

Query plans - explain (analyze, buffers)

EXPLAIN (ANALYZE, BUFFERS)
SELECT task_name, task_instance
FROM scheduled_tasks WHERE picked = false and execution_time <= now()
ORDER BY priority desc, execution_time ASC FOR UPDATE SKIP LOCKED
LIMIT 100

Query plan without index on priority

Limit  (cost=645821.29..645822.54 rows=100 width=39) (actual time=11979.371..11979.485 rows=100 loops=1)
  Buffers: shared hit=113737, temp read=138357 written=206674
  ->  LockRows  (cost=645821.29..770819.29 rows=9999840 width=39) (actual time=11979.370..11979.476 rows=100 loops=1)
        Buffers: shared hit=113737, temp read=138357 written=206674
        ->  Sort  (cost=645821.29..670820.89 rows=9999840 width=39) (actual time=11979.345..11979.372 rows=100 loops=1)
              Sort Key: priority DESC, execution_time
              Sort Method: external merge  Disk: 547248kB
              Buffers: shared hit=113637, temp read=138357 written=206674
              ->  Seq Scan on scheduled_tasks  (cost=0.00..263634.60 rows=9999840 width=39) (actual time=0.014..2431.470 rows=10000000 loops=1)
                    Filter: ((NOT picked) AND (execution_time <= now()))
                    Buffers: shared hit=113637
Planning Time: 0.107 ms
Execution Time: 12091.458 ms

Query plan with index priority desc, execution_time asc

Limit  (cost=0.56..9.29 rows=100 width=39) (actual time=0.028..0.144 rows=100 loops=1)
  Buffers: shared hit=117 dirtied=9
  ->  LockRows  (cost=0.56..872617.26 rows=9999840 width=39) (actual time=0.027..0.135 rows=100 loops=1)
        Buffers: shared hit=117 dirtied=9
        ->  Index Scan using priority_execution_time_idx on scheduled_tasks  (cost=0.56..772618.86 rows=9999840 width=39) (actual time=0.021..0.084 rows=100 loops=1)
              Index Cond: (execution_time <= now())
              Filter: (NOT picked)
              Buffers: shared hit=17 dirtied=9
Planning Time: 0.283 ms
Execution Time: 0.220 ms

Query plan with index execution_time asc, priority desc

Limit  (cost=469587.40..469588.65 rows=100 width=40) (actual time=14596.903..14597.121 rows=100 loops=1)
  Buffers: shared hit=113737 dirtied=103971, temp read=138357 written=206674
  ->  LockRows  (cost=469587.40..553112.54 rows=6682011 width=40) (actual time=14596.902..14597.111 rows=100 loops=1)
        Buffers: shared hit=113737 dirtied=103971, temp read=138357 written=206674
        ->  Sort  (cost=469587.40..486292.43 rows=6682011 width=40) (actual time=14596.873..14596.898 rows=100 loops=1)
              Sort Key: priority DESC, execution_time
              Sort Method: external merge  Disk: 547248kB
              Buffers: shared hit=113637 dirtied=103971, temp read=138357 written=206674
              ->  Seq Scan on scheduled_tasks  (cost=0.00..214205.74 rows=6682011 width=40) (actual time=0.026..5012.498 rows=10000000 loops=1)
                    Filter: ((NOT picked) AND (execution_time <= now()))
                    Buffers: shared hit=113637 dirtied=103971
Planning Time: 0.178 ms
Execution Time: 14794.232 ms

Query plan when prioritization is disabled (ORDER BY execution_time ASC)

Limit  (cost=0.44..5.85 rows=100 width=35) (actual time=0.024..0.101 rows=100 loops=1)
  Buffers: shared hit=105 dirtied=2
  ->  LockRows  (cost=0.44..541357.98 rows=10000056 width=35) (actual time=0.024..0.092 rows=100 loops=1)
        Buffers: shared hit=105 dirtied=2
        ->  Index Scan using execution_time_idx on scheduled_tasks  (cost=0.44..441357.42 rows=10000056 width=35) (actual time=0.015..0.038 rows=100 loops=1)
              Index Cond: (execution_time <= now())
              Filter: (NOT picked)
              Buffers: shared hit=5
Planning Time: 0.258 ms
Execution Time: 0.125 ms

@simeq simeq requested a review from kagkarlsson August 17, 2024 17:11
@kagkarlsson
Owner

Sorry I haven't followed up earlier. Good job on the testing! Excellent to see a full test using concurrent schedulers and detailed statistics 👏.

I wasn't certain what you meant by "not due" executions, so I'm happy to add them later

To make the testing more realistic we have to assume that there are a large number of executions which are not due (also high priority executions that are not due yet).

Index (priority,execution_time) will be better when most executions are due, i.e. their execution time has passed.
Index (execution_time,priority) will be better when most executions are not due, i.e. their execution time has not passed yet (future executions).

My assumption is that the realistic scenario has more future executions than due ones. If there are throughput problems, however, a significant number of due executions will eventually accumulate, and that is the scenario where priority is useful.

So for the testing, I think we should add at least as many future executions to the table as there are due ones (maybe even a factor higher).
Due: 1M, future: 10M might be a better distribution? 🤔

@kagkarlsson
Owner

kagkarlsson commented Aug 23, 2024

Have you considered splitting the "due task detection" from the picking step?

@GeorgEchterling not really. That would require an additional update and roundtrip to the database 🤔

(on the other hand, the performance will likely be more predictable)

@simeq
Author

simeq commented Aug 23, 2024

Thanks for your explanation @kagkarlsson.

I ran the tests again, started the same instances and filled the scheduler with 1M due tasks (random -60 minutes) and 10M in the future (random +60 minutes), with random priorities from 1 to 10.
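
For reference, a rough sketch of how such a population could be seeded in PostgreSQL (not the exact script used; the column list is assumed from the postgres table definition):

INSERT INTO scheduled_tasks (task_name, task_instance, execution_time, picked, version, priority)
SELECT 'load-test', 'due-' || g, now() - random() * interval '60 minutes', false, 1, 1 + floor(random() * 10)::int
FROM generate_series(1, 1000000) AS g;

INSERT INTO scheduled_tasks (task_name, task_instance, execution_time, picked, version, priority)
SELECT 'load-test', 'future-' || g, now() + random() * interval '60 minutes', false, 1, 1 + floor(random() * 10)::int
FROM generate_series(1, 10000000) AS g;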

priority desc, execution_time asc is still the correct index, because PostgreSQL keeps doing a seq scan for execution_time asc, priority desc even though most tasks are not due. But there is a significant drop in the execution rate when prioritization is enabled.

Results

Results for scheduler without prioritization

|       | count   | mean    | 1m rate | 5m rate | 15m rate |
|-------|---------|---------|---------|---------|----------|
| vm1   | 252366  | 2274.00 | 2033.30 | 2186.72 | 2174.81  |
| vm2   | 247850  | 2233.27 | 1989.09 | 2122.85 | 2104.44  |
| vm3   | 251391  | 2265.19 | 2026.89 | 2173.77 | 2160.37  |
| vm2   | 248393  | 2238.11 | 2004.79 | 2170.64 | 2164.12  |
| total | 1000000 | 9010.57 | 8054.07 | 8653.98 | 8603.74  |

Results for prioritization with index priority desc, execution_time asc

|       | count   | mean    | 1m rate | 5m rate | 15m rate |
|-------|---------|---------|---------|---------|----------|
| vm1   | 250200  | 297.86  | 275.29  | 287.74  | 293.46   |
| vm2   | 249400  | 296.91  | 274.74  | 286.37  | 288.67   |
| vm3   | 250000  | 297.63  | 277.64  | 287.47  | 288.88   |
| vm2   | 250400  | 298.10  | 278.52  | 288.00  | 289.24   |
| total | 1000000 | 1190.5  | 1106.19 | 1149.58 | 1160.25  |

Results for prioritization with index execution_time asc, priority desc

|       | count   | mean   | 1m rate | 5m rate | 15m rate |
|-------|---------|--------|---------|---------|----------|
| vm1   | 252324  | 58.16  | 55.20   | 44.27   | 41.51    |
| vm2   | 247834  | 58.18  | 55.19   | 44.27   | 41.51    |
| vm3   | 252391  | 56.19  | 55.42   | 44.28   | 41.51    |
| vm2   | 247451  | 58.18  | 55.20   | 44.27   | 41.51    |
| total | 1000000 | 230.71 | 221.01  | 177.09  | 166.04   |

Query plans

Query plan with index priority desc, execution_time asc was:

Limit  (cost=0.56..15.45 rows=100 width=38) (actual time=148.539..148.643 rows=100 loops=1)
  Buffers: shared hit=31277
  ->  LockRows  (cost=0.56..813440.53 rows=5464003 width=38) (actual time=148.538..148.633 rows=100 loops=1)
        Buffers: shared hit=31277
        ->  Index Scan using priority_execution_time_idx on scheduled_tasks  (cost=0.56..758800.50 rows=5464003 width=38) (actual time=148.519..148.562 rows=100 loops=1)
              Index Cond: (execution_time <= now())
              Filter: (NOT picked)
              Buffers: shared hit=31177
Planning Time: 0.113 ms
Execution Time: 148.676 ms

Query plan with index execution_time asc, priority desc was:

Limit  (cost=336782.19..336783.44 rows=100 width=38) (actual time=7200.015..7200.163 rows=100 loops=1)
  Buffers: shared hit=125100 dirtied=110046, temp read=3466 written=9883
  ->  LockRows  (cost=336782.19..386525.02 rows=3979426 width=38) (actual time=7200.014..7200.153 rows=100 loops=1)
        Buffers: shared hit=125100 dirtied=110046, temp read=3466 written=9883
        ->  Sort  (cost=336782.19..346730.76 rows=3979426 width=38) (actual time=7199.988..7200.013 rows=100 loops=1)
              Sort Key: priority DESC, execution_time
              Sort Method: external merge  Disk: 54112kB
              Buffers: shared hit=125000 dirtied=110023, temp read=3466 written=9883
              ->  Seq Scan on scheduled_tasks  (cost=0.00..184691.39 rows=3979426 width=38) (actual time=0.015..6648.807 rows=1000000 loops=1)
                    Filter: ((NOT picked) AND (execution_time <= now()))
                    Rows Removed by Filter: 10000001
                    Buffers: shared hit=125000 dirtied=110023
Planning Time: 0.369 ms
Execution Time: 7212.177 ms

@simeq
Author

simeq commented Aug 23, 2024

Basically, I would say that whether prioritization is worth using depends on the usage scenario of the scheduler.

I have a scheduler instance with millions of recurring tasks on a persistent schedule and a few million one-time tasks added with execution time now() once a day. For that case, I'm guessing separate schedulers would be a better fit than prioritization.

But for instances that operate only on one-time tasks that are always added with execution time now(), this type of prioritization would be a suitable solution.

@simeq
Author

simeq commented Sep 17, 2024

@kagkarlsson What should we do next with this PR? :)

@kagkarlsson
Owner

I did some testing on my own and I think we probably need to add both indexes (or at least supply them), i.e. both (priority,execution_time) and (execution_time,priority).

With some luck I can review your changes (and possibly contribute some) next week 🤞

@kagkarlsson
Owner

How do you feel about enablePrioritization() vs enablePriority()? Isn't "priority" more common to use?

@simeq
Author

simeq commented Sep 21, 2024

Thanks for looking into this :)

I'm good with changing the name to enablePriority()

@kagkarlsson
Owner

I pushed some changes, addressing prioritization -> priority among other things. It would be great if you could have a look @simeq

One big question I have is:

Should a high int-value for priority mean higher priority or the reverse? 🤯
A lot of schedulers seem to use a low value for higher priority (and for some reason it was my initial inclination as well)

@simeq
Author

simeq commented Oct 11, 2024

The changes are good for me @kagkarlsson, anything else I could help with?

@kagkarlsson
Owner

kagkarlsson commented Oct 11, 2024

I think it is very close. I started a refactoring that I feel is a bit unfinished still. Slightly unrelated 😬

See 26e0492

e.g. (.instanceWithId is a new builder as well)

    scheduler.schedule(
        MY_TASK
            .instanceWithId("1045")
            .data(new MyTaskData(1001L))
            .scheduledTo(Instant.now().plusSeconds(5)));

i.e. stop using task.instance(..) in examples and instead use a static TaskDescriptor reference.

And also, if it makes sense, deprecate/reduce the use of TaskWithDataDescriptor and TaskWithoutDataDescriptor (use plain TaskDescriptor instead, plus the instance builder).
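
A sketch of what the static reference could look like (treat TaskDescriptor.of(..) as an assumed factory method; the instance builder used for scheduling is the one shown above):

    public static final TaskDescriptor<MyTaskData> MY_TASK =
        TaskDescriptor.of("my-task", MyTaskData.class);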

Update: I think I have completed this refactoring now.

@kagkarlsson
Owner

How do we feel about these "defaults"? Users are free to use whatever values suit them, as long as they fit in the column.

public class Priority {
  public static final int HIGH = 90;
  public static final int MEDIUM = 50;
  public static final int LOW = 10;
}
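
Used together with the instance builder, that might look like this (the priority(..) builder method is an assumption here):

    scheduler.schedule(
        MY_TASK
            .instanceWithId("1045")
            .priority(Priority.HIGH) // assumed builder method for per-instance priority
            .scheduledTo(Instant.now()));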

@simeq
Author

simeq commented Oct 14, 2024

The changes for TaskDescriptor are clear to me and look good.

Those predefined priorities are great for default usage. There is also a good description of how the value of the field translates into priority.

For my part, it's important to keep the ability to set the priority value dynamically.

@kagkarlsson
Owner

Hopefully I will get some time on Friday to go through this PR one last time. If everything checks out, I will try to release it.

Are you happy with the current version of the PR, @simeq?

@simeq
Author

simeq commented Oct 22, 2024

I confirm my happiness with the current version @kagkarlsson :)

@kagkarlsson
Owner

I am quickly going to check how hard it is to avoid touching the priority column when priority is disabled. Any thoughts on that @simeq? It is to avoid forcing existing users to update the schema. (I have another feature planned that might also require schema changes.)

@kagkarlsson
Owner

I am adding some missing tests for result ordering for the different cases

@simeq
Author

simeq commented Oct 28, 2024

On one hand, I think it's a good idea to be able to upgrade db-scheduler without database changes. On the other hand, there would need to be explicit instructions on how to ALTER the tables to make them work with priority, and should we then have separate postgresql_tables.sql files with and without priority?

@kagkarlsson
Owner

We will still update all the schema files with the priority column, but moderate the upgrading instructions to say that adding the column is recommended, though not necessary until priority is enabled.
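
In practice the upgrade step would then be something like this (postgres flavour; the column type and index follow this PR's table definitions, so treat them as assumptions):

ALTER TABLE scheduled_tasks ADD COLUMN priority SMALLINT;
CREATE INDEX priority_execution_time_idx ON scheduled_tasks (priority DESC, execution_time ASC);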

@kagkarlsson
Owner

(I am doing this as a service to long-time users that do not need priority. This way they can simply bump major and keep going.)

@kagkarlsson
Owner

I think this is good now 👍

@simeq
Author

simeq commented Oct 29, 2024

Thanks for the changes and the comment, it's clear to me now and looks good @kagkarlsson 🙂

@kagkarlsson
Owner

Will release soon!
