Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cliupdate sempahore locker #967

Open
alexalouit opened this issue Jul 21, 2017 · 11 comments
Open

cliupdate sempahore locker #967

alexalouit opened this issue Jul 21, 2017 · 11 comments

Comments

@alexalouit
Copy link

alexalouit commented Jul 21, 2017

Hi,

I found a issue (in my opinion).
Yesterday, my database was in trouble with many connections and simultaneous writings.
I noticed that there were ~30 cron processes (cliupdate) running.

I think a semaphore would have to be set up to prevent this.

@niol
Copy link
Collaborator

niol commented Jul 21, 2017

Yes I had done that in #597 but this was not merged.

@jtojnar
Copy link
Member

jtojnar commented Jul 21, 2017

What are the merits of locking, compared to UNIQUE constraint (#921)? At a glance, the constraint seems cleaner but I did not give it a proper thought yet.

@alexalouit
Copy link
Author

alexalouit commented Jul 21, 2017

I think file locker will not do useless query, is upper in application level.
Nevertheless, the two systems remain complementary.

This can prevent process and memory bombing from cron zombie/defunct processes.

@jtojnar
Copy link
Member

jtojnar commented Jul 21, 2017

I have originally intended to update the items in the races (hence the UPSERT mention) but since the probability of items changing between the races is low, the queries would indeed be useless. With that in mind, I can see the complementarity; I am not convinced that this is not a premature optimization not worth the added complexity, though.

@alexalouit
Copy link
Author

ALTER IGNORE TABLE `items` ADD UNIQUE INDEX (`uid` (150), `source`);

from ~111 619 to ~67 562 entries.

@alexalouit
Copy link
Author

alexalouit commented Jul 23, 2017

Same error this morning, multiple process cliupdate performs many sql queries (which are denied from the unique key).

@niol
Copy link
Collaborator

niol commented Jul 31, 2017

I think that any running update should make another one exit immediately to prevent both sources hammering and rogue processes. There are multiple ways of implementing this, I have proposed using file locks, another way would be to use unix sockets (not portable) or the database. I'll simplify my patch and submit a pull request if it has some chance of getting accepted. Otherwise, this issue should be closed.

@jtojnar
Copy link
Member

jtojnar commented Jul 31, 2017

One might also argue that locking should not be a concern of the application and be done on a higher level:

  • If I understand correctly, systemd timers only allow one instance of the service running, at least Persistent option hints so.
  • For traditional UNIX, flock or similar command can be used with cron.
  • For shared web hosts that do not allow running UNIX commands, creating an optional PHP wrapper simulating flock could be used.

Personally, I am not yet completely sure which level should this be tackled on but since Tobias rejected the original PR, I will side with the higher level solution.

@alexalouit
Copy link
Author

After several months of use, the use of:

  • unique constraint
  • semaphore per source/file by @jtojnar
  • expiration limit for cliupdate.php

did the trick. No more zombie process.

I can submit a pull request, what do you think?

@jtojnar
Copy link
Member

jtojnar commented Sep 15, 2017

@alexalouit Sure, we could use it.

@alexalouit
Copy link
Author

see #991

Works perfectly since months (~120 000 items, ~470 sources).
I'm not familiar with Selfoss, everything is correct?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants