Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add remove_domain handler for MAM #3092

Merged
merged 6 commits into from
Apr 23, 2021
Merged

Conversation

arcusfelis
Copy link
Contributor

This PR addresses "Ensure efficient indexing and removal of persistent data by domain name".
MAM already have proper indexes in the mam_server_user table. We should just LEFT JOIN it. OK, we can't LEFT JOIN in DELETEs. But we can use WHERE IN instead.

Proposed changes include:

  • remove_domain/3 handler for MAM/MUC MAM.
  • nobody runs this hook in this PR. @DenysGonchar will code that part.
  • a tip in schema. Because we use the same table to track MAM archive IDs for both MAM and MUC MAM.

@codecov
Copy link

codecov bot commented Apr 22, 2021

Codecov Report

Merging #3092 (8c4266b) into master (d15d4d9) will increase coverage by 0.02%.
The diff coverage is 79.41%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3092      +/-   ##
==========================================
+ Coverage   79.02%   79.04%   +0.02%     
==========================================
  Files         386      386              
  Lines       31830    31829       -1     
==========================================
+ Hits        25153    25160       +7     
+ Misses       6677     6669       -8     
Impacted Files Coverage Δ
src/mam/mod_mam_rdbms_arch.erl 49.47% <50.00%> (+1.02%) ⬆️
src/auth/ejabberd_auth.erl 75.40% <100.00%> (ø)
src/mam/mod_mam_muc_rdbms_arch.erl 94.79% <100.00%> (+0.05%) ⬆️
src/mongoose_hooks.erl 92.19% <100.00%> (+0.70%) ⬆️
...bal_distrib/mod_global_distrib_hosts_refresher.erl 71.69% <0.00%> (-1.89%) ⬇️
src/mod_bosh_socket.erl 78.88% <0.00%> (-0.32%) ⬇️
src/ejabberd_c2s.erl 89.43% <0.00%> (+0.07%) ⬆️
src/pubsub/mod_pubsub_db_mnesia.erl 92.82% <0.00%> (+0.42%) ⬆️
src/muc_light/mod_muc_light.erl 84.27% <0.00%> (+0.87%) ⬆️
src/pubsub/node_hometree.erl 83.33% <0.00%> (+5.55%) ⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d15d4d9...8c4266b. Read the comment docs.

@@ -102,6 +103,7 @@ start_hooks(Host, _Opts) ->
ejabberd_hooks:add(mam_archive_size, Host, ?MODULE, archive_size, 50),
ejabberd_hooks:add(mam_lookup_messages, Host, ?MODULE, lookup_messages, 50),
ejabberd_hooks:add(mam_remove_archive, Host, ?MODULE, remove_archive, 50),
ejabberd_hooks:add(remove_domain, Host, ?MODULE, remove_domain, 50),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please, make a hooks/0 function that returns a list of hooks and use ejabbed_hooks:add/1 and ejabberd_hooks:delete/1 functions to start/stop module

@@ -86,6 +87,7 @@ start_hooks(Host, _Opts) ->
ejabberd_hooks:add(mam_muc_archive_size, Host, ?MODULE, archive_size, 50),
ejabberd_hooks:add(mam_muc_lookup_messages, Host, ?MODULE, lookup_messages, 50),
ejabberd_hooks:add(mam_muc_remove_archive, Host, ?MODULE, remove_archive, 50),
ejabberd_hooks:add(remove_domain, Host, ?MODULE, remove_domain, 50),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please, make a hooks/0 function that returns a list of hooks and use ejabbed_hooks:add/1 and ejabberd_hooks:delete/1 functions to start/stop module

@@ -112,6 +115,11 @@ register_prepared_queries() ->
mongoose_rdbms:prepare(mam_muc_archive_remove, mam_muc_message, [room_id],
<<"DELETE FROM mam_muc_message "
"WHERE room_id = ?">>),
mongoose_rdbms:prepare(mam_muc_remove_domain, mam_muc_message, ['mam_server_user.server'],
<<"DELETE FROM mam_muc_message "
"WHERE room_id IN (SELECT id FROM mam_server_user where server = ?)">>),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how optimal is this query? what if we have over 100k records selected by SELECT id FROM mam_server_user where server = ?

Copy link
Contributor Author

@arcusfelis arcusfelis Apr 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should work.

ids would go into a temp table in this case. And use similar logic as left joins.

It would be a pretty long transaction though. And it could deadlock still (as always).


run_remove_domain() ->
Acc = #{},
rpc(mim(), ejabberd_hooks, run_fold,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove_domain hook code is already merged into master. I know it was not in original scope, and it's not critical to be a part of this PR.
but please change this rpc call to mongoose_hooks:remove_domain/2 in the following PR.

Copy link
Collaborator

@DenysGonchar DenysGonchar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. The changes look good, I've added just a few minor comments

Copy link
Collaborator

@DenysGonchar DenysGonchar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved

@DenysGonchar DenysGonchar merged commit 1436629 into master Apr 23, 2021
@DenysGonchar DenysGonchar deleted the mu-delete-domain-fast branch April 23, 2021 16:15
@Premwoik Premwoik added this to the 5.0.0 milestone Oct 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants