Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accum stage six - is it the last? #1398

Merged
merged 25 commits into from
Oct 5, 2017
Merged

Accum stage six - is it the last? #1398

merged 25 commits into from
Oct 5, 2017

Conversation

bartekgorny
Copy link
Collaborator

  • consistent use of accumulator as a storage (mostly append-only)
  • extended reach of accumulators along processing chain, as far as it makes sense
  • stanza/data to be sent passed along an accumulator
  • cleaned up and refactored internal broadcasting
  • cleaned up s2s_in, implemented accumulators in this part of chain

Seems this is it...

{Act, _, NStateData} = send_and_maybe_buffer_stanza(Acc,
{From, To, FinalPacket},
StateData),
case Act of

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Elvis:

The expression on line 1227 and column 21 is nested beyond the maximum level of 3.

@@ -810,11 +812,17 @@ do_open_session(El, JID, StateData) ->
?INFO_MSG("(~w) Opened session for ~s", [StateData#state.socket, jid:to_binary(JID)]),
Res = jlib:make_result_iq_reply(El),
Packet = {jid:to_bare(StateData#state.jid), StateData#state.jid, Res},
case send_and_maybe_buffer_stanza(Packet, StateData, wait_for_session_or_sm) of
{_, _, NStateData, _} ->
Acc = mongoose_acc:new(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I see function do_open_session is called only from maybe_open_session. This one is only called from wait_for_session_or_sm. Would it be worth to create the acc (from the stanza user sent) before maybe_open_session is called and pass as an argument?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point

% to be removed
{SendResult, BufferedStateData} =
send_and_maybe_buffer_stanza({J1, J2, mod_amp:strip_amp_el_from_request(El)}, State),
% if we have to check packet before sending we should do it much earlier, when it is
% still accumulator
mod_amp:check_packet(El, result_to_amp_event(SendResult)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the Acc is not passed to mod_amp:check_packet if we have it now?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because mod_amp doesn't yet know what to do with it. It certainly should - now that we have accumulators mod_amp doesn't need to use stanza to carry internal info. But rewriting mod_amp is beyond the scope of this project.

@@ -3011,6 +3040,9 @@ re_route_packets(Buffer) ->
|| {From, To, Packet} <- lists:reverse(Buffer)],
ok.

maybe_enter_resume_session(StateData) ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why exactly do we need this helper function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes code cleaner, I guess. In nearly all cases stream management id is taken from state, so there is no need to pass it separately.

Acc = mongoose_acc:require(iq_query_info, Acc0),
IQ = mongoose_acc:get(iq_query_info, Acc),
El = mongoose_acc:get(element, Acc),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reasoning behind using element from passed arg and from Acc as before?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the question. In Acc we have what came in, and we don't want to change it. When we want to send something out - which may be entirely different thing - we pass it as arg. Hence this change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, now I get it. Thanks!

@@ -45,7 +45,7 @@ terminate(M, _F, _L) ->

terminate(M, received, _F, _L) ->
%% ?ERROR_MSG("ZZZ terminate accumulator ~p ~p", [F, L]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this line?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no :) actually I don't think we really need mogoose_acc:terminate anymore.

Copy link
Contributor

@michalwski michalwski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I left some questions in the code.

@@ -37,15 +37,16 @@ handle_notify(_Msg, State) ->
{nosend, State}.

handle_info({route, _From, _To, Acc}, State) ->
handle_msg(mongoose_acc:get(name, Acc), Acc, State);
handle_info({route, _From, _To, Acc, mongoose_acc:get(element, Acc)}, State);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Elvis:

Ignored variable is being used on line 40 and column 32.

@@ -37,15 +37,16 @@ handle_notify(_Msg, State) ->
{nosend, State}.

handle_info({route, _From, _To, Acc}, State) ->
handle_msg(mongoose_acc:get(name, Acc), Acc, State);
handle_info({route, _From, _To, Acc, mongoose_acc:get(element, Acc)}, State);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Elvis:

Ignored variable is being used on line 40 and column 25.

@@ -964,14 +968,14 @@ process_iq(From, To, Acc, Packet) ->
ResIQ = Module:Function(From, To, IQ),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Elvis:

Remove the dynamic function call on line 968. Only modules that define callbacks should make dynamic calls.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wontfix - would require a rewrite of the whole iq handling machinery, which is not within current scope of work.

@@ -106,21 +104,19 @@ process_iq(From, To, Acc0) ->
if

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Elvis:

Replace the 'if' expression on line 104 with a 'case' expression or function clauses.

@@ -106,21 +104,19 @@ process_iq(From, To, Acc0) ->
if

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Elvis:

The expression on line 104 and column 21 is nested beyond the maximum level of 3.

{'EXIT', Reason} ->
?ERROR_MSG("~p", [Reason]);
ResIQ ->
{Acc1, ResIQ} ->
if

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Elvis:

Replace the 'if' expression on line 146 with a 'case' expression or function clauses.

From :: ejabberd:jid(), To :: ejabberd:jid(), Acc :: mongoose_acc:t(),
IQ :: ejabberd:iq()) -> mongoose_acc:t() | {'error', 'lager_not_running'}.
process_iq(_Host, Module, Function, From, To, Acc, IQ) ->
case catch Module:Function(From, To, Acc, IQ) of

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Elvis:

Remove the dynamic function call on line 142. Only modules that define callbacks should make dynamic calls.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wontfix - would require a rewrite of the whole iq handling machinery, which is not within current scope of work.

@@ -964,14 +968,14 @@ process_iq(From, To, Acc, Packet) ->
ResIQ = Module:Function(From, To, IQ),
if

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Elvis:

Replace the 'if' expression on line 969 with a 'case' expression or function clauses.

@@ -964,14 +968,14 @@ process_iq(From, To, Acc, Packet) ->
ResIQ = Module:Function(From, To, IQ),
if

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Elvis:

The expression on line 969 and column 21 is nested beyond the maximum level of 3.

Host = To#jid.lserver,
case ets:lookup(?IQTABLE, {XMLNS, Host}) of
[{_, Module, Function}] ->
case Module:Function(From, To, IQ) of

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Elvis:

Remove the dynamic function call on line 104. Only modules that define callbacks should make dynamic calls.

@bartekgorny bartekgorny force-pushed the accum-stage-six branch 2 times, most recently from 408ce22 to 89f5e80 Compare September 12, 2017 15:35
Host = To#jid.lserver,
case ets:lookup(sm_iqtable, {XMLNS, Host}) of
[{_, Module, Function}] ->
case Module:Function(From, To, IQ) of

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Elvis:

Remove the dynamic function call on line 969. Only modules that define callbacks should make dynamic calls.

@fenek fenek added this to the MongooseIM 2.1.0 milestone Sep 25, 2017


test_save_acc(#{type := <<"chat">>} = Acc, _State) ->
random:seed(erlang:phash2([node()]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

random is deprecated. Use rand instead.

@@ -58,7 +58,7 @@ In between the XMPP domain (`StateData#state.server`) and handler arguments the
All handlers attached to the hook have to accept a list as a first argument and also return a list - possibly the same list
plus one or more items.

Note, that `ejabberd_hooks:run/3` and `ejabberd_hooks:run_fold/4` **are not interchangeable**.
Note that `ejabberd_hooks:run/3` and `ejabberd_hooks:run_fold/4` **are not interchangeable**.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer true, as run actually executes run_fold. In general, I think it is a good idea to update whole document and remove examples and descriptions with run.

* take return value from the acc the hook call returned
* pass the modified accumulator on

Handlers should store their return values in the accumulator; there are three ways to do it:
* if it is a one-off value which is to be discarded later use `mongoose_acc:put(result, Value, Acc)`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is "later"?

@@ -1,10 +1,10 @@
# Accumulators

XMPP stanza processing starts in the `ejabberd_c2s` module, which receives the stanza from a socket.
XMPP stanza processing starts in the `ejabberd_c2s` module, which receives the stanza from a socket, or in `ejabberd_s2s_in` which receives stanzas from other MIM clusters.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(...) from federated XMPP clusters.

The stanza is processed and eventually it and/or other messages are sent out, either to the original sender, to another c2s process within the same MongooseIM installation, or to another XMPP server.

At the beginning of the main processing chain, an accumulator is created containing the stanza and other values.
It is then passed through all the stages, until it reaches the end of its life.
At the beginning of the main processing chain an accumulator is created containing the stanza and other values.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What values? Would be cool to enumerate basic fields available by default.

@@ -37,15 +37,17 @@ handle_notify(_Msg, State) ->
{nosend, State}.

handle_info({route, _From, _To, Acc}, State) ->
handle_msg(mongoose_acc:get(name, Acc), Acc, State);
handle_msg(mongoose_acc:get(name, Acc), Acc, mongoose_acc:get(element, Acc), State);
%% handle_info({route, From, To, Acc, mongoose_acc:get(element, Acc)}, State);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch, should we leave commented code like this? :) With hooks recording it is quite understandable but not here.

maps:put(Key, Val, Acc).

-spec get(any()|[any()], t()) -> any().
get(to_send, Acc) ->
get(to_send, Acc, get(element, Acc));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that getting element as fallback was needed by Konrad?

% read-only; accumulator is for accumulating)
case maps:is_key(Key, Acc) of
true ->
%% ?WARNING_MSG("Overwriting existing key \"~p\" in accumulator,"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs state that warning will be logged...

-spec strip(t()) -> t().
strip(Acc) ->
El = get(to_send, Acc),
strip(Acc, get(element, Acc)).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the meaning of send_result then, if only element remains after stripping?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are more things that remain:
[ref, timestamp, from, from_jid, to, to_jid, persistent_properties]
Should send_result be forwarded too? I don't know, we can start an interesting discussion.

from_element(El) when is_tuple(El) ->
Name = <<"broadcast">>,
Type = element(1, El),
% ref and timestamp will be filled in by strip/2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why shouldn't we fill them here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very specific function. Normally from_element is called when we are just starting stanza processing, so it creates a new accumulator and consequently have to generate ref and timestamp. This one is always called when a stanza triggers a broadcast, it is always called from strip, which means we already have an accumulator with ref and timestamp.

Apart from being able to notify the rest of the system that some event has happened by running a hook with `ejabberd_hooks:run/3`, it's also possible to [fold](#sidenote-folds) over a sequence of handlers with `ejabberd_hooks:run_fold/4`.
Like an ordinary `lists:foldl/3`, `ejabberd_hooks:run_fold/4` also requires an initial value to be passed to the function.

Hook handlers are called by "folding", meaning iterating over the list of handlers passing them a set of arguments and some initial value which gets modified and returned by every handler.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hook handlers are called by "folding".
This means that each handler on a list is passed a set of arguments and an initial value that it then modifies, returns and hands over to the next handler in line.

@@ -54,13 +53,10 @@ ListOfSomething = ejabberd_hooks:run_fold(a_certain_hook,
StateData#state.server])
```

In between the XMPP domain (`StateData#state.server`) and handler arguments the initial value of the accumulator being passed through the sequence of handlers is inserted - in this case an empty list (`[]`).
All handlers attached to the hook have to accept a list as a first argument and also return a list - possibly the same list
In between the XMPP domain (`StateData#state.server`) and handler arguments is the initial value of the accumulator being passed through the sequence of handlers is inserted - in this case an empty list (`[]`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial value of the accumulator being passed through the sequence of handlers (in this case an empty list ([])) is inserted between the XMPP domain (StateData#state.server) and handler arguments.

has `get/2`, `get/3`, `put/3` etc.
It is instantiated with an incoming stanza, passed along throughout the processing
chain, supplied to and returned from hook calls, and terminated when stanza is leaving MongooseIM.
MongooseIM uses a dedicated data structure to accumulate results(see ["Accumulators"](accumulators.md)).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MongooseIM uses a dedicated data structure to accumulate results (see "Accumulators").

It is instantiated with an incoming stanza, passed along throughout the processing
chain, supplied to and returned from hook calls, and terminated when stanza is leaving MongooseIM.
MongooseIM uses a dedicated data structure to accumulate results(see ["Accumulators"](accumulators.md)).
This data structure is implemented in `mongoose_acc` module, which has a map-like interface: it has `get/2`, `get/3`, `put/3` etc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This data structure is implemented in the mongoose_acc module, which has a map-like interface: it has get/2, get/3, put/3 etc.

@@ -91,9 +80,9 @@ The right way to use hooks is therefore:
* pass the modified accumulator on

Handlers should store their return values in the accumulator; there are three ways to do it:
* if it is a one-off value which is to be discarded later use `mongoose_acc:put(result, Value, Acc)`
* if it is a one-off value whch doesn't need to be passed on along with the accumulator (can be overwritten any time), `mongoose_acc:put(result, Value, Acc)`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it is a one-off value whch doesn't need to be passed on along with the accumulator (can be overwritten any time), use mongoose_acc:put(result, Value, Acc)

* name - <<"message">>, <<"presence">> or <<"iq">>
* type - e.g. <<"set">> (in iq)
* attrs - attributs of the root xml element of stanza
* from, to - full jids of sender and recipient in binary form
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from, to - full jids of the sender and recipient in a binary form

@@ -61,6 +71,10 @@ Acc2 = mongoose_acc:add_prop(myprop, 123, Acc1),
V = mongoose_acc:get_prop(myprop, Acc2).
```

The rationale behind stripping an accumulator is that many values store in it are context-dependend.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rationale behind stripping an accumulator is that many values stored in it are context-dependend.

@@ -61,6 +71,10 @@ Acc2 = mongoose_acc:add_prop(myprop, 123, Acc1),
V = mongoose_acc:get_prop(myprop, Acc2).
```

The rationale behind stripping an accumulator is that many values store in it are context-dependend.
For example, `user` and `server` refer to the owner of the c2s process, in another one user will be different (and sometimes server).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, in a c2s process user and server refer to its owner, while elsewhere the user will be different (or sometimes even a server). ?

@@ -103,7 +117,7 @@ The notion of an exit point is a bit vague, because sending a stanza out of Mong
A single stanza entering the system creates an acc, but then may result in multiple stanzas going out - for example changing a privacy list triggers presence updates and a roster push.
It is in fact hard to determine when the process has really been finished.
The approach is to pass along one acc and, when something is just about to be sent out, either send a text and return the acc or pass on a clone of the acc and return the original one.
This gives us an option to record a fact that a stanza has been sent out.
This gives us an option to record a fact that a stanza has been sent out: every sending function calls `mongoose_acc:record_sending`, which gives you an opportunity to consult `send_result` key to check what send attempts have been made and with what effect.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gives us an option to record a fact that a stanza has been sent out.
Every sending function calls mongoose_acc:record_sending, which gives you an opportunity to consult the send_result key to check what send attempts have been made and with what effect.

another fresh accumulator out of the data to be sent, stripping it of all caches and the likes but preserving original ref and
timestamp (`mongoose_acc:strip/2`), or stanza is stored as offline, or is sent out to another XMPP server via `ejabberd_s2s`.
Eventually there are three things that may happen:
* the accumulator is sent to another c2s process, pubsub node or muc room - in this case we produce another fresh accumulator out of the data to be sent, stripping it of all caches and the likes but preserving original ref and timestamp and those attrs that should be carried on (`mongoose_acc:strip/2`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the accumulator is sent to another c2s process, pubsub node or muc room - in this case we produce another fresh accumulator out of the data to be sent, stripping it of all caches and the likes but preserving the original ref and timestamp and those attrs that should be carried on (mongoose_acc:strip/2)

@fenek fenek merged commit 1372967 into master Oct 5, 2017
@fenek fenek deleted the accum-stage-six branch October 5, 2017 15:18
Under the hood it calls the same handlers with an empty accumulator.
This is deprecated and some day will be removed.

### Error handling in hooks

Hooks are meant to decouple modules; in other words, the caller signals that some event took place or that it intends to use a certain feature or a set of features, but how and if those features are implemented is beyond its interest.
Hooks don't therefore use the "let it crash" approach. Instead it is rather like "fire-and-forget", more similar in principle to the `Pid ! signal` way.
Fro that reason hook don't use the "let it crash" approach. Instead it is rather like "fire-and-forget", more similar in principle to the `Pid ! signal` way.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For that reason..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants