Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change EXIT_PROCESSOR_SLA_MARGIN to EXIT_PROCESSOR_SLA_SECONDS #1491

Open
wants to merge 34 commits into
base: master
Choose a base branch
from

Conversation

souradeep-das
Copy link
Contributor

@souradeep-das souradeep-das commented Apr 27, 2020

#1399

Overview

This unifies sla_margin to hold values in seconds instead of the number of blocks and renames EXIT_PROCESSOR_SLA_MARGIN to EXIT_PROCESSOR_SLA_SECONDS. The internal logic to detect invalid exits has also been changed to use timestamps instead of number of blocks

EXIT_PROCESSOR_SLA_MARGIN_FORCED hasn't been renamed

Changes

  • EXIT_PROCESSOR_SLA_MARGIN to EXIT_PROCESSOR_SLA_SECONDS
  • Changing eth_height_now to eth_timestamp_now
  • Changed logic to work with seconds
  • Modified Tests
  • EXIT_PROCESSOR_SLA_MARGIN_FORCED is unchanged till now

# get exits which are still invalid and after the SLA margin
late_invalid_exits =
invalid_exits
|> Enum.filter(fn {_, %ExitInfo{eth_height: eth_height}} -> eth_height + sla_margin <= eth_height_now end)
|> Enum.filter(fn {_, %ExitInfo{eth_height: eth_height}} ->
eth_height + sla_seconds / ethereum_block_time_seconds <= eth_height_now
Copy link
Contributor Author

@souradeep-das souradeep-das Apr 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a good way or should the timestamp of each exit be stored in the DB and retrieved here to check if sla_seconds is elapsed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine for this PR since it's still the existing behaviour, but could you maybe raise this discussion separately in the engineering channel?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will change the logic to exit_timestamp + sla_seconds <= time_now when #1495 is merged

@coveralls
Copy link

coveralls commented Apr 28, 2020

Coverage Status

Coverage increased (+0.009%) to 78.295% when pulling 99cf072 on souradeep/margin_to_seconds into 2e2893c on master.

@@ -44,7 +44,7 @@ defmodule OMG.Watcher.ExitProcessor.Core do

use OMG.Utils.LoggerExt

@default_sla_margin 10
@default_sla_seconds 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure should the value reflect the original value, which I think should be 10 block * 15s per block = 150?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this only takes place on tests. @souradeep-das can we confirm this - raise a value and whether tests fails?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, I think these are for tests. Fails when sla_seconds is >10

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is for tests then let's remove it. we should also remove the default setting below:
sla_seconds \\ @default_sla_seconds

# get exits which are still invalid and after the SLA margin
late_invalid_exits =
invalid_exits
|> Enum.filter(fn {_, %ExitInfo{eth_height: eth_height}} -> eth_height + sla_margin <= eth_height_now end)
|> Enum.filter(fn {_, %ExitInfo{eth_height: eth_height}} ->
eth_height + sla_seconds / ethereum_block_time_seconds <= eth_height_now
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine for this PR since it's still the existing behaviour, but could you maybe raise this discussion separately in the engineering channel?

# get exits which are still invalid and after the SLA margin
late_invalid_exits =
invalid_exits
|> Enum.filter(fn {_, %ExitInfo{eth_height: eth_height}} -> eth_height + sla_margin <= eth_height_now end)
|> Enum.filter(fn {_, %ExitInfo{eth_height: eth_height}} ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're moving away from single pipes. Could you update invalid_exits |> Enum.filter(fn... to Enum.filter(invalid_exits, fn...? Thanks!

@@ -18,8 +18,8 @@ defmodule OMG.Watcher.ReleaseTasks.SetExitProcessorSLAMargin do
require Logger
@app :omg_watcher

@system_env_name_margin "EXIT_PROCESSOR_SLA_MARGIN"
@app_env_name_margin :exit_processor_sla_margin
@system_env_name_margin "EXIT_PROCESSOR_SLA_SECONDS"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that before merging this PR, we'll need to update our infra to have this environment variable so we don't break the CI/CD environment.

@unnawut
Copy link
Contributor

unnawut commented Apr 29, 2020

Also, could you add this change to https://github.com/omisego/elixir-omg/blob/master/CHANGELOG.md under the Unreleased header?

Copy link
Contributor

@pnowosie pnowosie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

@@ -44,7 +44,7 @@ defmodule OMG.Watcher.ExitProcessor.Core do

use OMG.Utils.LoggerExt

@default_sla_margin 10
@default_sla_seconds 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this only takes place on tests. @souradeep-das can we confirm this - raise a value and whether tests fails?

@@ -79,10 +79,13 @@ defmodule OMG.Watcher.ExitProcessor.StandardExit do
exits_invalid_by_ife = get_invalid_exits_based_on_ifes(active_exits, tx_appendix)
invalid_exits = active_exits |> Map.take(invalid_exit_positions) |> Enum.concat(exits_invalid_by_ife) |> Enum.uniq()

ethereum_block_time_seconds = OMG.Eth.Configuration.ethereum_block_time_seconds()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only here we introduce this antipattern ;)
Please restore it as init param - which will be set where others are - and pass it to the Core's init (Core module serves usually as GenServer state). So you could retrieve its value as you do with sla_seconds

@system_env_name_margin "EXIT_PROCESSOR_SLA_MARGIN"
@app_env_name_margin :exit_processor_sla_margin
@system_env_name_margin "EXIT_PROCESSOR_SLA_SECONDS"
@app_env_name_margin :exit_processor_sla_seconds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too verbose I think :)

Suggested change
@app_env_name_margin :exit_processor_sla_seconds
@app_env_name :exit_processor_sla_seconds

exit_processor_sla_margin_forced: exit_processor_sla_margin_forced,
metrics_collection_interval: metrics_collection_interval,
min_exit_period_seconds: min_exit_period_seconds,
ethereum_block_time_seconds: ethereum_block_time_seconds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restore this and pass through as I suggested in exit_processor/standard_exit.ex

config/test.exs Outdated
@@ -138,7 +138,7 @@ config :omg_watcher,
block_getter_loops_interval_ms: 50,
# NOTE `exit_processor_sla_margin` can't be made shorter. At 3 it sometimes
# causes :unchallenged_exit because `geth --dev` is too fast
exit_processor_sla_margin: 5,
exit_processor_sla_seconds: 75,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is test - prev we had 1 secs blocks - is it OK?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes yes 1 sec blocks. Updating!

# get exits which are still invalid and after the SLA margin
# temporarily assigning ethereum_block_time_seconds= 1, will be removed on merge of #1495
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#1495 is now merged

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks O! Updating.

DevHelper.wait_for_root_chain_block(eth_height + exit_processor_sla_margin, @timeout)
exit_processor_sla_seconds = Application.fetch_env!(:omg_watcher, :exit_processor_sla_seconds)
exit_processor_sla_ms = exit_processor_sla_seconds * 1000
Process.sleep(exit_processor_sla_ms)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this fine? 🙇

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me, we're really waiting for the seconds to come

@souradeep-das souradeep-das marked this pull request as ready for review June 1, 2020 15:35
@souradeep-das
Copy link
Contributor Author

Also @pnowosie asking for a re-review to confirm the latest changes. Thanks 🙇

@unnawut unnawut added the upgrade path required Need to inform infra and users of upgrade path, e.g. running an extra command or config change. label Jun 2, 2020
@unnawut
Copy link
Contributor

unnawut commented Jun 2, 2020

Adding upgrade path required label so we can collect them up and inform infra/users easily of manual upgrade tasks needed.

Copy link
Contributor

@unnawut unnawut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

DevHelper.wait_for_root_chain_block(eth_height + exit_processor_sla_margin, @timeout)
exit_processor_sla_seconds = Application.fetch_env!(:omg_watcher, :exit_processor_sla_seconds)
exit_processor_sla_ms = exit_processor_sla_seconds * 1000
Process.sleep(exit_processor_sla_ms)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me, we're really waiting for the seconds to come

config/test.exs Outdated
# NOTE `exit_processor_sla_margin` can't be made shorter. At 8 it sometimes
# causes unchallenged exits events because `geth --dev` is too fast
exit_processor_sla_margin: 10,
# NOTE `exit_processor_sla_margin` can't be made shorter. At 3 it sometimes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice that you also checked and updated this 👍 🙏

@souradeep-das
Copy link
Contributor Author

@unnawut do we need to update any env vars?

@unnawut
Copy link
Contributor

unnawut commented Jun 10, 2020

@unnawut do we need to update any env vars?

Yeah the ones you have in the infra PR should suffice

utxos_to_check: utxos_to_check,
utxo_exists_result: utxo_exists_result,
blocks_result: blocks
},
%__MODULE__{} = state
)
when not is_nil(eth_height_now) do
when not is_nil(eth_timestamp_now) do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this guard is irrelevant and should be removed!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed 96d9e31

Comment on lines 629 to 630
defp sla_seconds_safe?(exit_processor_sla_seconds, min_exit_period_seconds),
do: exit_processor_sla_seconds < min_exit_period_seconds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
defp sla_seconds_safe?(exit_processor_sla_seconds, min_exit_period_seconds),
do: exit_processor_sla_seconds < min_exit_period_seconds
defp sla_seconds_safe?(exit_processor_sla_seconds, min_exit_period_seconds) do
exit_processor_sla_seconds < min_exit_period_seconds
end

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the function spreads over multiple lines, let's allow it to do so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment on lines 70 to 74
def get_invalid(
%Core{sla_seconds: sla_seconds} = state,
utxo_exists?,
eth_timestamp_now
) do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def get_invalid(
%Core{sla_seconds: sla_seconds} = state,
utxo_exists?,
eth_timestamp_now
) do
def get_invalid(state, utxo_exists?, eth_timestamp_now) do
state.sla_seconds

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done 96d9e31

@@ -48,7 +48,7 @@ defmodule OMG.Watcher.ExitProcessor.CanonicityTest do
test "none if input never spent elsewhere",
%{processor_filled: processor} do
assert {:ok, []} =
%ExitProcessor.Request{blknum_now: 1000, eth_height_now: 5}
%ExitProcessor.Request{blknum_now: 1000, eth_timestamp_now: 5}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm... how does this make sense? what is eth_timestamp_now 5?

RootChainHelper.start_exit(tx_utxo_pos, txbytes, proof, alice.addr)
|> DevHelper.transact_sync!()

IntegrationTest.wait_for_byzantine_events([%Event.InvalidExit{}.name], @timeout)

exit_processor_sla_margin = Application.fetch_env!(:omg_watcher, :exit_processor_sla_margin)
DevHelper.wait_for_root_chain_block(eth_height + exit_processor_sla_margin, @timeout)
exit_processor_sla_seconds = Application.fetch_env!(:omg_watcher, :exit_processor_sla_seconds)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
exit_processor_sla_seconds = Application.fetch_env!(:omg_watcher, :exit_processor_sla_seconds)
exit_processor_sla_seconds = OMG.Watcher.Configuraion.exit_processor_sla_seconds()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done 96d9e31

Copy link
Contributor

@InoMurko InoMurko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few things:

  • this PR changes a API endpoint (cc @T-Dnzt @unnawut ). A property is removed and replaced with a different one, the new one has a different meaning.
  • I've seen eth_timestamp_now. I don't exactly understand what it means and how it's used. The tests use values like 5,13,... What is this? Seems like a very odd timestamp.

@souradeep-das
Copy link
Contributor Author

Hey @InoMurko! Thanks for the review 🙇
So, eth_timestamp_now is mostly storing the timestamp ( eth_height_now ) replacing eth_height_now we had previously.

Previously we had -

late_invalid_exits =
      Enum.filter(invalid_exits, fn {_, %ExitInfo{eth_height: eth_height}} ->
        eth_height + sla_margin <= eth_height_now

Since, we added the exit timestamp already, using it for the check and checking with timestamp(eth_height_now)

late_invalid_exits =
       Enum.filter(invalid_exits, fn {_, %ExitInfo{block_timestamp: block_timestamp}} ->
        block_timestamp + sla_seconds <= eth_timestamp_now

assigning the actual timestamp of the block here,
https://github.com/omgnetwork/elixir-omg/pull/1491/files#diff-f1ce1871bf4b96d60bda0fbd02e3891cR632

For the tests,
trying for the exit_timestamp to be similar (in value) to the exit_height, mostly fixing the value of the exit_timestamp for the tests to constant like we do for eth_height.
And then passing values for eth_timestamp_now as the number of seconds that is ++ from the block_timestamp (exit_timestamp)
https://github.com/omgnetwork/elixir-omg/pull/1491/files#diff-feeb2e7cdff5093e568214f358154985R46

Should eth_timestamp_now be renamed for clarity?

@souradeep-das
Copy link
Contributor Author

Do we need a successive change for the change in API endpoint? @InoMurko @unnawut @T-Dnzt

@unnawut
Copy link
Contributor

unnawut commented Jun 25, 2020

Do we need a successive change for the change in API endpoint? @InoMurko @unnawut @T-Dnzt

@T-Dnzt @achiurizo I think we should get a thumbs up from either of you for any API breaking change.

It's the watcher & watcher_info /configuration.get's exit_processor_sla_margin -> exit_processor_sla_seconds. I don't think there's a workaround e.g. keeping the old key-value, because the value would no longer applicable.

So I guess either:

  1. Scrap this proposal. Downside is we have to stick with sla_margin which is different from the contract's sla_seconds
  2. A 👍 from @T-Dnzt @achiurizo and the release manager making sure the breaking change is in the release comms.

I recommend 2 for consistency with the contracts outweighs dropping this key from /configuration.get

Copy link

@T-Dnzt T-Dnzt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
upgrade path required Need to inform infra and users of upgrade path, e.g. running an extra command or config change.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants