[MirrorOrch]: Mirror Session Retention across Warm Reboot #1054

Merged
merged 2 commits into from
Sep 20, 2019

@stcheng stcheng commented Sep 11, 2019

After a warm reboot, the monitor port of a mirror session is
expected to be retained: the monitor port must not change among
the ECMP group members or the LAG members. This follows from the
generic nature of the sairedis comparison logic and from the goal
of minimizing SAI function calls during reconciliation.

Changes:

  1. Add bake() and postBake() functions to MirrorOrch.
    bake() retrieves the state database information and gets
    the VLAN and monitor port information.
    postBake() uses that information to recover the active
    mirror sessions exactly as they were before the warm reboot.
  2. Change the state database format.
    Instead of storing the object ID of the monitor port, store
    the alias of the monitor port.
    Instead of storing true/false for the VLAN header, store the
    VLAN ID.
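
A rough sketch of what the new state database entry could look like, and how a bake()-style step might read it back. This is only an illustration: the field names `monitor_port` and `vlan_id`, the `RecoveredSession` struct, and both helper functions are hypothetical and are not taken from the MirrorOrch code. A plain `std::map` stands in for the database table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <map>
#include <string>

// Hypothetical field names; the real state database schema may differ.
const std::string FIELD_MONITOR_PORT = "monitor_port";
const std::string FIELD_VLAN_ID = "vlan_id";

using FieldValues = std::map<std::string, std::string>;

// What a bake()-style step would need to remember per mirror session.
struct RecoveredSession
{
    std::string monitorPortAlias; // port alias, e.g. "Ethernet4", not an object ID
    uint16_t vlanId = 0;          // VLAN ID; 0 is used here to mean "no VLAN header"
};

// Serialize a session into field/value pairs (alias + VLAN ID).
FieldValues toStateEntry(const RecoveredSession &session)
{
    assert(session.vlanId <= 4094); // valid 802.1Q VLAN IDs are 1..4094
    return {
        {FIELD_MONITOR_PORT, session.monitorPortAlias},
        {FIELD_VLAN_ID, std::to_string(session.vlanId)},
    };
}

// Parse an entry back; returns false if a field is missing or malformed.
bool fromStateEntry(const FieldValues &fvs, RecoveredSession &out)
{
    auto port = fvs.find(FIELD_MONITOR_PORT);
    auto vlan = fvs.find(FIELD_VLAN_ID);
    if (port == fvs.end() || vlan == fvs.end())
    {
        return false;
    }

    try
    {
        std::size_t pos = 0;
        unsigned long id = std::stoul(vlan->second, &pos);
        if (pos != vlan->second.size() || id > 4094)
        {
            return false;
        }
        out.monitorPortAlias = port->second;
        out.vlanId = static_cast<uint16_t>(id);
    }
    catch (const std::exception &)
    {
        return false; // not a number, or out of range
    }
    return true;
}
```

Storing the VLAN ID rather than a true/false flag means the recovered entry carries enough information to rebuild the VLAN header on its own, which the flag alone could not do.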

Signed-off-by: Shu0T1an ChenG [email protected]

session.nexthopInfo.nexthop = neighbor_entry.ip_address;
}
// If the port belongs to a LAG
if (port.m_lag_id)

A LAG port can also belong to a VLAN. In that case, you will set the next hop twice: once in the VLAN branch and once in the LAG branch.

Currently, MirrorOrch as a whole does not support the LAG-in-VLAN scenario. Supporting it will require a separate pull request.

return Orch::bake();
}

bool MirrorOrch::postBake()

@lguohan lguohan Sep 13, 2019

The current algorithm has a flaw. For example, suppose the previous orchagent had a bug that calculated a wrong session result (a wrong port). With this algorithm, the warm reboot reproduces the same wrong port, and if the route is not updated later, the wrong port is never corrected.

The correct algorithm needs to check whether the recorded result is among the newly calculated results. If it is, use it; if it is not, the warm reboot needs to correct the error.
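
The check described in this comment can be sketched roughly as follows. This is a minimal illustration, not code from the PR: `chooseMonitorPort` is a hypothetical helper, ports are represented by their aliases, and falling back to the first newly calculated candidate is an assumption about how the error would be corrected.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of MirrorOrch.
// recorded:   monitor port alias saved before the warm reboot
// candidates: monitor ports that are valid according to the newly
//             calculated results (e.g. ECMP or LAG members)
// Returns the port to program, or an empty string if there is no candidate.
std::string chooseMonitorPort(const std::string &recorded,
                              const std::vector<std::string> &candidates)
{
    if (candidates.empty())
    {
        return ""; // nothing is resolvable yet; the session stays inactive
    }

    std::string chosen;
    if (std::find(candidates.begin(), candidates.end(), recorded) != candidates.end())
    {
        // The old result is still valid: keep it so that no SAI call is
        // needed during reconciliation.
        chosen = recorded;
    }
    else
    {
        // The old result is not in the new results: correct it.
        // Picking the first candidate is an assumption of this sketch.
        chosen = candidates.front();
    }

    // Sanity check: whatever is returned is one of the valid candidates.
    assert(std::find(candidates.begin(), candidates.end(), chosen) != candidates.end());
    return chosen;
}
```

With a recorded port of `Ethernet4` and candidates `Ethernet0, Ethernet4`, the recorded port is kept. With a recorded port of `Ethernet8` and the same candidates, the stale value is replaced by `Ethernet0`.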

I found that the current approach makes the above result difficult to achieve.

The best approach is to move mirror orch after route orch in the doTask loop.

Then, in the mirror orch doTask loop, calculate the new mirror session results and check whether the old result is contained in the new results; if so, use the old result. This should also result in simpler code, since postBake() introduces new logic that can be avoided.

That leads back to the original issue we are facing: if more than one mirror session requires a destination update, the comparison logic cannot handle the case, and the processes will crash. If wrong session results are expected to be fixed during the warm reboot, that logic should be introduced after the comparison logic.

stcheng commented Sep 18, 2019

@lguohan please check the update, thanks!

@lguohan lguohan left a comment

:shipit:

lguohan commented Sep 19, 2019

I think we need a VS test for this feature. Can you add one later?

stcheng commented Sep 19, 2019

Sure; I'll send a separate pull request to enhance the test.

stcheng commented Sep 19, 2019

retest this please

With this update, potential orchagent issues from before the warm
reboot, where the monitor port was wrongly calculated, can be fixed.

Signed-off-by: Shu0T1an ChenG <[email protected]>

stcheng commented Sep 20, 2019

retest this please

@stcheng stcheng merged commit d823dd1 into sonic-net:master Sep 20, 2019
@stcheng stcheng deleted the mirror_retention branch September 20, 2019 23:31
EdenGri pushed a commit to EdenGri/sonic-swss that referenced this pull request Feb 28, 2022
Changes:
-- Display ipv4 address with left adjust of 25 width and with space before UDP.
-- IPv6 address will be displayed as is.
-- Add test data to appl_db.json and config_db.json
-- add test file sflow_test.py
-- use pass_db decorator for sflow_interface.
-- since sflow needs ctx, create use Db() in function.

Signed-off-by: Praveen Chaudhary [email protected]