Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add proposal for multi-DUT and multi-ASIC testing support #2347

Closed
wants to merge 3 commits into from
Closed

Add proposal for multi-DUT and multi-ASIC testing support #2347

wants to merge 3 commits into from

Conversation

wangxin
Copy link
Collaborator

@wangxin wangxin commented Oct 15, 2020

Description of PR

Summary:
Fixes # (issue)

This is a documentation proposing changes to test infrastructure for multi-DUT and multi-ASIC testing support.

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Approach

What is the motivation for this PR?

SONiC is evolving to support multiple DUT and multiple ASIC systems. The test infrastructure also needs to be updated to support that new architecture.

How did you do it?

Added a document proposing changes to test infrastructure. This document is mainly based on work from Arvindsrinivasan Lakshminarasimhan and Sandeep Malhotra.

How did you verify/test it?

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@wangxin wangxin requested review from arlakshm, lguohan and a team October 15, 2020 09:54
@wangxin
Copy link
Collaborator Author

wangxin commented Oct 15, 2020

@sanmalho-git Could you please help review as well?

@lguohan
Copy link
Contributor

lguohan commented Oct 15, 2020

@wangxin, sonic-mgmt pr not running now?

@sanmalho-git
Copy link
Contributor

I am assuming that the 'duthost' fixture going to be updated to return 'MultiAsicSonicHost'. If so, can we add the changed code for that fixture.

Assuming 'duthost' is a 'MultiAsicSonicHost', then with the approach above, we are giving the flexibility to enhance a test for multi-asic by using the 'duthost' fixture. If the test needs to be backward compatible to both single-asic DUT and multi-asic DUT, the call in the test would be

duthost.foo()

Now for single-asic DUT the return is a dictionary. But for multi-asic DUT, this is a list of dictionaries. We now have to deal with the two different data structures depending on whether we are multi-asic or single asic DUT.

@daall
Copy link
Contributor

daall commented Oct 15, 2020

retest this please

@wangxin
Copy link
Collaborator Author

wangxin commented Oct 16, 2020

Yes, the duthost fixture will automatically be a MultiAsicSonicHost instance after the duthosts fixture is update to a list of MultiAsicSonicHost. It's because the duthost fixture is implemented as returning an item from the duthosts list according to dut_index in pytest request.session object. If dut_index is not available, always return the first item from duthosts as duthost.

The new duthost fixture can be used for both single and multi-ASIC DUT. For multi-ASIC DUT, for example, if we call

duthost.bgp_facts()

Without asic_index argument, the call will be delegated to SonicHost. Eventually, self.sonichost.bgp_facts() will be called. self.sonichost is an attribute of MultiAsicSonicHost. It refers to instance of the SonicHost class. The returned result will be just calling bgp_facts module without instance_id argument. It will be a dictionary same as calling bgp_facts without instance_id on single-ASIC DUT.

If we call

duthost.bgp_facts(asic_index=1)

With asic_index=1, method bgp_facts of SonicAsic class will be called for object self.asics[1]. self.asics is an attribute of MultiAsicSonicHost. It's a list of SonicAsic instances. The method bgp_facts of SonicAsic will eventually call self.sonichost.bgp_facts(instance_id=1). Here the original asic_index=1 is translated to instance_id=1 in SonicAsic.bgp_facts method. This self.sonichost is an attribute of SonicAsic. It is a reference to instance of the SonicHost class. The returned result will be calling bgp_facts module with instance_id=1 argument. It will be a dictionary same as calling bgp_facts with instance_id=1 on single-ASIC DUT.

If we call

duthost.bgp_facts(asic_index="all")

With asic_index="all", the bgp_facts method of object in self.asics list will be called. The returned result will be a list.

So, the key design here is that if asic_index argument is "all", then call the module for all ASICs and return the result in a list.
Without asic_index, the module is just called for the host (or global/default namespace).

@sanmalho-git
Copy link
Contributor

Let's assume we have this call in the test case to ansible module 'foo' that is multi-asic aware.

duthost.foo(asic_index='all')

The key is that the same call should work for a single-asic, and multi-asic duthost.

If duthost represents a single asic pizza box, then this will eventually hit MultiAsicSonicHost._run_on_asics. With this, we will hit SonicAsic.foo (which has self.asic_index=0) with asic_index='all'. So, asic_index=0 is passed into 'foo' ansible module. At this time, the ansible module doesn't have the knowledge whether this asic_index 0 is to represent the 'global' namespace, or actually 'asic0'.

When prototyping this for bgp_facts ansible module, I see it trying to get bgp_facts on 'bgp0' instance - which doesn't exist for single asic pizza box, and thus we fail.

One way to solve this it to add a check for num_asic in the asic_index='all' case in MultiAsicSonicHost._run_on_asics method. If num_asic == 1, then call the module on self.sonichost, else call on SonicAsic:

    def _run_on_asics(self, *module_args, **complex_args):
        if "asic_index" not in complex_args:
            # Default ASIC/namespace
            return getattr(self.sonichost, self.attr)(*module_args, **complex_args)
        else:
            asic_index = complex_args.pop("asic_index")
            if type(asic_index) == int:
                # Specific ASIC/namespace
                return getattr(self.asics[asic_index], self.attr)(*module_args, **complex_args)
            elif type(asic_index) == str and asic_index.lower() == "all":
                # All ASICs/namespaces
                if self.sonichost.facts['num_asic'] == 1:
                    return [getattr(self.sonichost, self.attr)(*module_args, **complex_args)]
                return [getattr(asic, self.attr)(*module_args, **complex_args) for asic in self.asics]
            else:
                raise ValueError("Argument 'asic_index' must be an int or string 'all'.")

@wangxin
Copy link
Collaborator Author

wangxin commented Oct 19, 2020

@sanmalho-git Good catch! Thank you! I'll update the document.

* all ASICs
* specific ASIC
* combinations of the above scenarios

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be great if you have have some diagram to show the relation between all classes you plan to add to represent the multi-asic, multi-dut.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll update.

@@ -0,0 +1,499 @@
# Improve the test infrastructure to support multi-DUT and multi-ASIC systems
Copy link
Contributor

@lguohan lguohan Oct 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems only addressing pytest infrastructure. I think somewhere we need to add a doc to describe the sonic testbed infrastructure support for that. for example, ansible deployment script, topology file.

How do we represent the multi-dut topology? minigraph generation? maybe this is not covered by this doc. do we have doc to cover that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we'll need more documents to cover these areas. I will be working on them.

duthosts.supervisor_nodes[0].bar()
duthosts.supervisor_nodes["supervisor0"].bar()
```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For each scenario should there be a list suggested APIs to use? If everyone uses same API it's easier to understand the code. Also becomes easier to refactor if there's such a need later.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! This document proposed various ways of using APIs. For some scenarios, multiple APIs can do the same thing. It is a little bit hard to finalize such a list for each scenario before we have some real code. But sure, we'll try to improve. It would be easier if there is always single best way to do something.

```

In the above examples:
* `duthost.asics` should be a list of objects. Each object represents an ASIC in the DUT. `duthost.asics` represents all the ASIC instances in the DUT.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some tests are applicable only for the frontend asics, and there could be some test cases that apply only for backend. We should also have:

duthost.frontend_asics
duthost.foo(asic_index="frontend_asics")

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can enhance it this way.

[node.foo() for node in duthosts.nodes]
```
For multi-ASIC related operations on DUT host, the function or ansible module call should accept optional `asic_index` argument.
* No `asic_index`, perform operation just on the host (or default namespace). For example, just run command `"show ip bgp summary"`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit misleading. It really depends on the CLIs. In this example "show ip bgp summary" CLI walks through all the ASIC namespaces and does not return the 'bgp summary' from default namespace.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll update it to make it more clear.

@yxieca
Copy link
Collaborator

yxieca commented Sep 9, 2021

@sanmalho-git @wangxin @arlakshm what do we need to do to get this PR either merged or closed?

@wangxin
Copy link
Collaborator Author

wangxin commented Dec 1, 2021

Some documentations for multi-ASIC and multi-DUT has been added. I'll integrate some contents of this PR to the existing documents. Closing this PR for now.

@wangxin wangxin closed this Dec 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants