Address issue #122 according to Inf WG discussion #126

Open
wants to merge 5 commits into
base: master

DilipSequeira
Contributor

This also adds a definition of a Reproducible software component so that RDI does not become a venue for submissions that are not reproducible and can never be reproducible.
@github-actions

github-actions bot commented Sep 30, 2022

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@arjunsuresh
Contributor

Thank you @DilipSequeira for the change. I have a question, though: when Available software can be private (currently no public availability details are mandated), is it okay to require the RDI software binary to be downloadable from a public URL? I propose adding this condition to the Available category as well.

Also, if we enforce reproducible software and available hardware, we can enable audits for such systems, right?

@DilipSequeira
Contributor Author

Nvidia would like to see all software being downloadable from public URLs, but I don't think it's realistic to impose that on less forthcoming submitters. Changed it to "by anyone to whom the hardware is Available."

@arjunsuresh
Contributor

Thank you Dilip for the change. Maybe Nvidia can propose "Available to public (retail)" (both hardware and software) as a subcategory of "Available", as it is clearly different from "available to OEMs".

@DilipSequeira
Contributor Author

I agree it would be good to refine the notion of Availability, but not in this PR. Maybe you could start an issue/PR for that, as a basis for working group discussion?

@arjunsuresh
Contributor

Sure Dilip. I can do that.

@tjablin
Collaborator

tjablin commented Oct 4, 2022

WG: Clarify rules around binary components. Otherwise, looks good.

Clarify that only components which substantially determine performance need be reproducible, and that the whole stack should be accompanied by usable instructions.
arjunsuresh previously approved these changes Oct 4, 2022
Contributor

@arjunsuresh arjunsuresh left a comment

Looks good to me.

@rnaidu02
Contributor

@johntran-nv @erichan1 Can you review this in Training WG?

Contributor

@johntran-nv johntran-nv left a comment

Training WG reviewed today, but there were some concerns. The simplest concern is from Google: for an RDI submission on Available hardware, they don't want to have to prepare and package the complex internal SW stack that they used. In their view, this goes against the intent of the RDI division and would put a large burden on RDI SW submissions.

Alternatively, Intel proposed to allow RDI SW submissions to go straight to Available in the next round - that would mitigate the missing SW within 6 months, but doesn't quite solve the short term Reproducible aspect that Dilip is going for here.

@arjunsuresh
Contributor

@DilipSequeira Can we do a change like below?

RDI systems using only Available hardware may use a Reproducible software stack, accompanied by instructions which would allow a reasonable expert user to download and install it on a hardware system similar to the submission system.

RDI systems may not be submitted as Available components until the submission cycle after next or 221 days, whichever is longer. This restriction is not applicable to RDI systems using Available hardware and a Reproducible software stack.
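
As a rough illustration of the proposed wording (not part of the PR text), the waiting period and its exemption could be sketched as below. The function name, its parameters, and the idea of passing in the date of the "cycle after next" are assumptions made for this example; only the 221-day figure and the exemption condition come from the proposal above.

```python
from datetime import date, timedelta

# Waiting period taken from the proposed rule text.
RDI_WAIT_DAYS = 221


def earliest_available_date(
    rdi_submission: date,
    cycle_after_next: date,
    hardware_available: bool,
    software_reproducible: bool,
) -> date:
    """Earliest date an RDI system may be resubmitted as Available.

    `cycle_after_next` is a hypothetical input: the submission date of the
    cycle after the next one, supplied by the caller.
    """
    # Exemption: Available hardware plus a Reproducible software stack
    # means the waiting period is not imposed at all.
    if hardware_available and software_reproducible:
        return rdi_submission
    # Otherwise wait for whichever is later: the cycle after next,
    # or 221 days after the RDI submission.
    return max(cycle_after_next, rdi_submission + timedelta(days=RDI_WAIT_DAYS))
```

For example, with an RDI submission on 1 February 2023 and a cycle after next on 1 August 2023, a non-exempt system would have to wait until 10 September 2023, because the 221-day bound is the later of the two.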

@DilipSequeira
Contributor Author

@arjunsuresh excellent suggestion. I've updated the PR accordingly.

@johntran-nv can Training please re-review?

@arjunsuresh
Contributor

Thank you @DilipSequeira

@psyhtest

A binary component is Reproducible if it is downloadable by anyone to whom the hardware is Available,
from a URL which must remain valid until a release of the software is Available.

I think we need to define what is meant by "a release" here. I imagine a vendor might want to include some optimisations into an engineering build of their driver in the run-up to a submission deadline. Expecting that same engineering build to eventually become Available would be unreasonable - as it may not have gone through a rigorous testing process. Rather, a proper release incorporating the same changes should serve as well.

Let's take a concrete example. Suppose that a submission deadline happens to be on the 1st of February, and the announcement is planned for the 28th of February. Suppose a submitter uses an engineering build u.v.w to obtain RDI results by the deadline. Suppose they make a proper release u.v on the 27th of February. Questions:

  • Does u.v.w need to be available at a URL disclosed to the Review Committee from the 1st to the 27th? (I assume, yes.)
  • Can the URL be invalidated when u.v is released? (I assume, yes.)
  • Would the result tables need to be updated to refer to u.v instead of u.v.w? (I assume, maybe.)
  • Most importantly, should the submitter guarantee that the performance/accuracy with u.v is the same as with u.v.w? (I assume, yes, but how?)

@DilipSequeira
Contributor Author

DilipSequeira commented Nov 29, 2022

Fair point. What's meant here is "an Available release containing the optimizations", which for practical purposes means an Available release whose performance on the benchmark is at least as good as the submitted results.

So I would say the answers to your questions are:

  1. Yes, the URL must be disclosed.
  2. Yes, the URL can be invalidated on release of the software to customers.
  3. Ideally yes, because otherwise the submission becomes unreproducible once the URL is invalidated. I'm not sure we have a mechanism for this today.
  4. Not quite - performance and accuracy must be at least as good as the submitted results, rather than the same.

I don't think it's any part of our goal here to remove the due diligence burden on submitters to ensure that the submission is reproducible in regard to performance and accuracy - just to give submitters the option to submit bleeding edge optimizations somehow. The path that minimizes overall engineering work is to plan such that your software is at least in beta on submission day.

Edit - I'm inclined to suggest that the URL must stay up until the RDI submission is replaced with an Available submission in a subsequent round, with equal or better performance (as that phrase is used in the Preview rules.)
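
A minimal sketch of the retention condition suggested in the edit above, assuming each result can be reduced to a category, a single higher-is-better performance number, and an accuracy number. The `Result` type, its fields, and `url_may_be_retired` are hypothetical names for illustration; the actual comparison would follow whatever "equal or better performance" means in the Preview rules.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Result:
    category: str       # e.g. "RDI" or "Available" (illustrative labels)
    performance: float  # assumed higher-is-better metric
    accuracy: float     # assumed higher-is-better metric


def url_may_be_retired(rdi: Result, later_results: Iterable[Result]) -> bool:
    """True once some later Available result matches or beats the RDI result."""
    return any(
        r.category == "Available"
        and r.performance >= rdi.performance
        and r.accuracy >= rdi.accuracy
        for r in later_results
    )
```

Under this sketch the download URL stays up as long as no later Available result is at least as good on both numbers.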

@arjunsuresh
Contributor

From a submitter point of view I think the proposal can be summarized as follows:

Suppose February 28 is the submission deadline and I have a new staging software build on February 21 that I would like to use for submission (provided all runs go as expected). In this case I have the following options.

  1. Try to make a beta release out of the staging build within a week (before February 28). In this case I can submit my results in the Available category.
  2. Suppose my current build is not backward compatible or is very specific to a particular model. Then I have two options:
    1. Make the software binary available to download and submit under the RDI category. In this case, the 221-day rule does not apply.
    2. Do not make the software binary available and submit under the RDI category. In this case, the 221-day rule applies.

The only issue I see here is for the last case. I'm not sure how MLCommons can ensure that the 221 days rule is followed as the software is completely closed (though hardware is available).
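
The options summarized above could be written down as a small decision function. This is only a sketch of the summary in this comment, not rule text: the function name, the boolean inputs, and the `Outcome` type are assumptions, and the hardware-unavailable branch simply falls through to a plain RDI submission with the waiting period.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    category: str           # "Available" or "RDI"
    wait_rule_applies: bool  # whether the 221-day rule applies


def classify_submission(
    hardware_available: bool,
    software_released: bool,
    binary_downloadable: bool,
) -> Outcome:
    # Option 1: released (e.g. beta) software on Available hardware.
    if hardware_available and software_released:
        return Outcome("Available", False)
    # Option 2.1: unreleased software, but the binary can be downloaded.
    if hardware_available and binary_downloadable:
        return Outcome("RDI", False)
    # Option 2.2 (and unavailable hardware): closed software, plain RDI.
    return Outcome("RDI", True)
```

Calling `classify_submission(True, False, False)` gives the last case, which is the one that is hard to police because nothing about the software is visible from outside.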

@erichan1
Contributor

erichan1 commented Dec 1, 2022

Training WG talked about it today. We generally think this is ok. Two questions.

  1. How would the reproducible tag actually show up in the table? Just "RDI-reproducible" vs. "RDI"?
  2. What does @TheKanter think? Since you think we shouldn't complicate the categories any more for results viewers.

@DilipSequeira
Contributor Author

Does this need to show up in the results table? It could just be a field in the JSON.
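
To make the "field in the JSON" idea concrete, here is a hedged sketch. Every key and value below (`status`, `software_reproducible`, `software_download_url`, the example names and URL) is a placeholder invented for illustration, not a field defined by the submission rules or by this PR.

```python
import json

# Hypothetical system description; all keys and values are illustrative only.
system_description = {
    "submitter": "ExampleOrg",
    "system_name": "example-system",
    "status": "rdi",
    "software_reproducible": True,
    "software_download_url": "https://example.com/builds/u.v.w",
}


def is_reproducible_rdi(desc: dict) -> bool:
    """True if the description marks an RDI system with a Reproducible stack."""
    return desc.get("status") == "rdi" and bool(desc.get("software_reproducible", False))


# Round-trip through JSON text, as a checker reading the file would.
loaded = json.loads(json.dumps(system_description))
reproducible = is_reproducible_rdi(loaded)
```

A compliance check could read such a flag without the results table needing a new column.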

@TheKanter
Contributor

TheKanter commented Dec 1, 2022

RDI is already explicitly understood, and was designed, to support results that are not reproducible (e.g., enabling internal prototypes). That was its original intent, e.g., people submitting FPGA prototypes of hardware or internal-only products (Tesla systems were mentioned).

I am strongly opposed to any additional fragmentation of results. The results presentation is already problematic as it is. Results need to be simpler, not more complicated. This will make things worse.

@DilipSequeira
Contributor Author

I suggest we regard having a reproducible stack in RDI as a compliance issue (in the sense that you need to have complied if you're going to submit in Available the next round), and we don't typically capture those in the results table.

@arjunsuresh
Contributor

@TheKanter @erichan1 If all the concerns are addressed, can we please merge this PR? As the rule currently stands, it is difficult to make any submission on Available hardware with unreleased software, and we are having issues with it in TinyML submissions too.

@TheKanter
Contributor

TheKanter commented May 31, 2023 via email

@arjunsuresh
Contributor

@TheKanter Exactly. So the proposal is to submit such results under the RDI category. Since the hardware is available, such submissions are exempt from the 221-day rule (if the hardware is unavailable, an RDI submission cannot be submitted under the Available category for 221 days).

@nv-ananjappa

@TheKanter This has been discussed in both Training and Inference WGs. Do you have any further open questions?

@mrasquinha-g
Contributor

This PR was discussed in the recent chairs sync and prompted the need to review Preview and RDI rules. DavidT will be starting a proposal based on the discussion.

10 participants