Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Versioning and Security related changes #78

Closed
majormoses opened this issue Nov 20, 2017 · 12 comments · Fixed by #87
Closed

[RFC] Versioning and Security related changes #78

majormoses opened this issue Nov 20, 2017 · 12 comments · Fixed by #87

Comments

@majormoses
Copy link
Member

majormoses commented Nov 20, 2017

We follow http://semver.org/spec/v2.0.0.html guidelines very strictly and feel like that covers 99.9999% of changes. What I feel like it does not address are security fixes. Security is an interesting one since you want the user to get the fix out as soon as possible. In the face of Pessimistic version constraints we want to inject it into the update cycle as soon as possible. I have a few thoughts and am not yet sure where I stand.

Initial Thoughts that security incidents should either be a major when a breaking change or patch when backwards compatible. The rationale is that we never want to break a production environment no matter what as we do not know how critical these systems being monitored are (anything from your kegerator to life saving devices in a hospital). Any time it is non breaking we want to inject it as early as possible to make sure that if someone versions on ~> x.y.z which is the most strict other than exact version pinning and says anything that does not add new features but fixes stuff should be pulled in. In my opinion security changes are always a bug fix unless it is adding a new security feature.

One argument I find myself making is that even breaking changes should possibly be considered on a case by case basis if a breaking change has very low impact. My argument comes from the fact that we might be exposing you to greater risk by not forcing you to update. I am looking at you Windows XP! I am arguing that it should be evaluated for:

  • Actual impact
  • Security risk
  • The cases that break. If say it is removing ruby 2.0 from the supported versions which never shipped as an embedded ruby version then we would still call it out as a breaking change but version it as a patch. This is just one example but the point being that the maintainers (should this require a code owner sign off?) should evaluate and make the best decision in an impossible scenario to make everyone happy.

The counter argument is that if you break even .0001% of the semver spec then you can no longer trust it.

I am very conflicted on this and am looking for others thoughts on this matter.

@mattyjones
Copy link
Member

@mbbroberg What are you thinking?

@majormoses majormoses self-assigned this Nov 21, 2017
@mbbroberg
Copy link
Contributor

@mattyjones I don't have a strong opinion just yet. Tagged you for your perspective - I trust your take on security response.

@jaredledvina
Copy link
Member

Initial Thoughts that security incidents should either be a major when a breaking change or patch when backwards compatible.

I agree with this completely. Even if the change is security related, I'm more hesitant to deploy the update if it's going to break anything.

One argument I find myself making is that even breaking changes should possibly be considered on a case by case basis if a breaking change has very low impact. My argument comes from the fact that we might be exposing you to greater risk by not forcing you to update.

At the end of the day though, we aren't actually forcing anyone to upgrade. From what I've seen many production environment's are going to pin the version of the sensu plugins they have installed such that they have to explicitly bump the version to get the fix.

The cases that break. If say it is removing ruby 2.0 from the supported versions which never shipped as an embedded ruby version then we would still call it out as a breaking change but version it as a patch.

I'd say that we should follow SemVer's point #7 here:

Minor version Y (x.Y.z | x > 0) MUST be incremented if new, backwards compatible functionality is introduced to the public API. It MUST be incremented if any public API functionality is marked as deprecated.

My logic being, that we are deprecating the support for Ruby 2.0 but, not immediately breaking it. I could see the argument that because Ruby 2.0 wasn't bundled, it's not really a 'public API'. So..... yeah, I see how this is tricky. Thinking this through some more, folks also aren't required to use the embedded Ruby either. I think that because it's possible people are using the plugins with Ruby 2.0, a minor version bump is the way to go here.

Personally, I think updating https://github.com/sensu-plugins/community/blob/master/HOW_WE_CHANGELOG.md to include some notice for security fixes explicitly would be nice ( like what was done in https://github.com/sensu-plugins/sensu-plugins-ansible/pull/6/files#diff-4ac32a78649ca5bdd8e0ba38b7006a1e ). As a user, I prioritize updating things that have security fixes, so having more visibility there I think helps. I'd say that we continue sticking with SemVer as strictly as we can and evaluate all security changes against it while making it explicitly clear on releases that contain security fixes.

Any-who, that's my take of it, curious how that jives with everyone's perspective.

@mbbroberg
Copy link
Contributor

Love the idea of addressing this in HOW_WE_CHANGELOG.md based on the discussion. Good target for standardizing the process and we can roll out from there!

@majormoses
Copy link
Member Author

Ya the goal of this RFC was to either create a new document on versioning or update a section in our changelog guidelines.

At the end of the day though, we aren't actually forcing anyone to upgrade. From what I've seen many production environment's are going to pin the version of the sensu plugins they have installed such that they have to explicitly bump the version to get the fix.

Yup I preach this constantly yet a fair number of people tend to not follow it. I pin on exact versions which makes it really impossible to pull in new versions until I say so. The only way this will break is either updating a dependency, environment, or yanking the gem from rubygems.org. We periodically pull in new changes after looking at each plugins changelog to understand the impact bringing it in.

The cases that break. If say it is removing ruby 2.0 from the supported versions which never shipped as an embedded ruby version then we would still call it out as a breaking change but version it as a patch.

Minor version Y (x.Y.z | x > 0) MUST be incremented if new, backwards compatible functionality is introduced to the public API. It MUST be incremented if any public API functionality is marked as deprecated.

Well a security update (not adding a new security feature but fixing existing features that are security related) would never be a minor as there is no new feature added. They would either be major in the case of breaking and patch otherwise. We do not always create a deprecation notice in a minor and then update it again for the breaking change, in an ideal world we would but we typically only do this for libraries not individual check/metric scripts or handlers. I think that would be especially true with a security update as if it is backwards compatible we want to allow it to be pulled in with pessimistic version constraints on the patch version. When in doubt if a change is breaking I have always favored major version bumps and updating the changelog to reflect potentially breaking changes.

My logic being, that we are deprecating the support for Ruby 2.0 but, not immediately breaking it.

well by removing it from the gemspec we are actually breaking support for it immediately (as soon as it is released) as at install time it will fail if the gemspecs validation with the current ruby enviornment.

I could see the argument that because Ruby 2.0 wasn't bundled, it's not really a 'public API'. So..... yeah, I see how this is tricky. Thinking this through some more, folks also aren't required to use the embedded Ruby either. I think that because it's possible people are using the plugins with Ruby 2.0

Ya the line of public api here are a bit fuzzy, I think that given the fact that it is not recommended to use ruby versions outside of the the embedded ruby is the indicator that these versions are officially

supported versions as the public api until they are considered EOL per our policy outlined here. Thinking futurewise there will eventually not be any bundled ruby in the next generation of sensu so I am kinda torn if this demarcation will still make sense but I'd hate to make us unable to come up with a process/policy now because of a bunch of unknowns.

https://github.com/sensu-plugins/documentation/blob/master/docs/FAQ.md#what-is-the-policy-on-supporting-end-of-lifeeol-ruby-versions

Our guidelines are an expansion on the guidelines from keep a changelog which includes a ### Security header which is why it is not called out explicitly. Should we add to our guidelines all the types of change? The original goal was a very think wrapper over the guidelines they have tweaked to our communities needs but maybe we should flesh it out more.

I think unless anyone else feels strongly about it I think we will create somewhere a document on this detailing the stance regarding versioning on these issues. Here is a high level overview of what I think we need to document:

  • security fixes follow the principle of major or patch like any change. If it breaks even 0.0001% then it needs to be versioned as a major if it is a version of sensu that is bundled with sensu (will likely change in the future). If it is non breaking it is a patch.
  • potentially update the how we changelog document to have all of the headers we use not just the ones that we added to the keep a changelog guidelines.

@huynt1979
Copy link

@majormoses This sounds good to me as it's still sticking with SemVer principles. 👍 for adding more documentation. We've seen how lacking it hurts...

@majormoses
Copy link
Member Author

After thinking about this again and in light of the OSX vulnerability and their failure to patch it right again. I have decided that we need to still have an out clause and process for approval. I have a PR that I am about to submit that addresses this wholistically but decided I should include the exception clause here. Any issues should be raised against the PR and not the issue at this point.

Process for exception

With all rules there will always need to be an exception. This talks about the types of exceptions and what it takes to get approval to knowingly violate semver for the customers greater good. If they want to bypass this then specify exact versions and avoid any kind of pessimistic version constraints and host your own rubygem repository/server.

In the face of something VERY SERIOUS such as credential leakage, privilege escalation or backdoors such as this. We should make such considerations with a heavy heart and as infrequently as possible. This require buy in from at least 2 maintainers one of them must be an org wide maintainer. In this case you would version as a patch but include the ### Breaking Change section explaining what use cases might be broken. Along these lines yanking gem versions in the name of security need the same kind of approval and scrutiny.

@mattyjones
Copy link
Member

mattyjones commented Dec 5, 2017

I am along the lines of nearly everything @majormoses is offering.

TL;DR

Well a security update (not adding a new security feature but fixing existing features that are security related) would never be a minor as there is no new feature added.

For the most part, as done from the outset, if we document it, then it is up to the user how they want to handle it. I am a strong proponent of enforcing semver to the minutia. I do agree with the exception clause in the above pr though. I would think, I lost count who is atm, any org-wide maintainer would have enough vetting and experience to understand what is at stake.

For some history, when I started actively working on this, I would ping Sean on anything that could affect the core sensu-plugin gem, or anything org-wide for a second opinion. When he and I first talked about breaking things up, that was the common ground we agreed on. He trusted me, for some unknown reason ;), and I documented everything and worked as openly as possible. The community is certainly big enough now where I think any core?? member should be given the power and responsibility to auth semver exceptions in the cases outlined.

@majormoses
Copy link
Member Author

majormoses commented Dec 5, 2017

I am a strong proponent of enforcing semver to the minutia

Yup I completely agree, most of the time when people complain about semver not working it is because the developers/maintainers do not use it properly. The only reason I brought it up is I do think there are extreme cases where it should not be followed to best serve the users in a horrifying situation. I would like to clearly acknowledge this as an unfortunate reality of software and outline a process to help guide us and prevent poorly made knee-jerk reactions.

I would think, I lost count who is atm, any org-wide maintainer would have enough vetting and experience to understand what is at stake.

Is this still a concern now that we have and maintain this? https://github.com/sensu-plugins/community#maintained-areas I realize that the second part of the statement is the more important one to consider. I think that if we are in such as situation another set of eyes on something with someone who has both breadth and depth should be consulted before such an exception is made.

The rationale: An org wide maintainer will definitely have the breadth to asses how it impacts other projects. When they do not understand they know it and have the maturity to loop in others with a better grasp of the situation. In the case of a maintainer being to one to propose changes they can count for one towards the minimum requirements. Ideally another maintainer should be reviewing changes and in certain cases where we have identified SME's that are not maintainers we could count them as a non org wide maintainer vote. I think this gives us a balance of ability to move relatively quickly while still putting in the necessary checks and balance.

@mattyjones
Copy link
Member

I was unaware of that page so kudo's. Again I have been away for a while and am very slowly starting to find time to dip my toes back in where I know I can.

With that said I am good with everything.

@majormoses
Copy link
Member Author

@mattyjones awesome can you review the PR?

@majormoses
Copy link
Member Author

Thanks everyone for giving me your thoughts! Any future changes will be addressed through similar RFC's, issues, and PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants