docs: add rfc 007 static nonprod platform environment #159

Open
wants to merge 142 commits into
base: main

Conversation

Contributor

@barkerl barkerl commented Jul 3, 2024

Description

Add an additional static non-prod platform environment to the existing deployment.

Related issue: JIRA_TICKET_NUMBER

Before submitting (or marking as "ready for review")

  • Does the pull request title follow the conventional commit specification?
  • Have you performed a self-review of the code?
  • Have you added tests that prove the fix or feature is effective and working?
  • Did you make sure to update any documentation relating to this change?

cmarstondvsa and others added 30 commits June 17, 2024 15:06
@barkerl changed the title from "docs: Add rfc 007 static nonprod platform environment" to "docs: add rfc 007 static nonprod platform environment" Jul 3, 2024

This comment was marked as off-topic.

This comment was marked as off-topic.

docs/rfc/rfc-007-static-nonprod-platform-environment.md (two outdated review threads, resolved)
Contributor

@JoshuaLicense JoshuaLicense Jul 10, 2024

While I think the theory behind this is logical, I believe there are more appropriate alternatives that solve this problem without adding what I see as complexity to the code and development process.

The dev environment is a playground for the team, and the team includes both developers and platform engineers.

Any breaking change that can't be fixed within a certain amount of time (which could vary, but we're the same team so it can be raised) should be reverted. Git and GitHub give us this functionality with a command or a click. Once the change is reverted, the environment will be working again.

Feedback:

  • The proposed environment is not typical for a continuous deployment strategy. That isn't a problem in itself, but we would need to document its intended use case.
  • Large platform changes are infrequent, so the environment would sit idle most of the time. While the hosting cost may be low, this is still an environment that needs to be maintained. An extra environment adds ~25% extra maintenance (4 -> 5 environments).
  • I believe the ephemeral environment aligns with this use case more appropriately. The plan is to add functionality to deploy branches, which wouldn't add extra maintenance and would cost even less.
  • The only platform changes this will be appropriate for are changes made solely in the vol-terraform repository, because vol-app will always apply main to this environment and use the main app version. This increases the cognitive load on a platform engineer making changes in VOL, as the test process will change depending on the nature of the change.
    • To test changes in vol-app the team would need to test in dev.
    • To test changes in vol-app and vol-terraform the team would need to test in dev.
    • To test changes in vol-terraform the team would need to test in the newly proposed environment.
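
As a rough sketch of the three cases above (not part of the RFC), the routing could be written as a small lookup. The `Env` labels, `KNOWN_REPOS` and the function name are made up for illustration; only the repository names and the dev environment come from the discussion.

```python
from enum import Enum
from typing import Iterable


class Env(Enum):
    DEV = "dev"
    # Hypothetical label for the RFC's proposed static non-prod environment.
    STATIC_NONPROD = "static-nonprod"


KNOWN_REPOS = {"vol-app", "vol-terraform"}


def environment_for_change(changed_repos: Iterable[str]) -> Env:
    """Pick the test environment for a change, per the three cases above."""
    changed = set(changed_repos)
    if not changed or not changed <= KNOWN_REPOS:
        raise ValueError(f"expected a non-empty subset of {sorted(KNOWN_REPOS)}")
    # The proposed environment always applies main of vol-app, so any
    # change that touches vol-app can only be exercised in dev.
    if "vol-app" in changed:
        return Env.DEV
    # Only vol-terraform changed.
    return Env.STATIC_NONPROD
```

Under this reading, `environment_for_change({"vol-app", "vol-terraform"})` gives `Env.DEV`, while `{"vol-terraform"}` alone gives `Env.STATIC_NONPROD`, which is the split in test process the last bullet is concerned about.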

All said, I'm not a platform engineer so if the above doesn't persuade then I'll not stand in the way.

Contributor Author

My thoughts on this are that there is no foolproof pattern for continuous delivery, and I think we have to understand what works for the service (otherwise there would be one pattern and tool set that everyone used, and that certainly isn't the case). The main platform changes going forward will be in the vol-terraform repo, as per rfc-005. To test changes to vol-terraform we will need a pipeline anyway, as that isn't covered by the existing ones.

The test cases aren't quite correct: we would test in dev only if a change was made purely in the app component. Anything else we would need to test in the new environment and then in dev/ephemeral.

Fundamental changes to the foundational elements of the platform (networks etc.), which change semi-regularly, potentially span all environments. This isn't about the effort; what matters is what these resources support. This is why the current ephemeral environment design may not cover that use case. I would rather not have a pattern of applying and potentially breaking small, well-defined changes, with the containers proving the case for a higher release cadence.

If we find that, with the rationalisation that is continuing in VOL, we are in a place where some of the concerns are no longer valid, I will be the first to support tearing it down.

Contributor

@JoshuaLicense JoshuaLicense Jul 10, 2024

> The test cases aren't quite correct: we would test in dev only if a change was made purely in the app component. Anything else we would need to test in the new environment and then in dev/ephemeral.

If you make a change in both vol-app and vol-terraform, the Terraform or app changes in vol-app must be merged and in dev first before they can take effect (be applied) in the new environment. Without merging, changes in vol-app will not be deployed anywhere.

> If we find that, with the rationalisation that is continuing in VOL, we are in a place where some of the concerns are no longer valid, I will be the first to support tearing it down.

As nothing is in place yet, this is anticipating a theoretical problem in a theoretical solution. Would a better way be to wait until we have tried the existing planned solutions (dev/ephemeral), identify a practical problem, and then iterate on the solution we have, rather than fixing the anticipated problem and then tearing the fix down once (and if) it proves not to be a problem?

Contributor Author

> As nothing is in place yet, this is anticipating a theoretical problem in a theoretical solution. Would a better way be to wait until we have tried the existing planned solutions (dev/ephemeral), identify a practical problem, and then iterate on the solution we have, rather than fixing the anticipated problem and then tearing the fix down once (and if) it proves not to be a problem?

3 participants