Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Add Etag support for file.managed web sources #61391

Merged
merged 9 commits into from
Jan 10, 2022

Conversation

nicholasmhughes
Copy link
Collaborator

@nicholasmhughes nicholasmhughes commented Dec 23, 2021

What does this PR do?

This PR introduces the ability to use the Etag response header to determine if a file hosted by a remote web server has changed since the last time it was retrieved.

What issues does this PR fix or reference?

Fixes: #61270

Previous Behavior

Near as I can tell, remote web sources (http/https) in file.managed are currently handled one of two ways:

  1. The source_hash is provided for a source file. If the hash changes, then the file is downloaded again. If the hash is the same and the file is already cached, then the cached file is used for comparison.
  2. The skip_verify flag is set, either because the source location doesn't host a hash file, it's not in the "correct" format to be read by Salt, or statically setting the source hash in the SLS file isn't pragmatic. When this flag is set, the remote file is only downloaded if it's not in the cache. As long as it's in the cache, the remote source is not checked again. Removing the cached file is the only way to trigger retrieval of an updated remote file.

New Behavior

This PR introduces a third option for handling remote web sources:

  1. With the use_etag option, a quick check with the remote web server is always performed. The last Etag response header value is stored with the cached file, and is provided as the value of the If-None-Match request header with every "check-in". If the stored Etag value matches the value calculated by the web server, the remote file is not retrieved. A 304 HTTP status code (Not Modified) is sent back to the minion instead. This allows for a "middle ground" where new file versions can be retrieved as soon as they're available on the web server without needing to provide a hash for comparison.

Merge requirements satisfied?

[NOTICE] Bug fixes or features added to Salt require tests.

Commits signed with GPG?

Yes

Please review Salt's Contributing Guide for best practices.

See GitHub's page on GPG signing for more information about signing commits with GPG.

@nicholasmhughes nicholasmhughes requested a review from a team as a code owner December 23, 2021 14:43
@nicholasmhughes nicholasmhughes requested review from twangboy and removed request for a team December 23, 2021 14:43
@nicholasmhughes
Copy link
Collaborator Author

Interested in comments on approach here. The file.managed state module offloads to two different file execution module functions, which in turn use the cp execution module, which in turn uses a file client function, which in turn uses the http utils module. This approach requires the least amount overhaul of existing logic, but I'm not sure if it introduces any "weirdness" not accounted for.

Also wondering if @waynew has any tips or tricks for testing this path before I dig into writing the tests.

Copy link
Contributor

@waynew waynew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise absolutely loving the idea here - a couple of things to look at, but overall 👍

Since I've been in the caching code recently I have some questions here about writing out the .etag file vs doing something else (though with the limited time I've looked at this, I don't know what that is yet).

I'll take a look during tomorrow's Test Clinic about testing - if you're not able to make it definitely check the recording afterwards!

salt/fileclient.py Outdated Show resolved Hide resolved
salt/modules/file.py Outdated Show resolved Hide resolved
salt/modules/file.py Outdated Show resolved Hide resolved
@nicholasmhughes
Copy link
Collaborator Author

re-run pr-freebsd-130-amd64-py3-pytest

@nicholasmhughes
Copy link
Collaborator Author

re-run pr-windows-2016-x64-py3-pytest

@nicholasmhughes
Copy link
Collaborator Author

re-run pr-ubuntu-2004-amd64-py3-m2crypto-pytest

@nicholasmhughes nicholasmhughes changed the title [WIP] Add Etag support for file.managed web sources [FEATURE] Add Etag support for file.managed web sources Jan 10, 2022
@github-actions
Copy link

Hi! I'm your friendly PR bot!

You might be wondering what I'm doing commenting here on your PR.

Yes, as a matter of fact, I am...

I'm just here to help us improve the documentation. I can't respond to
questions or anything, but what I can do, I do well!

Okay... so what do you do?

I detect modules that are missing docstrings or "CLI Example" on existing docstrings!
When I was created we had a lot of these. The documentation for these
modules need some love and attention to make Salt better for our users.

So what does that have to do with my PR?

I noticed that in this PR there are some files changed that have some of these
issues. So I'm leaving this comment to let you know your options.

Okay, what are they?

Well, my favorite, is that since you were making changes here I'm hoping that
you would be the most familiar with this module and be able to add some other
examples or fix any of the reported issues.

If I can, then what?

Well, you can either add them to this PR or add them to another PR. Either way is fine!

Well... what if I can't, or don't want to?

That's also fine! We appreciate all contributions to the Salt Project. If you
can't add those other examples, either because you're too busy, or unfamiliar,
or you just aren't interested, we still appreciate the contributions that
you've made already.

Whatever approach you decide to take, just drop a comment here letting us know!

Detected Issues (click me)
Check Known Missing Docstrings...........................................Failed
- hook id: invoke
- duration: 1.34s
- exit code: 1

The function 'recv' on 'salt/modules/cp.py' does not have a 'CLI Example:' in it's docstring
The function 'recv_chunked' on 'salt/modules/cp.py' does not have a 'CLI Example:' in it's docstring
The function 'envs' on 'salt/modules/cp.py' does not have a 'CLI Example:' in it's docstring
Found 3 errors


Thanks again!

@nicholasmhughes
Copy link
Collaborator Author

re-run pr-centos-7-x86_64-py3-m2crypto-pytest

Copy link
Contributor

@waynew waynew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Love the test - makes it very clear how use_etag is expected to work!

@waynew waynew added this to the Phosphorus v3005.0 milestone Jan 10, 2022
@Ch3LL Ch3LL merged commit 3b71ef4 into saltstack:master Jan 10, 2022
@OrangeDog
Copy link
Contributor

Does this account for header case?

The standard is ETag but I'd expect it to be normalised to etag, not Etag.

@nicholasmhughes nicholasmhughes deleted the add-fileclient-etag-support branch January 11, 2022 00:43
@nicholasmhughes
Copy link
Collaborator Author

apache2 and http.server use Etag. just stood up nginx and it seems that uses ETag. nice. PR inbound.

@nicholasmhughes
Copy link
Collaborator Author

#61441

@damon-atkins
Copy link
Contributor

Hi @nicholasmhughes thanks for doing this. A better method would be to write the whole header to a different file instead of using a file specific for etag. There are many ways to determine if a file should be cached from a web server. Keeping the whole header in a filename.meta file allows not only etag to be used but other techniques, like if-not-modified-since using the date from save header which is in the same format as what the web server provided. Instead of use_etag just have http_cache. e.g.

with salt.utils.files.fopen(meta_file, 'r', encoding='utf-8') as meta_fh:
   cache_headers = requests.structures.CaseInsensitiveDict(data=salt.utils.json.load(meta_fh))

Not sure why salt does not use the request library.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE REQUEST] Use ETag in Salt Fileclient for HTTP sources
5 participants