proposal: cmd/go: add .proxy endpoint to the module proxy spec #35400
The Expires headers could/should(?) say the same thing, e.g. Expires 1 week vs. Expires 100 years.
In many cases there is a difference between when the response is considered "stale", i.e. Expires, and when the underlying server may be unable to continue serving a zip. The common case would be a proxy using a CDN, where the cached response provided by the CDN may be stale after a few hours but the proxy server intends to continue serving that zip for much longer.
Having module versions cached temporarily means non-deterministic builds for the users. IMO availability is one of the fundamental requirements of a public goproxy. And this is true at least in gocenter.io. Also, what does it mean for a user who also uses a local goproxy that further points to a public goproxy? Should they always selectively clean up the local goproxy's cache given this new endpoint? Maybe let users decide what they want to consume based on the metadata provided by the goproxy.
+1 to the above. Except for extenuating circumstances like DMCA takedowns, why would a module not be stored (purposefully not using the term "cache" because it implies expiration) forever on proxy.golang.org? If there were modules that proxies/mirrors might not or did not store, then as @ankushchadha said, builds become nondeterministic. One of the major apparent benefits of proxy.golang.org right now is that it enables deterministic builds. Edit: the proxy enables deterministic builds
@arschles, a module might not be stored if the proxy maintainer is not confident that the module's license permits it to be stored. Builds in that case do not become “nondeterministic”: they may either succeed or fail, depending on whether the needed modules are available (locally or from any configured remote source), but if they succeed they will produce the same result as any other successful build.
@bcmills understood, I agree that this feature may be useful for on-prem proxies. I'm talking about this endpoint in the context of public, hosted proxies. It introduces the possibility that a host may cache modules, and if you get a
I agree we need a way to tell users what proxy.golang.org will do about a specific module version. But I am not 100% sure whether this endpoint belongs in the proxy protocol. At this moment, it seems too specific to proxy.golang.org. It would be more convincing if there were proxies other than proxy.golang.org that would use this new endpoint in a meaningful way. The endpoint doesn't make much sense for enterprise and private proxies. gocenter.io is trying to mirror everything once it decides to serve a module version. Most other public proxies I've seen haven't made any official commitment about their data retention policy. Can other public proxy owners chime in?
goproxy.io is here, but as @bcmills said, we are not confident that the module's license permits it to be stored, and space is always limited.
Since we can't do this (re: licensing), the best alternative would be to inform users if there is a genuine risk that their dependency will disappear if it's removed from the origin server. You mentioned that this endpoint would say "that module@version could expire at any time", so one alternative might be to give a timestamp for how much longer this cached copy will live, instead of true/false?
Can you clarify? Are you saying that goproxy.io also doesn't mirror things forever, depending on the size of the module and the license? If that's the case, then users of your service may also benefit from this kind of transparency. Thanks for everyone's comments. As @hyangah said, it's going to be difficult to justify this if it's not something other proxies would benefit from, and if that's the case, this may just be something that proxy.golang.org should do itself if users are asking for it.
@oiooj @katiehockman If gocenter.io wants to preserve rights to evict some of the module versions to alleviate storage usage pressure in the future, I expect gocenter.io to return 'false' for the proposed /mirrored endpoint for all module versions. Then I don't think this endpoint is very useful for its users either.
@hyangah, to the contrary! If some tool uses
@bcmills Shouldn't the user of the proxy already know about the promise of the public proxy they are using? As far as I know, proxy.golang.org is the only one that may have different answers for different modules/versions. BTW, if we are talking about users who want to distribute the source code of binaries/libraries and control the dependencies, they don't know what proxy "their users" will depend on to build their source code. In this case, will they still need to vendor, or instruct their users to always use specific proxies in which they have verified that all their dependencies are mirrored?
What's the reason for using /mirrored instead of a field in the info file?
There is no intrinsic reason why that must be the case, and having an endpoint would make it easier for users to detect if (say) the proxy that they are using changes its policy to provide longer-term mirroring.
Their users (transitively) can use the same endpoint to decide what to do.
Proxies may reasonably re-serve
We've treated the I've also always viewed this file as "metadata about the module version" which is proxy independent, rather than "metadata about the module version as it relates to the proxy you got it from" which could change. @jayconrod @heschik and I were discussing this the other day.
The existing convention in these URLs is to disambiguate based on the file extension, not a new path element, so it would be v0.3.2.mirrored, not v0.3.2/mirrored. Beyond that, though, I wonder if there will be a need to send more than a single bit at some point (thus my question about .info). If we don't extend .info then it seems like we should instead define a new JSON-formatted .proxy file for proxy-specific information about the given module. It could start with just one field (Expires?) and add more as needed.
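To make the .proxy suggestion concrete, here is a minimal sketch of how a client might decode such a file. Nothing here is specified: the ProxyInfo type, the parseProxyInfo helper, and the RFC 3339 encoding of Expires are all assumptions for illustration, and only the single tentative field mentioned above is modeled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// ProxyInfo is a hypothetical shape for the suggested .proxy file.
// Only the tentative Expires field from the discussion is modeled.
type ProxyInfo struct {
	// Expires is nil when the proxy announces no expiry for this version.
	Expires *time.Time `json:",omitempty"`
}

// parseProxyInfo decodes the JSON body of a .proxy response.
// encoding/json ignores unknown fields by default, so fields added
// later would not break this decoder.
func parseProxyInfo(data []byte) (ProxyInfo, error) {
	var info ProxyInfo
	err := json.Unmarshal(data, &info)
	return info, err
}

func main() {
	// Sample body; the timestamp is made up for the example.
	info, err := parseProxyInfo([]byte(`{"Expires":"2020-01-01T00:00:00Z"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if info.Expires == nil {
		fmt.Println("no expiry announced")
		return
	}
	fmt.Println("expires:", info.Expires.Format(time.RFC3339))
}
```

Using a pointer for Expires lets a client distinguish "no expiry announced" from a zero timestamp, which seems to match the intent of starting with one optional field.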
Agreed that there is very likely to be a time where more than one bit should be provided. Perhaps the date of expiration, or the detected licenses, for example. I like the idea of a generalized
It sounds like there is general agreement to add a .proxy file with JSON. @katiehockman, would you rather:
Your call. Thanks.
Thanks. Let's go with option 1 for now. I'll go ahead and work on exposing a
Putting this on hold. Katie, feel free to remove the hold label when you are ready for more discussion.
Users would benefit from more transparency around whether or not a specific module version is being temporarily cached in a proxy or whether it is being permanently mirrored. There are a number of reasons why a proxy may choose not to mirror something forever: licensing is one notable example.
The proposal would be to add an additional optional endpoint to the proxy spec (i.e. go help goproxy), which proxies could implement if they choose to, which would give this information. For example: https://proxy.golang.org/golang.org/x/text/@v/v0.3.2/mirrored would return "true" or "false" as plaintext.
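As a rough illustration of the endpoint as proposed here, a client could build the URL and interpret the plaintext body along these lines. The /mirrored endpoint is only a proposal, and the helper names mirroredURL, parseMirrored, and isMirrored are hypothetical; the sketch also assumes the module path is already case-encoded as the proxy protocol requires.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// mirroredURL builds the URL of the proposed (not yet existing) endpoint.
// modPath is assumed to be already case-encoded per the proxy protocol.
func mirroredURL(proxy, modPath, version string) string {
	return strings.TrimSuffix(proxy, "/") + "/" + modPath + "/@v/" + version + "/mirrored"
}

// parseMirrored interprets the plaintext body, which the proposal
// describes as either "true" or "false".
func parseMirrored(body string) (bool, error) {
	switch strings.TrimSpace(body) {
	case "true":
		return true, nil
	case "false":
		return false, nil
	}
	return false, fmt.Errorf("unexpected response %q", body)
}

// isMirrored queries one proxy. Because the endpoint is optional, a
// non-200 status is reported as an error rather than treated as "false".
func isMirrored(client *http.Client, proxy, modPath, version string) (bool, error) {
	resp, err := client.Get(mirroredURL(proxy, modPath, version))
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, fmt.Errorf("proxy returned %s", resp.Status)
	}
	// The expected body is tiny, so cap how much is read.
	data, err := io.ReadAll(io.LimitReader(resp.Body, 64))
	if err != nil {
		return false, err
	}
	return parseMirrored(string(data))
}

func main() {
	// Only prints the URL; no request is sent.
	fmt.Println(mirroredURL("https://proxy.golang.org", "golang.org/x/text", "v0.3.2"))
}
```

Reporting a missing endpoint as an error keeps "this proxy does not answer the question" distinct from "this proxy answers false".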
This is something we could pair with a utility in x/mod which would accept a go.sum file and indicate which versions aren't being permanently mirrored by any of the proxies listed in GOPROXY. That might help you decide to use a different version of the module, vendor that dependency, or encourage you to file an issue against the module if you see that a suitable license is missing, for example.
/cc @jayconrod @bcmills @heschik @hyangah @rsc