Add go support #871
I'm trying to get some insight into what the CD ids should look like.
Background: I'm trying to map lines out of a
AFAICT from the code, discussion, and Google Doc, the id is structured as:
type: go
I've come up with the following example mappings:
github.com/spf13/cobra v0.0.5 h1:f0B+LkLX6DtmRH1isoNA9VTtNUK9K8xYd28JNNfOv/s=
golang.org/x/tools v0.0.0-20180221164845-07fd8470d635/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
These take a slightly different form:
google.golang.org/genproto v0.0.0-20190418145605-e7d98fc518a7/go.mod h1:VzzqZJRnGkLBvHegQrXjBqPurQTc5/KpmUdxsrq26oE=
gopkg.in/alecthomas/kingpin.v2 v2.2.6/go.mod h1:FMv+mEhP44yOT+4EoQTLFTRgOQ1FBLkstjWtayDeSgw=
I'm not sure what to do with this (there's no path or module; what do we do when the URI is only one segment?):
go.opencensus.io v0.21.0/go.mod h1:mSImk1erAIZhrmZN+AvHh14ztQfjbGwt4TtuofqLduU=
In case it's at all interesting, I'm teasing the lines in the
The "v" being part of the version feels weird. I observed this in the example. It's not clear to me, in the case where the version starts with "v0.0.0" whether or not the version should just be the qualifier/hash that follows. Am I anywhere close? |
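The example lines quoted above all share the same three whitespace-separated fields, so a small parser can tease them apart before any mapping to an id is attempted. The following is a minimal sketch, not code from this pull request; `parseGoSumLine` is a hypothetical name, and the assumed line shape is `<module path> <version>[/go.mod] <hash>`.

```javascript
// Hypothetical helper (not part of this pull request).
// Assumed line shape: "<module path> <version>[/go.mod] <hash>".
function parseGoSumLine(line) {
  const fields = line.trim().split(/\s+/)
  if (fields.length !== 3) return null // not a line in the assumed shape

  const [modulePath, versionField, hash] = fields

  // Lines whose version ends in "/go.mod" carry the hash of the go.mod file
  // rather than the hash of the whole module tree.
  const suffix = '/go.mod'
  const isGoModHash = versionField.endsWith(suffix)
  const version = isGoModHash ? versionField.slice(0, -suffix.length) : versionField

  return { modulePath, version, isGoModHash, hash }
}

console.log(parseGoSumLine('gopkg.in/alecthomas/kingpin.v2 v2.2.6/go.mod h1:FMv+mEhP44yOT+4EoQTLFTRgOQ1FBLkstjWtayDeSgw='))
```

Keeping the module path intact at this stage leaves the question of how to split it into namespace and name as a separate decision.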
Hi @waynebeaton! Thanks for the comments :) You are indeed very close. I chose to include the "v" because I noticed that it is used in go.mod files like this one and it seemed cleaner to include it. I'm not completely attached to it, though. I believe we have used the "v" in other types of components when specifying the revision, so it feels more consistent to include it. My current plan, in the case of no path, is to use a "-" for the namespace. So
would correspond to
Something I am having some trouble with is that I use proxy.golang.org to download the modules' source. However, modules that are in the go standard library are not available from proxy.golang.org. Thank you so much! |
I probably should have started by saying that I'm completely new to Go and am just fumbling around trying to figure out how to grab and license-check dependencies... My preference is to include the standard library modules. Having said that, I can't say that I've ever really thought too hard about the Java runtime... I guess that I'm not sure. |
If it's always there, then it doesn't really add any value. I can't say that I have looked at even a fraction of the ClearlyDefined data, but I don't recall ever having observed another ID with a "v" prefixing the revision. My strong preference is consistency in the format (i.e., to not include it unless it has actual specific meaning). I also recall (this might have been in the Google Doc) discussion regarding how the revision is represented when it starts with "0.0.0". When we encounter, for example, "v0.0.0-20190418145605-e7d98fc518a7", the actual revision would be "20190418145605-e7d98fc518a7". Has a decision been made about that? |
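To make the "v0.0.0" question concrete, here is a minimal sketch of how a pseudo-version could be split into its base version, timestamp, and commit prefix. It is an illustration only and does not reflect a decision in this thread; `splitPseudoVersion` and the regular expression are assumptions based on the versions quoted in this conversation.

```javascript
// Hypothetical helper (not part of this pull request).
// Assumes a pseudo-version ends in "<14-digit timestamp>-<commit hash prefix>",
// optionally preceded by a pre-release segment such as "0." and optionally
// followed by "+incompatible".
const PSEUDO_VERSION = /^(v\d+\.\d+\.\d+)-(?:.*\.)?(\d{14})-([0-9a-f]+)(?:\+incompatible)?$/

function splitPseudoVersion(version) {
  const match = PSEUDO_VERSION.exec(version)
  if (!match) return null // an ordinary tagged version such as "v0.21.0"
  const [, base, timestamp, commit] = match
  return { base, timestamp, commit }
}

console.log(splitPseudoVersion('v0.0.0-20190418145605-e7d98fc518a7'))
console.log(splitPseudoVersion('v0.21.0'))
```

Whichever form ends up in the coordinates, keeping the full version string available avoids losing the base version for cases where it is not "v0.0.0".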
Hi @waynebeaton - I agree with you on the "v" not really having a meaning in this case and I'm willing to remove it. I may go back on this if it makes it harder to script automated checks against ClearlyDefined (for example, an application that parses through go dependencies and queries ClearlyDefined for each dependency). Sounds like something to experiment with! |
Hi @waynebeaton - apologies for the delay in responding. You are correct in that it's a bit weird that it points to the information page, rather than the source. I'm also running into the problem that there does not appear to be any consistent way (not that I've found, at least) for determining a pointer to the actual source archive. |
Testing
Just did some testing (using both this pull request and the equivalent crawler pull request in a local environment).
Modules that harvested fine
(Harvested successfully, all tools ran fine, found declared and discovered licenses)
Modules where the discovered licenses were found, but not declared licenses
Modules where the harvests returned errors
localhost:3000/definitions/go/golang/github.com%2fAzure%2fgo-autorest/autorest/v0.11.20
localhost:3000/definitions/go/golang/github.com%2fAzure/azure-sdk-for-go/v55.8.0+incompatible
localhost:3000/definitions/go/golang/golang.org%2fx/crypto/v0.0.0-20210711020723-a769d52b0f9
|
AFAICT, we can make some good guesses based on GitHub URLs and version matching against tags, but it'd sure be handy to have that information in metadata. I'll keep digging to see what I can figure out. |
More notes on the test components where we are not finding the declared license: localhost:3000/definitions/go/golang/golang.org%2fx/net/v0.0.0-20210405180319-a5a99cb37ef4
localhost:3000/definitions/go/golang/software.sslmate.com%2fsrc/go-pkcs12/v0.0.0-20210415151418-c5206de65a78
localhost:3000/definitions/go/golang/github.com%2fsatori/go.uuid/v1.2.1-0.20181028125025-b2ce2384e17b
Other Modules that do not show a declared license (but do show discovered license(s))
GitHub.com modules
Maybe something to do with GitHub being the source?
golang.org modules
Notes on modules that do show a declared license
http://localhost:3000/definitions/go/golang/code.cloudfoundry.org/clock/v1.0.0
http://localhost:3000/definitions/go/golang/go.uber.org/fx/v1.14.2
|
Next steps for this week:
|
I think I know what's going on with the go modules that, when we harvest them, do not show a Declared license. The key is this function in lib/utils.js:

    function getLicenseLocations(coordinates) {
      const map = {
        npm: ['package/'],
        maven: ['meta-inf/'],
        pypi: [`${coordinates.name}-${coordinates.revision}/`],
        go: [`${coordinates.namespace}/${coordinates.name}@${coordinates.revision}/`]
      }
      return map[coordinates.type]
    }

We are looking for licenses in `${coordinates.namespace}/${coordinates.name}@${coordinates.revision}/`
For modules like http://localhost:3000/definitions/go/golang/code.cloudfoundry.org/clock/v1.0.0, where the declared license is found, the structure of the unpacked module is like this:
The license file is in the code.cloudfoundry.org/clock@v1.0.0 directory, it matches
However, with a module like http://localhost:3000/definitions/go/golang/github.com%2fgoogle%2fgo-github/v32/v32.1.0, the unpacked module is structured like this:
The license file path is github.com/google/go-github/v32@v32.1.0/LICENSE, which does not match
I will update the license file path to include directories in between the namespace directory and the name@version directory. |
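One way the lookup could account for the intermediate directories is to decode the URL-encoded namespace before building the path. This is a sketch under that assumption, not necessarily the change that was committed; `goLicenseLocation` is a hypothetical name.

```javascript
// Hypothetical helper (not necessarily what this pull request implements).
// Assumes the namespace arrives URL-encoded, e.g. "github.com%2fgoogle%2fgo-github",
// and that the unpacked module uses real "/" separators on disk.
function goLicenseLocation(coordinates) {
  // decodeURIComponent turns "%2f" back into "/", which restores the
  // directories between the first namespace segment and name@revision.
  // Note that it throws a URIError on a malformed "%" sequence.
  const namespace = decodeURIComponent(coordinates.namespace)
  return `${namespace}/${coordinates.name}@${coordinates.revision}/`
}

console.log(goLicenseLocation({
  namespace: 'github.com%2fgoogle%2fgo-github',
  name: 'v32',
  revision: 'v32.1.0'
}))
```

A namespace without any encoded slash, such as code.cloudfoundry.org, passes through unchanged, so the case that already worked keeps producing the same location.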
The latest commits have fixed the issues with finding declared licenses for the vast majority of go modules! |
Signed-off-by: Nell Shamrell <[email protected]>
…or license files in go modules Signed-off-by: Nell Shamrell <[email protected]>
Just did a rebase and this seems to be in good shape. There are a couple of things still to do before this is ready for review:
|
With regard to the question of whether we should include the "v" in revisions for go modules, I've been giving this some thought. I asked someone to send me several sample go.mod files and noticed that every revision defined in them includes the "v". I also took a look at some go.sum files and noticed they follow the same convention, including the "v" for a defined version of a module. Additionally, I took a look at how versions are listed on pkg.go.dev (example), and they also include the "v". You also include the "v" when you request a version of a module through proxy.golang.org. The convention in the go community seems to be to include the "v", and I believe ClearlyDefined should follow the community convention and include the "v" in revisions. |
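As an illustration of where the "v" shows up in a proxy request, the sketch below builds a download URL for a module's source archive. The function names are hypothetical; the URL layout and the "!" escaping of uppercase letters follow the module proxy protocol as I understand it, so treat both as assumptions to verify against the Go documentation.

```javascript
// Hypothetical helpers (not part of this pull request).
// Assumption: the proxy expects each uppercase letter written as "!" followed
// by its lowercase form.
function escapeForProxy(text) {
  return text.replace(/[A-Z]/g, (letter) => '!' + letter.toLowerCase())
}

// Assumption: the source archive lives at <module>/@v/<version>.zip,
// with the version passed through unchanged, including its leading "v".
function proxyZipUrl(modulePath, version) {
  return `https://proxy.golang.org/${escapeForProxy(modulePath)}/@v/${escapeForProxy(version)}.zip`
}

console.log(proxyZipUrl('github.com/spf13/cobra', 'v0.0.5'))
```

If the revision were stored without the "v", any tool building such a request would have to add it back.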
Hi @fossygirl! I have a question on this issue from 2018 #228. It appears that, when a go module is in a repository with multiple go modules and we download that module's source through proxy.golang.org (which is implemented in the related crawler PR and analyzed here in the Service), we only get the source code for the individual go module, not the entire repository. Currently, we search the source code for the individual go module for license files and determine the declared/discovered licenses from there. Do you see a conflict with the suggestions/requirements in #228? |
@nellshamrell I don't know about conflicts in Go. @jeffmcaffer @jeffmendoza might be good people to talk to here. |
@fossygirl and I talked offline - I think we are ok (as far as I can tell) using the license in the module's directory, even if it is part of a larger repo. |
Regarding the UI - the only change needed in the website would be on the page where harvests can be queued. This page is currently undergoing a major overhaul, and I plan to wait to add go to this page until that design is complete and deployed. |
Looks like |
Good point, @jeffmendoza! Done! |
Cool, looks great. Happy to see this coming together. |
Adding Go Support
Summary
This pull request adds support for harvesting and calculating definitions of go modules.
This has been one of the most frequently requested enhancements to ClearlyDefined (including in #765).
Limitations
This pull request only adds support for go components that have a go.mod file defined in them.
Modules were added to Go in Go 1.11 and 1.12 as a dependency management system that "makes dependency version information explicit and easier to manage".
Prior to modules, there were other third-party version management tools, which are likely still used by some people today. Adding support for these is something we can explore in the future, but to start with we are only supporting modules.
Coordinates
A go module's coordinates are formed like this:
A complication we encountered early in the architecture process for go support is that module import paths allow a wide variety of characters. Additionally, it is very common for module import paths to have multiple slashes in the "namespace". You can see the discussion around this in #862, #864, and #865.
The solution this pull request proposes is to URL-encode each "/" in the namespace.
For example, this import path:
Becomes these coordinates:
This encoding must be used whenever requesting these coordinates, whether for queuing up a harvest or requesting a definition.
This will require documentation, which will be added to the ClearlyDefined website.
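As a rough illustration of the encoding, the sketch below turns an import path and revision into a coordinates path. It is not code from this pull request: `toCoordinatesPath` is a hypothetical name, the "go/golang" prefix is taken from the definition URLs used during testing in this conversation, and the "-" placeholder for a path with no namespace follows the plan mentioned earlier in the thread.

```javascript
// Hypothetical helper (not part of this pull request).
// Assumptions: type "go", provider "golang", "%2f" for each "/" in the
// namespace, and "-" when the import path has only one segment.
function toCoordinatesPath(importPath, revision) {
  const lastSlash = importPath.lastIndexOf('/')

  // Everything before the last "/" is the namespace; encode its slashes.
  const namespace = lastSlash === -1
    ? '-'
    : importPath.slice(0, lastSlash).replace(/\//g, '%2f')

  // Everything after the last "/" (or the whole path) is the name.
  const name = importPath.slice(lastSlash + 1)

  return `go/golang/${namespace}/${name}/${revision}`
}

console.log(toCoordinatesPath('golang.org/x/net', 'v0.0.0-20210405180319-a5a99cb37ef4'))
```

A client would append the resulting path to the service's definitions or harvest endpoint, keeping the "%2f" sequences intact rather than decoding them.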
Related Pull Requests
When this pull request is merged and deployed, these pull requests must be merged and deployed as well.