This repository has been archived by the owner on May 25, 2022. It is now read-only.

Question about SPDX Light: Supported Fields #2

Open
mcjaeger opened this issue Jan 15, 2019 · 4 comments
Labels
good first issue Good for newcomers

Comments

@mcjaeger

Hello,

Regarding the SPDX light proposal, I have more of a question than an issue. I like the proposal very much, and I was wondering whether the following additional elements could be considered:

  • For package information: I have found the checksum very useful for exchanging information about packages. Could it be considered as well? Or might it be confusing when the same package has been compiled multiple times?

  • How about an acknowledgement field attached to the license information, for licenses that ask for an acknowledgement, such as https://spdx.org/licenses/BSD-4-Clause-UC.html? Acknowledgement documentation could then easily be generated from SPDX.

  • Export control and customs: an ECC notice for a package could be included (since a patent notice is already envisaged), with a reference to the file in which it was found.

  • Would the package download location also be the package management ID? (For example, it is named "artefact id" for Maven packages.)

  • An ignore flag for files, indicating that a file was not part of the license analysis, or was left out of it because it is considered irrelevant.

Please take my remarks as quick feedback on the posting on the OpenChain mailing list. I thought this would be a good place to ask questions about this document:

https://github.com/OpenChain-Project/Japan-WG-General/blob/master/License-Info-Exchange/Doc-at-Meeting/Candidate-of-SDPX-light.md

@hfukuchi
Collaborator

Thank you so much for your interest in the concept of "SPDX light" and for your good comments.
We will discuss them in the subgroup and get back to you.

@hfukuchi hfukuchi added the good first issue Good for newcomers label Jan 16, 2019
@NorioKobota
Member

Thank you!

@mcjaeger
I feel that all of your remarks are very useful, but there are some things I do not understand,
e.g. where to apply the 'ignore flag'.

So I would like to discuss them using a concrete sample (attached here).
Could you tell me how your remarks would apply to it?

To everyone:
I created this sample based on my own understanding, so it may contain mistakes.
If so, please point them out and comment here.

example.pdf

@mcjaeger
Author

Sorry for the late reply, I missed your update. Ignored files could be those that are not relevant for distribution in one's own product, such as:

  • SCM information, like files in a .git folder or .gitignore
  • supplemental files of the git repository, such as .travis.yml, Contributing.md etc.
  • files that will not go into distro for other reasons (test, documentation, etc)
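As an illustration of how such an ignore flag might be derived, here is a minimal Python sketch. The rule sets `IGNORED_DIRS` and `IGNORED_NAMES` and the function `is_ignored` are hypothetical names built only from the examples listed above; they are not part of SPDX or FOSSology.

```python
from pathlib import PurePosixPath

# Hypothetical ignore rules, assembled from the examples in this comment.
IGNORED_DIRS = {".git", "test", "tests", "doc", "docs"}
IGNORED_NAMES = {".gitignore", ".travis.yml", "contributing.md"}


def is_ignored(path: str) -> bool:
    """Return True if the file would be flagged as ignored for license analysis."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    # A file is ignored if any parent directory is on the ignore list ...
    if any(part.lower() in IGNORED_DIRS for part in parts[:-1]):
        return True
    # ... or if its own name is a known supplemental file.
    return parts[-1].lower() in IGNORED_NAMES


files = [
    ".git/config",
    ".travis.yml",
    "Contributing.md",
    "src/main.c",
    "test/test_main.c",
]
relevant = [f for f in files if not is_ignored(f)]
print(relevant)
```

In a real document the flag would probably be stored per file rather than used to drop the entry, so that a reader can still see that the file existed but was deliberately skipped.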

Document Notation and Parsing

Looking at the file example.pdf: I see the notation as a problem for FOSSology, because the free-form tag-value format is not so obvious to parse. A notation like XML (not optimal), JSON (maybe) or YAML (maybe) has the advantage that standard parser classes are available.

When importing SPDX files into FOSSology, we found that hash values for each file are very useful, because the file path itself can easily differ for individual files between two different packagings of the same OSS component. For reusing SPDX information with newer versions, or for storing SPDX information in a large repository, a hash-based approach is very helpful.
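The hash-based reuse described here could look roughly like the following sketch. It assumes SHA-1 as the file checksum; the file contents, paths and the helper names `sha1_hex` and `reuse_findings` are made up for illustration and do not reflect how FOSSology implements its import.

```python
import hashlib


def sha1_hex(content: bytes) -> str:
    """Hex SHA-1 digest of a file's content."""
    return hashlib.sha1(content).hexdigest()


def reuse_findings(known: dict, new_files: dict) -> dict:
    """Look up earlier license findings by content hash instead of by path.

    known:     content hash -> license found in an earlier analysis
    new_files: path -> file content of the newly packaged component
    Returns path -> license, or None where no earlier finding matches.
    """
    return {path: known.get(sha1_hex(content)) for path, content in new_files.items()}


# Hypothetical earlier packaging of a component, with one analysed file.
old_packaging = {"foo-1.0/src/util.c": b"int util(void) { return 0; }\n"}
known = {sha1_hex(content): "MIT" for content in old_packaging.values()}

# The same file under a different path, plus one file that is new.
new_packaging = {
    "vendor/foo/util.c": b"int util(void) { return 0; }\n",
    "vendor/foo/new.c": b"int fresh;\n",
}
result = reuse_findings(known, new_packaging)
print(result)
```

The path of `util.c` changed between the two packagings, yet the earlier finding is still found because the lookup key is the content hash.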

Document Space Saving Aspects

Another issue I see is with very large OSS components, such as Eclipse BIRT, GCC or the Linux kernel. Open source packages with more than 50k files result in a very large document, simply because of the sheer number of files and the key names repeated for every entry.

Not to mention Chromium, which is an extreme example.

This is also a problem when importing SPDX files into FOSSology again, because the RDF format needs not only the raw data but also requires the RDF tree structure to be materialised. I think a more tabular format would be more helpful (file path and name, hash, license, copyright statements). In fact, one could save space by just listing the files sorted by license.
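To make the space-saving idea concrete, here is a small sketch that groups file rows by license, so each license name is written once instead of once per file. The rows, the placeholder hash strings and the output layout (a `[license]` header followed by tab-separated rows) are assumptions for illustration, not a proposed format.

```python
from collections import defaultdict

# Hypothetical scan results: (path, hash, license, copyright statement).
# The hash values are placeholders, not real digests.
rows = [
    ("src/a.c", "aaa111", "MIT", "Copyright 2018 Example A"),
    ("src/b.c", "bbb222", "GPL-2.0-only", "Copyright 2017 Example B"),
    ("lib/c.c", "ccc333", "MIT", "Copyright 2018 Example A"),
]


def group_by_license(rows):
    """Collect (path, hash, copyright) tuples under their license identifier."""
    grouped = defaultdict(list)
    for path, file_hash, license_id, copyright_text in rows:
        grouped[license_id].append((path, file_hash, copyright_text))
    return dict(grouped)


def render_table(grouped):
    """Render one header line per license, then one tab-separated row per file."""
    lines = []
    for license_id in sorted(grouped):
        lines.append(f"[{license_id}]")
        for path, file_hash, copyright_text in sorted(grouped[license_id]):
            lines.append("\t".join((path, file_hash, copyright_text)))
    return "\n".join(lines)


table = render_table(group_by_license(rows))
print(table)
```

With 50k files the saving comes from dropping the repeated key names and license values per file entry; the per-file data itself (path, hash, copyright) stays the same size.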

More suggestions

Thinking of FOSSology, we have the situation that an SPDX file is actually generated multiple times until the user has reached a final state. So different files will likely float around on the user's file system. Maybe a version, a generation timestamp, or a generation checksum computed over the found licenses and filled-out attributes could help the user distinguish differently generated SPDX files of the same package.
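One way such a generation checksum might be computed is sketched below. The function name `generation_checksum`, the choice of SHA-256 and the canonical JSON encoding are all assumptions; no such field exists in SPDX as discussed in this thread.

```python
import hashlib
import json


def generation_checksum(licenses, attributes):
    """Checksum over the found licenses and the filled-out attributes.

    The input is normalised (licenses deduplicated and sorted, keys sorted)
    so that the same findings always give the same checksum, regardless of
    the order in which they were collected.
    """
    canonical = json.dumps(
        {"licenses": sorted(set(licenses)), "attributes": attributes},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two exports with the same findings in a different order ...
first = generation_checksum(
    ["MIT", "BSD-3-Clause"], {"PackageName": "demo", "PackageVersion": "1.0"}
)
same = generation_checksum(
    ["BSD-3-Clause", "MIT"], {"PackageVersion": "1.0", "PackageName": "demo"}
)
# ... and a later export where one more license was found.
later = generation_checksum(
    ["MIT", "BSD-3-Clause", "Apache-2.0"],
    {"PackageName": "demo", "PackageVersion": "1.0"},
)
print(first == same, first == later)
```

Unlike a timestamp, a checksum of this kind stays the same when a file is regenerated without any change in the findings, which may or may not be what the user wants.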

I am also tempted to think about a field that is more like a purl coordinate pointing into the package metadata. I see this approach becoming more and more popular. purl is here: https://github.com/package-url/purl-spec, but I am not sure whether purl will be the one in the end.
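For readers unfamiliar with purl, the basic shape is `pkg:type/namespace/name@version`. The sketch below builds such a string for a Maven-style coordinate. It is deliberately simplified: `simple_purl` is a hypothetical helper, the coordinates are invented, and qualifiers, subpaths and the type-specific normalisation rules of the purl spec are left out.

```python
from urllib.parse import quote


def simple_purl(pkg_type, namespace, name, version=None):
    """Build a basic package URL of the form pkg:type/namespace/name@version.

    Simplified sketch: no qualifiers, no subpath, no type-specific rules.
    """
    # Percent-encode each namespace segment separately, keeping '/' as separator.
    segments = [quote(seg, safe="") for seg in namespace.split("/") if seg]
    purl = f"pkg:{pkg_type.lower()}/"
    if segments:
        purl += "/".join(segments) + "/"
    purl += quote(name, safe="")
    if version:
        purl += "@" + quote(version, safe="")
    return purl


# Hypothetical Maven coordinates: group id, artifact id, version.
example = simple_purl("maven", "org.example", "demo-lib", "1.0.0")
print(example)  # pkg:maven/org.example/demo-lib@1.0.0
```

For Maven, the namespace carries the group id and the name carries the artifact id, which relates to the earlier question about the package management ID.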

Questions

Would there be a difference between NOASSERTION and NONE?

@NorioKobota
Member

NorioKobota commented Mar 12, 2019

Hi @mcjaeger ,

Sorry for the late reply.

I understand that it is very important to make parsing easier and to save space when storing results.
It also seems very practical to use 'purl' to represent package metadata.
We have recently started to collaborate with SPDX WG members, so I'd like to introduce these proposals to them.
"https://github.com/OpenChain-Project/Japan-WG-General/issues/6"

But as written above, we'd like to use SPDX as it is, because many tools already support SPDX.
Or, if you can change the spec and implement it in tools such as SW360 and FOSSology, which you contribute to (of course I'd like to contribute too in the future), please tell me the details.

Would there be a difference between NOASSERTION and NONE?

This is defined by SPDX as follows.
<https://spdx.org/spdx-specification-21-web-version#h.49x2ik5>

NONE
means that it does not exist in the first place.
NOASSERTION
means that no statement is made, for example because the value could not be determined or was intentionally not provided.

hfukuchi added a commit that referenced this issue Sep 24, 2019
hfukuchi pushed a commit that referenced this issue Oct 3, 2019
Update repo

from Master
Update 20191003