GSoC 2023 brainstorming thread #2354

Closed
terriko opened this issue Nov 16, 2022 · 5 comments
Labels
gsoc Tasks related to our participation in Google Summer of Code

Comments

@terriko
Contributor

terriko commented Nov 16, 2022

This thread is intended for folk to brainstorm about potential project ideas for GSoC 2023. The idea is to have a separate thread from the "GSoC start here" issue (#2230) so that we have a place where we can talk about ideas that may be completely not viable. Once we get some ideas that are viable and well-described enough that a GSoC contributor could take it and run with it, we'll add them to #2230.

Some thoughts off the top of my head:

  1. Improved VEX triage tools
    • Some user tools (and usage docs) for helping people do triage on cve-bin-tool scans and produce VEX + SBOM output. Maybe even GUI based tooling for triage, hooks for an existing tool (like VSCode) that might allow for easier JSON editing already, or an ability to save from a local HTML report?
  2. Improved product representation & meta-info about products.
    • We currently just report whatever we called a thing internally in the binary scans, and whatever it was called in the file in non-binary scans such as SBOM or package list parsing, but it would be nice to include things like software heritage designations, especially to allow for de-duplication if we combine scans from multiple BOMs. We might also want to see how viable it is to provide other commonly desired meta-info like licensing, source urls, packaging data, etc.
    • You will almost certainly need to build a data format for de-dupe / meta data and allow users to be able to add to it, similar to how we have checkers right now.
    • Note: I expect this one to be hard and require a lot of heuristic work. 350hr minimum probably.
    • The goal here isn't perfection, but if we could tag, say, 50-70% of things with additional meta-data, that might be enough.
  3. Scanning the internet with cve-bin-tool.
    • Work on setting up github actions workflows and tooling to support scanning of hundreds of repos and handling problems at scale like shared triage, pretty overview graphs, monitoring, whatever else might be needed. (Imagine if you were, say, the Python Software Foundation wanting to scan every repo associated with a pypi project, or a corporation trying to scan your own public projects. What additional tools would you need to manage that much data?)

Anyhow, even "bad" ideas are worth discussing at brainstorming stage, so don't be shy even if something might not wind up viable for gsoc. Brainstorm away!

@terriko terriko added the gsoc Tasks related to our participation in Google Summer of Code label Nov 16, 2022
@anthonyharrison
Contributor

anthonyharrison commented Nov 17, 2022

@terriko There is a lot happening in the VEX world. VEX is a concept and not a standard, so we should support the CSAF format, which is being cited as an example of a VEX, as well as the VEX format (based on CycloneDX) which cve-bin-tool currently supports. We should possibly look at providing some SBOM/VEX tooling around cve-bin-tool to provide an overall vulnerability management solution:

  • Scan product using cve-bin-tool
  • Create SBOM and VEX from the scan. The SBOM will need to be linked to the VEX I think
  • Triage the VEX and update the status of the vulnerabilities
  • Repeat

Think this is a perfect candidate for wrapping up in an HCI (using Flask, FastAPI or Django?).
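
To make the scan / triage / repeat loop above a bit more concrete, here is a minimal sketch of how triage decisions might be carried forward between scans. Everything in it is an assumption for illustration: the `Finding` record, the state names, and the `merge_triage` / `set_state` helpers are hypothetical and are not part of cve-bin-tool or of any VEX format, and the CVE IDs are placeholders.

```python
from dataclasses import dataclass

# Hypothetical triage states for illustration only; real VEX formats
# (CycloneDX, CSAF, ...) define their own vocabularies.
STATES = {"new", "confirmed", "mitigated", "not_affected"}


@dataclass(frozen=True)
class Finding:
    """One vulnerability reported against one product version."""
    product: str
    version: str
    cve: str


def merge_triage(findings, previous):
    """Carry earlier triage decisions forward onto a fresh scan.

    findings: iterable of Finding objects from the latest scan.
    previous: dict mapping Finding -> state from the last triage round.
    Findings that were not triaged before start out as "new"; findings
    that no longer appear in the scan are dropped.
    """
    return {f: previous.get(f, "new") for f in findings}


def set_state(triage, finding, state):
    """Return a copy of the triage dict with one finding's state updated."""
    if state not in STATES:
        raise ValueError(f"unknown triage state: {state}")
    if finding not in triage:
        raise KeyError(f"finding not in current scan: {finding}")
    updated = dict(triage)
    updated[finding] = state
    return updated


# Round 1: scan, then triage.
a = Finding("libfoo", "1.0", "CVE-0000-0001")  # placeholder IDs
round1 = set_state(merge_triage([a], {}), a, "not_affected")

# Round 2: rescan finds something extra; the old decision is kept.
b = Finding("libbar", "2.3", "CVE-0000-0002")
round2 = merge_triage([a, b], round1)
```

The point of the sketch is only that the triage state needs a stable key (here product, version and CVE) so that a repeat scan does not throw away earlier decisions. How that key maps onto the SBOM and VEX documents is the open design question.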

Another idea would be to create an API for cve-bin-tool instead of just the command line. Think this might enable the tool to be used in some of the vulnerability portals which are starting to appear.

Scanning the internet might take some time. :-) Think it would be good if we could offer hooks for package installers to do a scan (and generate sbom, vex, ...). Having cve-bin-tool as part of Python packaging would be a great addition (would this help prevent malware being installed on pypi?)

@terriko
Contributor Author

terriko commented Feb 2, 2023

Okay, I've got the EPSS and meta-data ideas linked on the main gsoc issue, which is good enough for me to add us as a Python sub-org and start getting things rolling there.

I feel like the other big things I want to look at for next release:

  • sbom export / tools
  • improved triage workflows
  • caching

SBOM export I'm guessing we will have at least basically working soon, and @anthonyharrison is already working on some other tools. Is there anything missing in that space or that we need for cve-bin-tool operation that we could turn into a gsoc-sized project?

Triage workflows: We've got more than one gsoc project's worth of ideas here and I'm struggling to prioritize but I'll see if I can work up something today.

I put up a start of a caching idea but there's some open questions there that might make it not the most self-contained for gsoc, and someone is already interested in it outside of the scope of the gsoc program.

Anything else I should be considering turning into an actual project idea?

@anthonyharrison
Contributor

@terriko The triage work I think could be split into two (or be a 350 hour project). One is the generation of the various VEX documents (CycloneDX - done, CSAF - in progress, OpenVEX - to be done) and I would also like to add Vulnerability Disclosure Reports (see NIST SP 800-161). The other aspect I see would be creating tools around the triage lifecycle: managing the various artefacts, updating them rather than editing JSON documents, etc. This could initially be a simple UI but as with most of these simple ideas it will probably grow into something far more complex :-)

I think the SBOM work surrounds generation of a document (stretch target would be to update an existing one) from a scan (I have an almost working prototype now, which will also help with the production of a VDR). I have tools to compare SBOMs (sbomdiff) and document an SBOM (in Markdown, PDF and to a console) (sbom2doc), so the only thing I can see would be to enhance/develop github actions to scan a repo, create an SBOM, document it and compare against the previous version. Useful but possibly not big enough for a GSoC project on its own, but if we add an audit function to cve-bin-tool to compare repeated vulnerability scans of the same SBOM, binaries etc. and then notify a user if there is a change (e.g. a new vulnerability), then that might be useful and be a bigger project.
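
As a rough illustration of the audit idea, comparing two scans of the same target could look something like the sketch below. It assumes each scan has already been reduced to a set of `(product, version, cve_id)` tuples; the `diff_scans` and `audit_message` names are hypothetical and do not exist in cve-bin-tool, and the CVE IDs are placeholders.

```python
def diff_scans(old, new):
    """Compare two scans of the same target.

    Each scan is an iterable of (product, version, cve_id) tuples.
    Returns the findings that appeared and the ones that went away.
    """
    old, new = set(old), set(new)
    return {
        "added": sorted(new - old),
        "resolved": sorted(old - new),
    }


def audit_message(changes):
    """Build a notification text, or return None if nothing changed."""
    if not changes["added"] and not changes["resolved"]:
        return None
    lines = []
    for product, version, cve in changes["added"]:
        lines.append(f"NEW: {cve} in {product} {version}")
    for product, version, cve in changes["resolved"]:
        lines.append(f"GONE: {cve} in {product} {version}")
    return "\n".join(lines)


# Placeholder data: the second scan reports one extra vulnerability.
last_week = [("libfoo", "1.0", "CVE-0000-0001")]
today = [
    ("libfoo", "1.0", "CVE-0000-0001"),
    ("libfoo", "1.0", "CVE-0000-0002"),
]
message = audit_message(diff_scans(last_week, today))
```

A real audit function would also need to decide where the previous scan is stored and how the user is notified, which is where most of the project scope would come from.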

However, I would like to see if there is some audit activity which could

@terriko
Contributor Author

terriko commented Feb 2, 2023

@anthonyharrison I dumped a bunch of triage thoughts into #2639. Updates and more precise ideas welcome. :)

@terriko
Contributor Author

terriko commented Apr 19, 2023

Closing this since GSoC applications are in and we no longer need this brainstorming thread.

@terriko terriko closed this as completed Apr 19, 2023

2 participants