Add CodeQL Workflow for Code Security Analysis #3476

b4yuan · 2023-10-28T03:55:56Z

Summary

This pull request introduces a CodeQL workflow to enhance the security analysis of this repository.

What is CodeQL

CodeQL is a static analysis tool that helps identify and mitigate security vulnerabilities. It is primarily intra-function but does provide some support for inter-function analysis. By integrating CodeQL into a GitHub Actions workflow, it can proactively identify and address potential issues before they become security threats.

For more information on CodeQL and how to interpret its results, refer to the GitHub documentation and the CodeQL documentation (https://codeql.github.com/ and https://codeql.github.com/docs/).

What this PR does

We added a new CodeQL workflow file (.github/workflows/codeql.yml) that

Runs on every pull request (functionality to run on every push to main branches is included as a comment for convenience).
Runs daily.
Excludes queries with a high false positive rate or low-severity findings.
Does not display results for git submodules, focusing only on our own codebase.

Validation

To validate the functionality of this workflow, we have run several test scans on the codebase and reviewed the results. The workflow successfully compiles the project, identifies issues, and provides actionable insights while reducing noise by excluding certain queries and third-party code.

Using the workflow results

If this pull request is merged, the CodeQL workflow will be automatically run on every push to the main branch and on every pull request to the main branch. To view the results of these code scans, follow these steps:

Under the repository name, click on the Security tab.
In the left sidebar, click Code scanning alerts.

Is this a good idea?

We are researchers at Purdue University in the USA. We are studying the potential benefits and costs of using CodeQL on open-source repositories of embedded software.

We wrote up a report of our findings so far. The TL;DR is “CodeQL outperforms the other freely-available static analysis tools, with fairly low false positive rates and lots of real defects”. You can read about the report here: https://arxiv.org/abs/2310.00205

Review of engineering hazards

License: see the license at https://github.com/github/codeql-cli-binaries/blob/main/LICENSE.md:

Here's what you may also do with the Software, but only with an Open Source Codebase and subject to the License Restrictions provisions below:

Perform analysis on the Open Source Codebase.

If the Open Source Codebase is hosted and maintained on GitHub.com, generate CodeQL databases for or during automated analysis, CI, or CD.

False positives: We find that around 20% of errors are false positives, but that these FPs are polarized and only a few rules contribute to most FPs. We find that the top rules contributing to FPs are: cpp/uninitialized-local, cpp/missing-check-scanf, cpp/suspicious-pointer-scaling, cpp/unbounded-write, cpp/constant-comparison, and cpp/inconsistent-null-check. Adding a filter to filter out certain rules that contribute to a high FP rate can be done simply in the workflow file.

Add CodeQL Workflow for Code Security Analysis This pull request introduces a CodeQL workflow to enhance the security analysis of our repository. CodeQL is a powerful static analysis tool that helps identify and mitigate security vulnerabilities in our codebase. By integrating this workflow into our GitHub Actions, we can proactively identify and address potential issues before they become security threats. We added a new CodeQL workflow file (.github/workflows/codeql.yml) that - Runs on every push and pull request to the main branch. - Excludes queries with a high false positive rate or low-severity findings. - Does not display results for third-party code, focusing only on our own codebase. Testing: To validate the functionality of this workflow, we have run several test scans on the codebase and reviewed the results. The workflow successfully compiles the project, identifies issues, and provides actionable insights while reducing noise by excluding certain queries and third-party code. Deployment: Once this pull request is merged, the CodeQL workflow will be active and automatically run on every push and pull request to the main branch. To view the results of these code scans, please follow these steps: 1. Under the repository name, click on the Security tab. 2. In the left sidebar, click Code scanning alerts. Additional Information: - You can further customize the workflow to adapt to your specific needs by modifying the workflow file. - For more information on CodeQL and how to interpret its results, refer to the GitHub documentation and the CodeQL documentation. Signed-off-by: Brian <[email protected]>

Add CodeQL Workflow for Code Security Analysis This pull request introduces a CodeQL workflow to enhance the security analysis of our repository. CodeQL is a powerful static analysis tool that helps identify and mitigate security vulnerabilities in our codebase. By integrating this workflow into our GitHub Actions, we can proactively identify and address potential issues before they become security threats. We added a new CodeQL workflow file (.github/workflows/codeql.yml) that - Runs on every pull request (functionality to run on every push to main branches is included as a comment for convenience). - Runs daily. - Excludes queries with a high false positive rate or low-severity findings. - Does not display results for git submodules, focusing only on our own codebase. Testing: To validate the functionality of this workflow, we have run several test scans on the codebase and reviewed the results. The workflow successfully compiles the project, identifies issues, and provides actionable insights while reducing noise by excluding certain queries and third-party code. Deployment: Once this pull request is merged, the CodeQL workflow will be active and automatically run on every push and pull request to the main branch. To view the results of these code scans, please follow these steps: 1. Under the repository name, click on the Security tab. 2. In the left sidebar, click Code scanning alerts. Additional Information: - You can further customize the workflow to adapt to your specific needs by modifying the workflow file. - For more information on CodeQL and how to interpret its results, refer to the GitHub documentation and the CodeQL documentation (https://codeql.github.com/ and https://codeql.github.com/docs/). Signed-off-by: Brian <[email protected]>

raysan5 · 2023-10-29T16:01:19Z

@b4yuan It seems the check process failed.

In any case, how does this PR compare to #3152?

b4yuan · 2023-10-29T16:09:42Z

@raysan5 check process failed because there are some errors found by CodeQL. If the analysis finds any errors, it will invoke fail_on_error.py which prevents the workflow from 'passing.'

This PR is different than #3152 because it allows for rule filtering; uploads the results as an artifact; and allows you to view the results of the analysis in the Security tab under Code scanning alers.

Add CodeQL Workflow for Code Security Analysis This pull request introduces a CodeQL workflow to enhance the security analysis of our repository. CodeQL is a powerful static analysis tool that helps identify and mitigate security vulnerabilities in our codebase. By integrating this workflow into our GitHub Actions, we can proactively identify and address potential issues before they become security threats. We added a new CodeQL workflow file (.github/workflows/codeql.yml) that - Runs on every pull request (functionality to run on every push to main branches is included as a comment for convenience). - Runs daily. - Excludes queries with a high false positive rate or low-severity findings. - Does not display results for git submodules, focusing only on our own codebase. Testing: To validate the functionality of this workflow, we have run several test scans on the codebase and reviewed the results. The workflow successfully compiles the project, identifies issues, and provides actionable insights while reducing noise by excluding certain queries and third-party code. Deployment: Once this pull request is merged, the CodeQL workflow will be active and automatically run on every push and pull request to the main branch. To view the results of these code scans, please follow these steps: 1. Under the repository name, click on the Security tab. 2. In the left sidebar, click Code scanning alerts. Additional Information: - You can further customize the workflow to adapt to your specific needs by modifying the workflow file. - For more information on CodeQL and how to interpret its results, refer to the GitHub documentation and the CodeQL documentation (https://codeql.github.com/ and https://codeql.github.com/docs/). Signed-off-by: Brian <[email protected]>

raysan5 · 2023-11-01T11:24:25Z

@jorgectf Please, could you take a look to this PR? Is it similar to yours?

Are the shell and python scripts really required?

raysan5 · 2023-11-03T10:51:21Z

@b4yuan After some thinking about this PR, I'm afraid I'm not merging it. Actually, the PR seems generated in an automatic way, I saw you pushed similar PRs to multiple repos. Personally it does not inspire confidence to me. Sorry.

If there was a more detailed explanation why this addition is specifically needed and beneficial for raylib I could reconsider my position but at this moment I can't see why raylib project specifically needs to integrate this kind of system.

@b4yuan Are you using raylib in any specific project or in any specific environment where this kind of CodeQL validations are required? Why did you consider it is specifically beneficial for raylib project? Are you pushing CodeQL to multiple repos as some kind of commercial strategy for a company?

Feel free to answer me here or just send me a private mail to: [email protected]

Thanks.

b4yuan · 2023-11-03T12:44:51Z

@raysan5 Thanks for the comments. While the PR itself is automated, they were only raised after we tested CodeQL on our own forks of each repositories. We actually ran CodeQL on these repositories (raylib included, of course) months ago and raised PRs for bugs we found. Specifically in raylib's case, we raised #3021 to fix a bug we found with CodeQL.

The goal here is to improve open-source embedded projects. Even if you think that raylib doesn't need a security analysis, we would suggest your consideration, as it is hard to judge where the open-source code is running. There are many stories of unexpected use cases that resulted in security problems down the road.

In the paper: https://arxiv.org/abs/2310.00205, we present our findings here and why we think that CodeQL is a beneficial tool that devs should leverage. We are not commercially interested in any way; we are only interested in improving code quality.

orcmid · 2023-11-03T14:52:24Z

@b4yuan @raysan5
Wow, the abstract tells it all. In particular, I gather that CodeQL is a GitHub facility, so there is no commercial interest other than GitHub's in having secure code in repositories under its care.

I have received warning notices, usually on dependencies that have security issues. It takes a little work to determine whether my usage (of a JavaScript framework for example) is transitively vulnerable or not. Just the same, I'd rather have that checking than not.

I don't use GitHub workflows, so I might have to rethink that on some code that I will need very strong safety assurance about. Thanks for the information. I am mindful that these fixtures do not prove code is safe, only that it appears not to be.

PS: I have been oblivious to the arrival of the Security tab on my repository views. There's much to digest here, and much that can be useful for raylib whether or not CodeQL is set up.

PPS: It appears that one can do some of this on a fork, so maybe there's a benign way to do this without negotiating with repo owners. I am not likely to try this, though.

raysan5 · 2023-11-03T17:08:35Z

@b4yuan thank you very much for your answers and excuse if my tone could have been a bit rude.

I checked the generated reports and I even reviewed some of the issues (64d64cc) but I noticed many of them are not that relevant and also related to raylib external libraries, where I can't do much from my side.

Some questions (and excuse my insistence on this topic):

Would it be possible to avoid the check on src/external libraries? or that's not recommended?
Would it be possible to avoid the shell and python scripts and integrate its logic directly on the workflow?
I saw the check running takes some time, does it has any cost over running time quota on an open-source project like raylib?
I don't think raylib is so critical to run it on every push, would it be ok to just run it manually when desired or once a week?

Again, if you feel more confortable answering in private, you can write me to [email protected]

Thanks for your answers.

b4yuan · 2023-11-03T17:09:23Z

@orcmid @raysan5 That's correct, CodeQL was developed by GitHub.

Regarding dependencies... I also want to add that third party lib paths can be filtered out in this section:

    # Filter out rules with low severity or high false positve rate
    # Also filter out warnings in third-party code
    - name: Filter out unwanted errors and warnings
      uses: advanced-security/filter-sarif@v1
      with:
        patterns: |
          -**:cpp/path-injection
          -**:cpp/world-writable-file-creation
          -**:cpp/poorly-documented-function
          -**:cpp/potentially-dangerous-function
          -**:cpp/use-of-goto
          -**:cpp/integer-multiplication-cast-to-long
          -**:cpp/comparison-with-wider-type
          -**:cpp/leap-year/*
          -**:cpp/ambiguously-signed-bit-field
          -**:cpp/suspicious-pointer-scaling
          -**:cpp/suspicious-pointer-scaling-void
          -**:cpp/unsigned-comparison-zero
          -**/cmake*/Modules/**
         #dir/path/here

and CodeQL would not analyze those directories. (I looked for git submodules in raylib, but didn't seem to find any. If there are any third party directory paths that you want to be filtered out, let me know, and I can update the workflow accordingly.) This workflow uploads all of CodeQL's errors to the Code scanning alerts section under the Security tab. From there, you can mark bugs that CodeQL finds as 'will not fix', 'false positive', etc. If you find that a rule yields especially many false positives, you can filter that rule altogether in the above snippet. (We've also suggested the rules that are top contributors to false positives in the PR body, which you may find helpful).

Of course, it is up to you to weigh the benefits that CodeQL brings (since we have found a merged bug fix in this repo before found by CodeQL, this suggests that it likely would add value), compared to the drawbacks (perhaps false positive rate, maintenance, etc). We are very transparent with the drawbacks in the paper, but we find that improving false positive rate can be improved with rule filtering, and we don't believe maintenance will consume too many resources, though, again, that is up to the judgement of the devs. Not only have we already configured CodeQL for this repo (and tested that it works on our fork), the streamlined way that GitHub presents CodeQL analysis findings in its Security tab enables this tool to be used much more easily.

b4yuan · 2023-11-03T17:14:28Z

@raysan5 Looks like we posted at the same time lol.

Would it be possible to avoid the check on src/external libraries? or that's not recommended?

Yes, I can update this shortly.

Would it be possible to avoid the shell and python scripts and integrate its logic directly on the workflow?

I'll look into this as well.

I saw the check running takes some time, does it has any cost over running time quota on an open-source project like raylib?

From https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration: "GitHub Actions usage is free for standard GitHub-hosted runners in public repositories, and for self-hosted runners. For private repositories, each GitHub account receives a certain amount of free minutes and storage for use with GitHub-hosted runners, depending on the account's plan. Any usage beyond the included amounts is controlled by spending limits. For more information, see "About billing for GitHub Actions."

As long as resources are available, it should be free for an open-source repo like raylib.

I don't think raylib is so critical to run it on every push, would it be ok to just run it manually when desired or once a week?

Running on every push is already commented out; it's just there if you wanted it to be triggered on every push. Currently the workflow is scheduled to run daily, but I can also make changes to update it weekly as well.

Will update/ping when your asks are addressed.

raysan5 · 2023-11-03T17:25:14Z

@b4yuan thanks for your answers. Let's see if the following concerns could be addressed:

External libraries used by raylib, located in raylib/src/external directory, but note that most of those libraries are single-file header-only libraries and are actually build with raylib modules, not sure if the system could avoid those checks because, technically, on compilation those libraries are merged inside the modules code.
External shell/python scripts. I did a quick check and the shell script should be avoidable, about the python script, not sure if it can be integrated in some way on the workflow...
Running time. If possible I would launch it once-a-week, in any case, I can also launch it manually when required, right?

I'll try to review the workflow more carefully when merged to understand the process and be able to maintain it.

Thanks.

EDIT: I see the Errors detected actually belong to an external library (GLFW), I'm afraid I can't do much there beside reporting them. Is it possible to skip that fail on error?

orcmid · 2023-11-03T18:04:32Z

@raysan5 @b4yuan

External libraries used by raylib, located in raylib/src/external directory, but note that most of those libraries are single-file header-only libraries and are actually build with raylib modules, not sure if the system could avoid those checks because, technically, on compilation those libraries are merged inside the modules code.

EDIT: I see the Errors detected actually belong to an external library (GLFW), I'm afraid I can't do much there beside reporting them. Is it possible to skip that fail on error?

These are interesting dependency cases. In some sense there need to be warnings but they have to be handled at the origin. That they can be false positives or irrelevant because of what the dependency is can be troublesome. The standing can change over time, depending on changes in raylib, so suppressing them has to be cautious.

Is there a way to make them warnings rather than fail/error cases? Then we can hope the originator provides a fix assuming its not a false positive.

PS: There is a different though related issue concerning raylib externals. The raylib release cadence does not adjust to fixes and update in the external dependencies. Rather, a raylib release simply incorporates the latest dependency releases as they are at the time. There has not been, as far as I know, any effort to introduce a raylib security release to quickly promulgate fixes to security vulnerabilities in an external. Perhaps it has never come up.

b4yuan · 2023-11-03T19:53:12Z

@raysan5 @orcmid

At the moment I have excluded src/external//glfw from the CodeQL results. I've also incorporated the build script into the workflow. To remove fail_on_error.py is probably possible, but not without making the workflow file unnecessarily long and confusing. What do you think?

(I also can't view the results in the Security tab of raylib. Let me know if the errors from the external library are no longer being analyzed.)

raysan5 · 2023-11-03T20:14:46Z

@b4yuan Thanks for the review! I just run it again and it seems no error is detected but still it seems to report some error:

(I filtered by Severity: Error)

What implies removing fail_on_error.py? Reporting the process finished correctly despite containing the errors? If that's the case, I think it can be removed.

b4yuan · 2023-11-03T20:19:37Z

The analysis step will pass even if there are errors found, because it finished 'correctly'. fail_on_error.py parses the results, and if there are any errors, forces the entire workflow to fail. This may come in handy visually when inspecting PRs, weekly checks, etc.

(I will test this on our fork and see what might be going on). Alternatively, if you think this step is excessive, the script can be removed altogether.

jorgectf · 2023-11-04T11:56:54Z

👋 Late to the convo, but just wanted to highlight a setting that might replace the script sharing the same objective: https://github.com/raysan5/raylib/settings/security_analysis

Protection rules under Code Scanning:

raysan5 · 2023-11-04T20:08:30Z

@jorgectf Oh! Thank you very much for your answer! Actually I was wondering that same thing, it was a bit strange to me the need of an error-scanning script to set the fail in the process...

@b4yuan Given the new details provided, could then the script just be removed from the workflow?

b4yuan · 2023-11-06T17:39:17Z

Thanks @raysan5 @jorgectf for bringing this up. Updated with the removal of the python script.

.github/workflows/codeql.yml

raysan5 · 2023-11-06T17:50:19Z

Running workflow...

raysan5 · 2023-11-06T18:11:46Z

@b4yuan @jorgectf thank you very much for working on this improvement, reviewing it and explaining all the details. I'm merging it.

b4yuan added 3 commits October 11, 2023 19:40

raysan5 added a commit that referenced this pull request Nov 1, 2023

REVIEWED: Potential code issues reported by CodeQL #3476

64d64cc

raysan5 closed this Nov 3, 2023

raysan5 reopened this Nov 3, 2023

b4yuan and others added 2 commits November 3, 2023 15:38

Run CodeQL Analysis weekly + incorporate build script into WF

c3c02d5

Merge branch 'raysan5:master' into master

a408c63

remove: fail_on_error.py

282c2e0

raysan5 reviewed Nov 6, 2023

View reviewed changes

.github/workflows/codeql.yml Outdated Show resolved Hide resolved

update: codeql.yml

e9cf1a3

raysan5 merged commit b216e2f into raysan5:master Nov 6, 2023
2 checks passed

jakubmanczak mentioned this pull request Nov 18, 2023

raylib 5.0 Homebrew/homebrew-core#154772

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add CodeQL Workflow for Code Security Analysis #3476

Add CodeQL Workflow for Code Security Analysis #3476

b4yuan commented Oct 28, 2023 •

edited

Loading

raysan5 commented Oct 29, 2023

b4yuan commented Oct 29, 2023

raysan5 commented Nov 1, 2023 •

edited

Loading

raysan5 commented Nov 3, 2023 •

edited

Loading

b4yuan commented Nov 3, 2023

orcmid commented Nov 3, 2023 •

edited

Loading

raysan5 commented Nov 3, 2023

b4yuan commented Nov 3, 2023

b4yuan commented Nov 3, 2023

raysan5 commented Nov 3, 2023 •

edited

Loading

orcmid commented Nov 3, 2023 •

edited

Loading

b4yuan commented Nov 3, 2023

raysan5 commented Nov 3, 2023 •

edited

Loading

b4yuan commented Nov 3, 2023

jorgectf commented Nov 4, 2023 •

edited

Loading

raysan5 commented Nov 4, 2023

b4yuan commented Nov 6, 2023

raysan5 commented Nov 6, 2023

raysan5 commented Nov 6, 2023

Add CodeQL Workflow for Code Security Analysis #3476

Add CodeQL Workflow for Code Security Analysis #3476

Conversation

b4yuan commented Oct 28, 2023 • edited Loading

Summary

What is CodeQL

What this PR does

Validation

Using the workflow results

Is this a good idea?

Review of engineering hazards

raysan5 commented Oct 29, 2023

b4yuan commented Oct 29, 2023

raysan5 commented Nov 1, 2023 • edited Loading

raysan5 commented Nov 3, 2023 • edited Loading

b4yuan commented Nov 3, 2023

orcmid commented Nov 3, 2023 • edited Loading

raysan5 commented Nov 3, 2023

b4yuan commented Nov 3, 2023

b4yuan commented Nov 3, 2023

raysan5 commented Nov 3, 2023 • edited Loading

orcmid commented Nov 3, 2023 • edited Loading

b4yuan commented Nov 3, 2023

raysan5 commented Nov 3, 2023 • edited Loading

b4yuan commented Nov 3, 2023

jorgectf commented Nov 4, 2023 • edited Loading

raysan5 commented Nov 4, 2023

b4yuan commented Nov 6, 2023

raysan5 commented Nov 6, 2023

raysan5 commented Nov 6, 2023

b4yuan commented Oct 28, 2023 •

edited

Loading

raysan5 commented Nov 1, 2023 •

edited

Loading

raysan5 commented Nov 3, 2023 •

edited

Loading

orcmid commented Nov 3, 2023 •

edited

Loading

raysan5 commented Nov 3, 2023 •

edited

Loading

orcmid commented Nov 3, 2023 •

edited

Loading

raysan5 commented Nov 3, 2023 •

edited

Loading

jorgectf commented Nov 4, 2023 •

edited

Loading