repository plugin - support scanning historical git commits #66

Closed
jossef opened this issue May 11, 2023 · 4 comments · Fixed by #88
Labels: enhancement (New feature or request)

@jossef
Member

jossef commented May 11, 2023

repository.go is quite a straightforward implementation.

Suggesting that we use the Gitleaks engine to scan historical commits for secrets.

@jossef
Member Author

jossef commented Jun 6, 2023

For simplicity, I suggest scanning only the historical commits of the currently checked-out branch.

@bryantschuck bryantschuck added the enhancement New feature or request label Jun 6, 2023
@baruchiro
Contributor

> suggesting to use Gitleaks engine to scan historical commits if they contain secrets

Just noting that we are only partly based on Gitleaks. Gitleaks scans the Git history and finds secrets, but our code relies on Gitleaks only for the secrets themselves, not for the history traversal.

So your suggestion here amounts to taking just the Git traversal mechanism from Gitleaks. We can't take the whole process, because we must pass the content into our own secrets finder in order to support secrets that don't exist in Gitleaks.

@baruchiro
Contributor

baruchiro commented Jun 7, 2023

@jossef, @bryantschuck, @cx-monicac, @joaopedrocsilva I want to consult with you.

Using Gitleaks to search the history of the repository gives me the following results:

All the commits in the current branch
For each commit:
    List of changed files.
    For each file:
        List of changed fragments

So a specific fragment can be identified by {commit}_{filename}_{fragmentIndex}.
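The hierarchy and the fragment identifier above could be modelled roughly as follows. This is only a sketch: the types `Commit`, `FileChange`, and `Fragment` and the helper names are hypothetical and do not mirror Gitleaks' real types.

```go
package main

import "fmt"

// Hypothetical types mirroring the structure: commit -> files -> fragments.
type Fragment struct {
	Content string
}

type FileChange struct {
	Filename  string
	Fragments []Fragment
}

type Commit struct {
	Hash  string
	Files []FileChange
}

// fragmentKey builds the identifier {commit}_{filename}_{fragmentIndex}.
func fragmentKey(commit, filename string, index int) string {
	return fmt.Sprintf("%s_%s_%d", commit, filename, index)
}

// allFragmentKeys walks every commit, file, and fragment and returns one key per fragment.
func allFragmentKeys(commits []Commit) []string {
	var keys []string
	for _, c := range commits {
		for _, f := range c.Files {
			for i := range f.Fragments {
				keys = append(keys, fragmentKey(c.Hash, f.Filename, i))
			}
		}
	}
	return keys
}

func main() {
	commits := []Commit{{
		Hash: "abc123",
		Files: []FileChange{{
			Filename:  "config.yaml",
			Fragments: []Fragment{{Content: "password: hunter2"}, {Content: "token: xyz"}},
		}},
	}}
	for _, k := range allFragmentKeys(commits) {
		fmt.Println(k)
	}
}
```

One thing to keep in mind with this scheme: a filename that itself contains underscores makes the key ambiguous to parse back, so it works as an identifier but not as a reversible encoding.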

My question is: how should we define an Item for this structure, and what should its Source and ID be?
Do we want to collect all the fragments of a file into one Item per file per commit?
Or should each fragment be identified on its own?


I think that, from the user's perspective, the fragment doesn't matter. Users only need to know where the secret was first introduced, so we need to give them the commit and the filename.

The Item would then have Source: filename and ID: commitHash_filename.
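A minimal Go sketch of this proposal, assuming an `Item` with `Content`, `Source`, and `ID` fields. The struct layout, the `newFileItem` helper, and the choice to join fragments with newlines are all assumptions made for illustration, not the project's actual definitions.

```go
package main

import (
	"fmt"
	"strings"
)

// Item is a hypothetical version of the scanned unit discussed here.
type Item struct {
	Content string
	Source  string
	ID      string
}

// newFileItem collects all changed fragments of one file in one commit
// into a single Item, with Source = filename and ID = commitHash_filename.
func newFileItem(commitHash, filename string, fragments []string) Item {
	return Item{
		// Joining with newlines is an assumption; any separator would do.
		Content: strings.Join(fragments, "\n"),
		Source:  filename,
		ID:      commitHash + "_" + filename,
	}
}

func main() {
	item := newFileItem("abc123", "config.yaml", []string{"password: hunter2", "token: xyz"})
	fmt.Printf("%+v\n", item)
}
```

With one Item per file per commit, a finding points the user at the commit and the file, and the fragment index never surfaces.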

WDYT?

@baruchiro
Contributor

@bryantschuck Another question: do we need to keep supporting scans of the files that currently exist in a directory, or can we replace that with scanning the entire Git history (which ultimately covers the whole content of the directory plus historical changes that were removed over time)?

I think we should move the current implementation, which treats "git" as a regular directory, into a separate plugin that scans the filesystem and has nothing to do with Git features.

jossef pushed a commit that referenced this issue Jun 12, 2023
- refactor: repository plugin initialization
- change repository to filesystem
- repository plugin - support scanning historical git commits
  Fixes #66