New field added client-secret for Azure client secret ID. #555

Merged
1 change: 1 addition & 0 deletions docs/gh_pages/docs/entityclassifier.md
@@ -31,6 +31,7 @@ Below is the list of `entities` supported by Pebblo -
1. RSA Private Key
1. Google Account Private Key
1. Github Fine Grained Token
1. Azure Client Secret Key


User can get details of classified entities for their loader source files in Pebblo report.
1 change: 1 addition & 0 deletions pebblo/entity_classifier/README.md
@@ -25,6 +25,7 @@ And following Secret Entities:
10. RSA Private Key
11. Google Account Private Key
12. Github Fine Grained Token
13. Azure Client Secret Key

## How to use
Entity Classifier
2 changes: 1 addition & 1 deletion pebblo/entity_classifier/utils/config.py
@@ -11,7 +11,7 @@
"aws-access-key": ["aws_access_key", "aws_key", "access", "id", "api"],
"aws-secret-key": ["aws_secret_key", "secret"],
"azure-key-id": ["azure_key", "azure_key_id", "azure_id", "key"],
- "azure-client-secret": ["azure_client_secret", "client", "secret"],
+ "azure-client-secret": ["azure_client_secret", "client-secret", "client_secret"],
"google-api-key": ["google_api_key", "google_key", "google"],
}

2 changes: 1 addition & 1 deletion pebblo/entity_classifier/utils/regex_pattern.py
@@ -12,6 +12,6 @@
"aws-access-key": r"""\b((?:AKIA|ABIA|ACCA|ASIA)[0-9A-Z]{16})\b""",
"aws-secret-key": r"""\b([A-Za-z0-9+/]{40})[ \r\n'"\x60]""",
"azure-key-id": r"""(?i)(%s).{0,20}([a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12})""",
- "azure-client-secret": r"""\b(?i)(%s).{0,20}([a-z0-9_\.\-~]{34})\b""",
+ "azure-client-secret": r"""\b[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}\b""",
"google-api-key": r"""\bAIza[0-9A-Za-z\-_]{35}\b""",
}
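
As a quick sanity check of the new `azure-client-secret` pattern, the sketch below runs it against three lines taken from `tests/entity_classifier/test_data.py`. Only the regex comes from the diff. The names `AZURE_CLIENT_SECRET_PATTERN`, `samples`, and `matches` are illustrative and are not part of Pebblo.

```python
import re

# Pattern copied from the new "azure-client-secret" entry in regex_pattern.py.
# The constant name is hypothetical, chosen for this example only.
AZURE_CLIENT_SECRET_PATTERN = re.compile(
    r"""\b[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}\b"""
)

# Sample lines taken from tests/entity_classifier/test_data.py.
samples = [
    "Azure client_secret is de1d4a2d-d9fa-44f1-84bb-4f73c004afda",
    "Azure Client Secret - c4cb6f91-15a7-4e6d-a824-abcdef012345",
    "Slack Token is: xoxp-7676545380258-uygh",
]

# The pattern has no capture groups, so findall returns the full matched text.
matches = [AZURE_CLIENT_SECRET_PATTERN.findall(line) for line in samples]
for line, found in zip(samples, matches):
    print(f"{line!r} -> {found}")
```

The pattern is purely shape-based: it accepts any 8-4-4-4-12 hexadecimal string and does not look at the surrounding words, so both Azure lines match regardless of how the label is spelled.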
35 changes: 17 additions & 18 deletions tests/entity_classifier/mock_response.py
@@ -1,28 +1,27 @@
mock_input_text1_anonymize_snippet_true = """
<PERSON>'s SSN is <US_SSN>.
Sachin's SSN is <US_SSN>.
ITIN number <US_ITIN>
His AWS Access Key is: <AWS_ACCESS_KEY>.
And <PERSON> is: <GITHUB_TOKEN>
And Github Token is: <GITHUB_TOKEN>
"""

mock_input_text2_anonymize_snippet_true = """
Content
"<PERSON> board on <DATE_TIME> announced an interim dividend of Re 1 per equity share of the face value of Rs 2 each, i.e., a 50 per cent payout for <DATE_TIME> along with financial results for the <DATE_TIME> period of the company for <DATE_TIME>."
"<PERSON> reminded the board of the scheduled retreat coming up in <DATE_TIME>, and provided a drafted retreat schedule. The board provided feedback on the agenda and the consensus was that, outside of making a few minor changes, the committee should move forward as planned. No board action required."
"Wipros board on Friday, January 12 announced an interim dividend of Re 1 per equity share of the face value of Rs 2 each, i.e., a 50 per cent payout for the current financial year along with financial results for the October-December period of the company for the financial year ending March 2024."
"Roberts reminded the board of the scheduled retreat coming up in three months, and provided a drafted retreat schedule. The board provided feedback on the agenda and the consensus was that, outside of making a few minor changes, the committee should move forward as planned. No board action required."
"Claims: An adaptive pacing system for implantable cardiac devices, comprising a pulse generator, multiple sensing electrodes, a microprocessor-based control unit, a wireless communication module, and memory for dynamically adjusting pacing parameters based on real-time physiological data. The system of claim 1, wherein the adaptive pacing algorithms include rate-responsive pacing based on physical activity. The system of claim 1, further comprising an external monitoring system for remote data access and modification of pacing parameters."
"<PERSON>'s SSN is <US_SSN>. His passport ID is 5484880UA.
<PERSON>'s driver's license number is <NRP>.
<PERSON>'s bank account number is 70048841700216300.
His <NRP> express credit card number is <CREDIT_CARD>.
His UK IBAN Code is <IBAN_CODE>.
ITIN number <US_ITIN>.
Azure client secret : c4cb6f91-15a7-4e6d-a824-abcdef012345.
AWS Access Key is: <AWS_ACCESS_KEY>
AWS Secret Key is : <AWS_SECRET_KEY>
Github Token is: <GITHUB_TOKEN>
Google API key: <PERSON><PERSON> is: <SLACK_TOKEN>
Azure Client Secret - c4cb6f91-15a7-4e6d-a824-abcdef012345
<PERSON> - <SLACK_TOKEN>
"Sachin's SSN is <US_SSN>. His passport ID is 5484880UA.
Sachin's driver's license number is <US_DRIVER_LICENSE>.
Sachin's bank account number is <US_BANK_NUMBER>.
His American express credit card number is <CREDIT_CARD>.
His UK IBAN Code is <IBAN_CODE>.
ITIN number <US_ITIN>.
AWS Access Key is: <AWS_ACCESS_KEY>
AWS Secret Key is : <AWS_SECRET_KEY>Github Token is: <GITHUB_TOKEN>
Google API key: zaCELgL0imfnc8mVLWwsAawjYr4Rx-Af50DDqtlx
Slack Token is: <SLACK_TOKEN>
Slack Token - <SLACK_TOKEN>
Google API key- KLzaSyB_tWrbmfWx8g2bzL7Vhq7znuTUn0JPKmY"
IP Address - <IP_ADDRESS>
My IP Address - <IP_ADDRESS>
Azure client_secret is <AZURE_CLIENT_SECRET>
"""
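
The mock response above leaves `Azure client secret : c4cb6f91-...` untouched but rewrites `Azure client_secret is ...` to `<AZURE_CLIENT_SECRET>`. One way to read this is that the value is only treated as a secret when one of the context keywords from `config.py` appears nearby. The sketch below imitates that behaviour with a simple per-line keyword check. This is an assumption made for illustration, not Pebblo's actual matching logic, and `anonymize_line`, `GUID_PATTERN`, and `CONTEXT_KEYWORDS` are hypothetical names.

```python
import re

# Regex and keywords copied from the diff; the constant names are illustrative.
GUID_PATTERN = re.compile(
    r"\b[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}\b"
)
CONTEXT_KEYWORDS = ("azure_client_secret", "client-secret", "client_secret")


def anonymize_line(line: str) -> str:
    """Replace GUID-shaped values with a placeholder, but only on lines
    that contain one of the context keywords (assumed gating rule)."""
    lowered = line.lower()
    if not any(keyword in lowered for keyword in CONTEXT_KEYWORDS):
        return line
    return GUID_PATTERN.sub("<AZURE_CLIENT_SECRET>", line)


with_keyword = anonymize_line(
    "Azure client_secret is de1d4a2d-d9fa-44f1-84bb-4f73c004afda"
)
without_keyword = anonymize_line(
    "Azure client secret : c4cb6f91-15a7-4e6d-a824-abcdef012345."
)
print(with_keyword)
print(without_keyword)
```

Under this assumed rule, "client secret" written with a space does not count as context, which is consistent with the two Azure lines in the mock data.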
55 changes: 40 additions & 15 deletions tests/entity_classifier/test_data.py
@@ -10,26 +10,51 @@
"Wipros board on Friday, January 12 announced an interim dividend of Re 1 per equity share of the face value of Rs 2 each, i.e., a 50 per cent payout for the current financial year along with financial results for the October-December period of the company for the financial year ending March 2024."
"Roberts reminded the board of the scheduled retreat coming up in three months, and provided a drafted retreat schedule. The board provided feedback on the agenda and the consensus was that, outside of making a few minor changes, the committee should move forward as planned. No board action required."
"Claims: An adaptive pacing system for implantable cardiac devices, comprising a pulse generator, multiple sensing electrodes, a microprocessor-based control unit, a wireless communication module, and memory for dynamically adjusting pacing parameters based on real-time physiological data. The system of claim 1, wherein the adaptive pacing algorithms include rate-responsive pacing based on physical activity. The system of claim 1, further comprising an external monitoring system for remote data access and modification of pacing parameters."
"Sachin's SSN is 222-85-4836. His passport ID is 5484880UA.
Sachin's driver's license number is S9998888.
Sachin's bank account number is 70048841700216300.
His American express credit card number is 371449635398431.
His UK IBAN Code is AZ96AZEJ00000000001234567890.
ITIN number 993-77 0690.
Azure client secret : c4cb6f91-15a7-4e6d-a824-abcdef012345.
AWS Access Key is: AKIAQIPT4PDORIRTV6PH
AWS Secret Key is : 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
Github Token is: ghp_hgu657yiujgwfrtigu3ver238765tyuhygvtrder6t7gyvhbuy5e676578976tyghy76578uygfyfgcyturtdf
Google API key: zaCELgL0imfnc8mVLWwsAawjYr4Rx-Af50DDqtlx
Slack Token is: xoxp-7676545380258-uygh
Azure Client Secret - c4cb6f91-15a7-4e6d-a824-abcdef012345
Slack Token - xoxb-3204014939555-4519358291237-TTIf0243T8YFSAGEVr1wBrWE
"Sachin's SSN is 222-85-4836. His passport ID is 5484880UA.
Sachin's driver's license number is S9998888.
Sachin's bank account number is 70048841700216300.
His American express credit card number is 371449635398431.
His UK IBAN Code is AZ96AZEJ00000000001234567890.
ITIN number 993-77 0690.
AWS Access Key is: AKIAQIPT4PDORIRTV6PH
AWS Secret Key is : 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
Github Token is: ghp_hgu657yiujgwfrtigu3ver238765tyuhygvtrder6t7gyvhbuy5e676578976tyghy76578uygfyfgcyturtdf
Google API key: zaCELgL0imfnc8mVLWwsAawjYr4Rx-Af50DDqtlx
Slack Token is: xoxp-7676545380258-uygh
Slack Token - xoxb-3204014939555-4519358291237-TTIf0243T8YFSAGEVr1wBrWE
Google API key- KLzaSyB_tWrbmfWx8g2bzL7Vhq7znuTUn0JPKmY"
My IP Address - 10.55.60.61
My IP Address - 10.55.60.61
Azure client_secret is de1d4a2d-d9fa-44f1-84bb-4f73c004afda
"""

negative_data = """
Sachin's SSN is 222-85.
His AWS Access Key is: AKIPT4PDORIRTV6PH.
And Github Token is: ghpu657yiujgwfrtigu3ver238765tyuhygvtrder6t7gyvhbuy5e676578976tyghy76578uygfyfgcyturtdf
"""

tf_test_data = """
variable "client_secret" {
}

# We strongly recommend using the required_providers block to set the
# Azure Provider source and version being used
terraform {
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "~> 4.x"
}
}
}

# Configure the Microsoft Azure Provider
provider "azurerm" {
features {}

client_id = "00000000-0000-0000-0000-000000000000"
client_secret = "1131a1fc-8cee-4f3c-9b2f-6808f66f72a4"
tenant_id = "10000000-0000-0000-0000-000000000000"
subscription_id = "20000000-0000-0000-0000-000000000000"
}
"""
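
The Terraform fixture is a useful edge case because `client_id`, `tenant_id`, and `subscription_id` have the same GUID shape as `client_secret`. The sketch below shows why the keyword list in `config.py` matters: the regex alone finds four candidates, while a keyword-gated scan keeps only one. The gating rule (keyword on the same line) and the helper `find_client_secrets` are assumptions made for illustration, not the classifier's real implementation.

```python
import re

# Regex and keywords copied from the diff; names here are hypothetical.
GUID_PATTERN = re.compile(
    r"\b[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}\b"
)
CONTEXT_KEYWORDS = ("azure_client_secret", "client-secret", "client_secret")

# Provider block taken from tf_test_data in tests/entity_classifier/test_data.py.
TF_SNIPPET = """
provider "azurerm" {
  features {}

  client_id       = "00000000-0000-0000-0000-000000000000"
  client_secret   = "1131a1fc-8cee-4f3c-9b2f-6808f66f72a4"
  tenant_id       = "10000000-0000-0000-0000-000000000000"
  subscription_id = "20000000-0000-0000-0000-000000000000"
}
"""


def find_client_secrets(text: str) -> list[str]:
    """Return GUID-shaped values found on lines that mention a context keyword."""
    findings: list[str] = []
    for line in text.splitlines():
        if any(keyword in line.lower() for keyword in CONTEXT_KEYWORDS):
            findings.extend(GUID_PATTERN.findall(line))
    return findings


all_guids = GUID_PATTERN.findall(TF_SNIPPET)   # shape-only matching
secrets = find_client_secrets(TF_SNIPPET)      # shape plus keyword context
print(len(all_guids), secrets)
```

A test built on this fixture would presumably want to assert that only the `client_secret` value is reported, since flagging the tenant or subscription ID would be a false positive.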