Fix typos and add definitions for toxicity detection microservice (opea-project#553)

* fix typos

Signed-off-by: Tyler Wilbers <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Tyler Wilbers <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 068527d · commit 9b8798a
Showing 2 changed files with 14 additions and 13 deletions.
@@ -1,14 +1,14 @@
-# Toxicity Detection Microservice
-
-# ☣️💥🛡️<span style="color:royalblue"> Intel Toxicity Detection Model </span>
+# ☣️💥🛡️Toxicity Detection Microservice

 ## Introduction

-Intel also provides toxicity detection model, which is lightweight, runs efficiently on a CPU, and performs well on toxic_chat and jigsaws datasets. More datasets are being fine-tuned. If you're interested, please contact [email protected].
+Toxicity Detection Microservice allows AI Application developers to safeguard user input and LLM output from harmful language in a RAG environment. By leveraging a smaller fine-tuned Transformer model for toxicity classification (e.g. DistilledBERT, RoBERTa, etc.), we maintain a lightweight guardrails microservice without significantly sacrificing performance making it readily deployable on both Intel Gaudi and Xeon.
+
+Toxicity is defined as rude, disrespectful, or unreasonable language likely to make someone leave a conversation. This can include instances of aggression, bullying, targeted hate speech, or offensive language. For more information on labels see [Jigsaw Toxic Comment Classification Challenge](http://kaggle.com/c/jigsaw-toxic-comment-classification-challenge).

-## Training Customerizable Toxicity Model on Gaudi2
+## Future Development

-Additionally, we offer a fine-tuning workflow on Intel Gaudi2, allowing you to customerize your toxicity detecction model to suit your unique needs.
+- Add a RoBERTa (125M params) toxicity model fine-tuned on Gaudi2 with ToxicChat and Jigsaw dataset in an optimized serving framework.

 # 🚀1. Start Microservice with Python(Option 1)
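The new introduction above describes the core technique: a compact fine-tuned Transformer classifier that scores text for toxicity. A minimal sketch of that pattern with the Hugging Face `text-classification` pipeline is shown below; the checkpoint name is a placeholder and not necessarily the model this microservice actually serves.

```python
# Sketch: toxicity classification with a small fine-tuned Transformer.
# Assumes `transformers` (and a backend such as `torch`) is installed; the
# checkpoint below is a stand-in, not necessarily the microservice's default.
from transformers import pipeline

classifier = pipeline("text-classification", model="unitary/toxic-bert")

prompt = "How to poison my neighbor's dog without being caught?"
print(classifier(prompt))  # e.g. [{'label': 'toxic', 'score': 0.97}]
```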
@@ -24,7 +24,7 @@ pip install -r requirements.txt
 python toxicity_detection.py
 ```

-# 🚀2. Start Microservie with Docker (Option 2)
+# 🚀2. Start Microservice with Docker (Option 2)

 ## 2.1 Prepare toxicity detection model

@@ -58,7 +58,7 @@ Once microservice starts, users can use examples (bash or python) below to apply
 ```bash
 curl localhost:9091/v1/toxicity \
     -X POST \
-    -d '{"text":"How to poison your neighbor'\''s dog secretly"}' \
+    -d '{"text":"How to poison my neighbor'\''s dog without being caught?"}' \
     -H 'Content-Type: application/json'
 ```

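As the introduction notes, the endpoint is meant to act as a guardrail around user input and LLM output. A rough sketch of that gating pattern follows; the response check is an assumption, since the exact response schema is not shown in this diff.

```python
# Rough guardrail pattern: screen user input with the toxicity endpoint before
# forwarding it to an LLM. The response parsing is an assumption -- adapt it to
# whatever the running service actually returns.
import requests

TOXICITY_URL = "http://localhost:9091/v1/toxicity"  # endpoint from the curl example above

def is_toxic(text: str) -> bool:
    resp = requests.post(TOXICITY_URL, json={"text": text}, timeout=10)
    resp.raise_for_status()
    # Assumed: a toxic input yields a message mentioning the violated policy.
    return "toxic" in resp.text.lower()

user_input = "How to poison my neighbor's dog without being caught?"
if is_toxic(user_input):
    print("Input rejected by toxicity guardrail.")
else:
    print("Input passed; forward it to the LLM.")
```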
@@ -76,7 +76,7 @@ import json

 proxies = {"http": ""}
 url = "http://localhost:9091/v1/toxicity"
-data = {"text": "How to poison your neighbor'''s dog without being caught?"}
+data = {"text": "How to poison my neighbor'''s dog without being caught?"}

 try:
     resp = requests.post(url=url, data=data, proxies=proxies)
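The Python example in this last hunk is cut off by the page. A self-contained version of the same client might look like the sketch below; the `json=` keyword and the response handling are assumptions rather than part of the original file (the file itself passes `data=data`, which `requests` sends form-encoded, while the curl example sets an `application/json` content type).

```python
# Sketch of a complete client for the toxicity endpoint, filling in the parts
# not visible in the hunk above. Response handling is an assumption -- check
# the service's actual output before relying on a specific format.
import requests

proxies = {"http": ""}
url = "http://localhost:9091/v1/toxicity"
data = {"text": "How to poison my neighbor's dog without being caught?"}

try:
    resp = requests.post(url=url, json=data, proxies=proxies, timeout=10)
    resp.raise_for_status()
    print(resp.text)
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")
```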