New multitask 9in1 #207
Closed
171 commits
bc5fd39
Merge pull request #1 from deeppavlov/dev
dimakarp1996 48e6581
Update utils.py
dimakarp1996 8972e01
Update utils.py
dimakarp1996 8c39b84
Update requirements.txt
dimakarp1996 572df57
Update Dockerfile
dimakarp1996 4dfacef
Update README.md
dimakarp1996 a13e7c4
Update test.py
dimakarp1996 4c945fe
Update combined_classifier.json
dimakarp1996 c1ed326
Update server.py
dimakarp1996 6cf60b8
Update utils.py
dimakarp1996 380eabf
Update universal_templates.py
dimakarp1996 05514a5
Update dev_requirements.txt
dimakarp1996 c85894b
Update test_data.json
dimakarp1996 7dadef8
Update requirements.txt
dimakarp1996 cc7d873
Update requirements.txt
dimakarp1996 25733a6
Update requirements.txt
dimakarp1996 3241e89
Update requirements.txt
dimakarp1996 e07ece7
Update templates.py
dimakarp1996 20eb2f5
Update requirements.txt
dimakarp1996 a1842cd
Update requirements.txt
dimakarp1996 40caace
Update test_dialog.json
dimakarp1996 1abf00c
Update data.json
dimakarp1996 586b56c
Update scenario.py
dimakarp1996 d89d10d
Update test.py
dimakarp1996 cb7fae9
Update tests.json
dimakarp1996 0928cad
Update requirements.txt
dimakarp1996 d70a84f
Update skill.py
dimakarp1996 abc710b
Update requirements.txt
dimakarp1996 f4abd04
Update test_no_annotations.json
dimakarp1996 14259e8
Update combined_classifier.json
dimakarp1996 c9bf511
Codestyle using BLACK
dimakarp1996 68f6167
Update utils.py
dimakarp1996 21bbed9
Update test.py
dimakarp1996 a0ce333
Update test.py
dimakarp1996 fff4d77
Update server.py
dimakarp1996 cc4b381
Update test.py
dimakarp1996 f06c637
Update test.py
dimakarp1996 f68b7fb
Update test.py
dimakarp1996 93e8048
Update test.py
dimakarp1996 ea2f422
Update test.py
dimakarp1996 769623a
Update test.py
dimakarp1996 80ca056
Update test.py
dimakarp1996 48de78d
Update test.py
dimakarp1996 32fbf21
Update test.py
dimakarp1996 6d56299
Update server.py
dimakarp1996 60bca11
Update test.py
dimakarp1996 b804824
Update test.py
dimakarp1996 ac36219
Update test.py
dimakarp1996 0081a26
codestyle
e020aa2
Update utils.py
dimakarp1996 8e62af7
Renamed topic_classification, deleted unnesessary string
dimakarp1996 dd68e6b
Speeded up the combined classifier
bfb9f91
Update Dockerfile
dimakarp1996 f9ea06b
New version of DeepPavlov
9324ff5
Clean new combined - with fixed bug in checkout
cf1487a
Update README.md
dimakarp1996 01071a8
Further speeded up multitask BERT model
1751c02
Update Dockerfile
dimakarp1996 97aca70
I have done my best to speed up the multitask inference.
dimakarp1996 ef138ce
Update utils.py
dimakarp1996 68a716b
DeepPavlov version after several fixes. Also, new distil model ( not …
dimakarp1996 e5a50d0
hh
058e50d
Tests fixed
7c1e6c7
Merge branch 'new_multitask_9in1' into new_multitask_9in1_tmp
dimakarp1996 7f44802
Merge pull request #2 from dimakarp1996/new_multitask_9in1_tmp
dimakarp1996 8fc2598
Update test.py
dimakarp1996 7c1517a
codestyle
2f318bf
Returned cuda cache
dimakarp1996 9b8268e
integrate new commit
30987fc
Update Dockerfile
dimakarp1996 77d70dd
integrate new commit
486a432
Test change for memory profiling
dimakarp1996 7568eca
It should work much faster now
26adf16
It should work much faster now
f22a81e
It should work much faster now
a5549f8
It should work much faster now
d510ff2
Test editings to tackle test_dialog fail
2240724
Update server.py
dimakarp1996 621e9a2
Update combined_classifier.json
dimakarp1996 ad4b8a2
Merge pull request #3 from dimakarp1996/new_multitask_9in1_2
dimakarp1996 5850bf1
codestyle
02103ab
Update Dockerfile
dimakarp1996 5ab3110
Update combined_classifier.json
dimakarp1996 13f4840
Current test-passing version
b4d549c
Changed factoid criteria & postprocess for cobot topics and intents
2dbd218
Changed factoid criteria & postprocess for cobot topics and intents
579fdeb
Minor test fix - updated "random skills" list
0d1d0c6
codestyle
efbefe5
codestyle
7be7287
Update factoid.py
dimakarp1996 bebc9f5
Update connector.py
dimakarp1996 ab73f33
Utilize unified prob threshold in factoid skill.
dimakarp1996 4e4e9e5
Dilya's suggestion
dimakarp1996 f1e6f96
Dilya's suggestion
dimakarp1996 b3c9a35
Dilya's suggestions
dimakarp1996 80c074a
Dilya's suggestion
dimakarp1996 d67d998
Dilya's suggestion
dimakarp1996 917e34e
Dilya's comment
dimakarp1996 a7a1c04
Update Dockerfile
dimakarp1996 a6cb4f9
Update Dockerfile
dimakarp1996 3d26df4
Update combined_classifier.json
dimakarp1996 70238eb
Update README.md
dimakarp1996 fc18528
current changes
294ae09
Merge pull request #4 from dimakarp1996/new_multitask_9in1_tmp2
dimakarp1996 144527d
Codestyle
dimakarp1996 6892115
Added dependency to fix bug https://github.com/tiangolo/typer/issues/377
48cd44c
Merge pull request #5 from deeppavlov/dev
dimakarp1996 fca08d9
merge dev
dimakarp1996 c03d064
merge dev
dimakarp1996 3ac77c4
Update requirements.txt
dimakarp1996 f362cc7
Update requirements.txt
dimakarp1996 2ac7858
Update requirements.txt
dimakarp1996 14db758
Update requirements.txt
dimakarp1996 02adf0b
Update requirements.txt
dimakarp1996 41484a8
Update requirements.txt
dimakarp1996 65fbc98
Update requirements.txt
dimakarp1996 d7b713e
Update requirements.txt
dimakarp1996 cd73ef6
Update requirements.txt
dimakarp1996 8182a8e
Update requirements.txt
dimakarp1996 1478f0a
Update requirements.txt
dimakarp1996 c931578
Update requirements.txt
dimakarp1996 933dce2
Update requirements.txt
dimakarp1996 8b2e94a
Update requirements.txt
dimakarp1996 bc74828
Update requirements.txt
dimakarp1996 eb7981f
Update requirements.txt
dimakarp1996 4513d3f
Update requirements.txt
dimakarp1996 d3d1592
Update requirements.txt
dimakarp1996 b26466c
Update requirements.txt
dimakarp1996 8b33e45
Update requirements.txt
dimakarp1996 4367863
Fix broken dependencies
dimakarp1996 449be70
Update Dockerfile
dimakarp1996 d50472c
Update dev.yml
dimakarp1996 8bbdab7
Update combined_classifier.json
dimakarp1996 484defd
Update requirements.txt
dimakarp1996 d4ade70
Update requirements.txt
dimakarp1996 a4d9fcb
Update requirements.txt
dimakarp1996 4be6db4
Update requirements.txt
dimakarp1996 80d0ee0
Update requirements.txt
dimakarp1996 824d11d
Update requirements.txt
dimakarp1996 7a808df
Update requirements.txt
dimakarp1996 83c7e86
Update requirements.txt
dimakarp1996 6f880be
Update requirements.txt
dimakarp1996 a4e00e6
Update requirements.txt
dimakarp1996 36132bd
Update requirements.txt
dimakarp1996 966ccc1
Update requirements.txt
dimakarp1996 553051c
Update requirements.txt
dimakarp1996 67ac8a3
Update requirements.txt
dimakarp1996 4f5407d
Update requirements.txt
dimakarp1996 e0bff37
Added factoid threshold
dimakarp1996 d553ceb
Update connector.py
dimakarp1996 4a1b4af
Update server.py
dimakarp1996 7273818
Addressed Dilya's comments. Not tested yet
dimakarp1996 0cadb85
Codestyle
dimakarp1996 24c4db8
Codestyle
dimakarp1996 9b4bd4a
Suggested changes
6ffcd46
current version
9e454a0
Fixed sentence len
d34a342
Added setuptools dependency while numpy 1.18.0 not to fail on build
9c90a19
Merge pull request #9 from deeppavlov/dev
dimakarp1996 765ab40
Added setuptools dependency while numpy 1.18.0 not to fail on build
ab40e4e
Merge branch 'new_multitask_9in1' of https://github.com/dimakarp1996/…
ade5e0a
Still facing bug https://github.com/numpy/numpy/issues/22623 - restri…
3dcd471
h
b48785e
Try to fix bug in test_dialog in utils/analyze_downloads.py while imp…
dimakarp1996 6ae06bd
Merge branch 'dev' into new_multitask_9in1
dimakarp1996 bf0772e
Update utils.py
dimakarp1996 f60965c
Update utils.py
dimakarp1996 2803226
Threshold fixes as siggested by Dilya
0bca42b
Different thresholds for dp topics as suggested by Dilya
dimakarp1996 caf17fd
Cosmetic change
8acf2a3
Merge branch 'dev' into new_multitask_9in1
dimakarp1996
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,4 +6,4 @@ sentry-sdk==0.12.3 | |
spacy==3.0.5 | ||
click==7.1.2 | ||
jinja2<=3.0.3 | ||
Werkzeug<=2.0.3 | ||
Werkzeug<=2.0.3 |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,4 +7,4 @@ spacy==3.0.5 | |
click==7.1.2 | ||
pymorphy2==0.9.1 | ||
jinja2<=3.0.3 | ||
Werkzeug<=2.0.3 | ||
Werkzeug<=2.0.3 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,14 +1,8 @@ | ||
FROM deeppavlov/base-gpu:0.12.1 | ||
RUN pip install git+https://github.com/deeppavlov/[email protected] | ||
FROM deeppavlov/base-gpu:0.17.5 | ||
|
||
#RUN rm DeepPavlov | ||
RUN pip install git+https://github.com/deeppavlov/DeepPavlov.git@a53c42062e4bccf6ec63021ec6bd7b9fbe23f091 | ||
|
||
#Set up git lfs for your user account: git lfs install | ||
WORKDIR /base | ||
RUN rm -rf DeepPavlov | ||
RUN git clone https://github.com/dimakarp1996/DeepPavlov.git | ||
WORKDIR /base/DeepPavlov | ||
RUN git checkout pal-bert+ner | ||
|
||
ARG CONFIG | ||
|
||
|
@@ -21,9 +15,7 @@ RUN mkdir common | |
|
||
COPY annotators/combined_classification/ ./ | ||
COPY common/ common/ | ||
RUN ls /tmp | ||
|
||
RUN pip install -r requirements.txt | ||
ARG DATA_URL=http://files.deeppavlov.ai/alexaprize_data/pal_bert_7in1/model.pth.tar | ||
ADD $DATA_URL /tmp | ||
|
||
CMD gunicorn --workers=1 --bind 0.0.0.0:8087 --timeout=300 server:app |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,22 @@ | ||
BERT Base model for 6 tasks - cobot topics cobot dialogact topics cobot dialogact intent emotion sentiment toxic | ||
This model is based on the transformer-agnostic multitask neural architecture. It can solve several tasks simultaneously, almost as well as single-task models. | ||
|
||
The models were trained on the following datasets: | ||
|
||
**Factoid classification**: For the Factoid task, we used the same Yahoo ConversVsInfo dataset that was used to train the Dream socialbot in Alexa Prize. Note that the validation set in this task was the same as the test set. | ||
|
||
**Midas classification**: For the Midas task, we used the same Midas classification dataset that was used to train the Dream socialbot in Alexa Prize. Note that the validation set in this task was the same as the test set. | ||
|
||
**Emotion classification**: For the Emotion classification task, we used the emo\_go\_emotions dataset, with all 28 classes compressed into the seven basic emotions as in the original paper. Note that these 7 emotions are not exactly the same as the 7 emotions in the original Dream socialbot in Alexa Prize: one emotion differs (love vs. disgust), so the scores are not comparable with the original model. Note that this task is multiclass. | ||
|
||
**Topic classification**: For the Topic classification task, we used the dataset made by Dilyara Zharikova. The dataset was further filtered and improved for the final model version, to make the model suitable for DREAM. Note that the original topics model doesn’t account for these dataset changes (which also affected the number of classes), and thus its scores are not comparable with ours. | ||
|
||
**Sentiment classification** : For the Sentiment classification task, we used the Dynabench dataset (r1 + r2). | ||
|
||
**Toxic classification**: For the Toxic classification task, we used the dataset from Kaggle <https://www.kaggle.com/competitions/jigsaw-unintended-bias-in-toxicity-classification/data> with the 7 toxic classes that are of interest to us. Note that this task is multilabel. | ||
|
||
The model also contains 3 replacement models for Amazon services. | ||
|
||
The models (multitask and comparative single-task) were trained with an initial learning rate of 2e-5 (with validation patience 2, it could be dropped 2 times), batch size 32, the AdamW optimizer (betas (0.9, 0.99)), and early stopping on 3 epochs. The early-stopping criterion was the average accuracy over all tasks for multitask models, or the single-task accuracy for single-task models. | ||
|
||
This model (with a distilbert-base-uncased backbone) takes only 2439 MB for 9 tasks, whereas single-task models with the same backbone each take up almost the same memory (~2437 MB for each of these 9 tasks). | ||
|
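The early-stopping rule described in the README above can be sketched roughly as follows. This is a minimal illustration and not the training code from this PR: the names `average_accuracy` and `EarlyStopper` are hypothetical, and reading "early stop on 3 epochs" as a patience of 3 epochs without improvement is an assumption.

```python
from statistics import mean


def average_accuracy(per_task_accuracy):
    """Mean of the per-task accuracies (hypothetical helper).

    The README names this average as the stopping metric for multitask models;
    for a single-task model the dict would hold just one entry.
    """
    return mean(per_task_accuracy.values())


class EarlyStopper:
    """Stop after `patience` consecutive epochs without improvement (assumed reading)."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def update(self, metric):
        """Record one validation result; return True when training should stop."""
        if metric > self.best:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


if __name__ == "__main__":
    # Made-up validation accuracies for two tasks over five epochs.
    history = [
        {"emotion": 0.50, "sentiment": 0.60},
        {"emotion": 0.55, "sentiment": 0.65},
        {"emotion": 0.54, "sentiment": 0.64},
        {"emotion": 0.55, "sentiment": 0.65},
        {"emotion": 0.53, "sentiment": 0.63},
    ]
    stopper = EarlyStopper(patience=3)
    for epoch, accuracies in enumerate(history, start=1):
        if stopper.update(average_accuracy(accuracies)):
            print(f"stopping after epoch {epoch}")
            break
```

The learning-rate drops mentioned in the README (validation patience 2, at most 2 drops) are left out here; they would need a separate counter alongside the stopper.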
This will definitely look ugly, like a wall of text. Add a heading and bullet points using *
Done