New multitask 9in1 (Singlelabel version) #213

Merged
271 commits
1abf00c
Update data.json
dimakarp1996 Oct 10, 2022
586b56c
Update scenario.py
dimakarp1996 Oct 10, 2022
d89d10d
Update test.py
dimakarp1996 Oct 10, 2022
cb7fae9
Update tests.json
dimakarp1996 Oct 10, 2022
0928cad
Update requirements.txt
dimakarp1996 Oct 10, 2022
d70a84f
Update skill.py
dimakarp1996 Oct 10, 2022
abc710b
Update requirements.txt
dimakarp1996 Oct 10, 2022
f4abd04
Update test_no_annotations.json
dimakarp1996 Oct 10, 2022
14259e8
Update combined_classifier.json
dimakarp1996 Oct 10, 2022
c9bf511
Codestyle using BLACK
dimakarp1996 Oct 10, 2022
68f6167
Update utils.py
dimakarp1996 Oct 10, 2022
21bbed9
Update test.py
dimakarp1996 Oct 10, 2022
a0ce333
Update test.py
dimakarp1996 Oct 10, 2022
fff4d77
Update server.py
dimakarp1996 Oct 10, 2022
cc4b381
Update test.py
dimakarp1996 Oct 10, 2022
f06c637
Update test.py
dimakarp1996 Oct 10, 2022
f68b7fb
Update test.py
dimakarp1996 Oct 10, 2022
93e8048
Update test.py
dimakarp1996 Oct 10, 2022
ea2f422
Update test.py
dimakarp1996 Oct 10, 2022
769623a
Update test.py
dimakarp1996 Oct 10, 2022
80ca056
Update test.py
dimakarp1996 Oct 10, 2022
48de78d
Update test.py
dimakarp1996 Oct 10, 2022
32fbf21
Update test.py
dimakarp1996 Oct 10, 2022
6d56299
Update server.py
dimakarp1996 Oct 10, 2022
60bca11
Update test.py
dimakarp1996 Oct 10, 2022
b804824
Update test.py
dimakarp1996 Oct 10, 2022
ac36219
Update test.py
dimakarp1996 Oct 10, 2022
0081a26
codestyle
Oct 10, 2022
e020aa2
Update utils.py
dimakarp1996 Oct 10, 2022
8e62af7
Renamed topic_classification, deleted unnecessary string
dimakarp1996 Oct 11, 2022
dd68e6b
Speeded up the combined classifier
Oct 11, 2022
bfb9f91
Update Dockerfile
dimakarp1996 Oct 11, 2022
f9ea06b
New version of DeepPavlov
Oct 11, 2022
9324ff5
Clean new combined - with fixed bug in checkout
Oct 11, 2022
cf1487a
Update README.md
dimakarp1996 Oct 11, 2022
01071a8
Further speeded up multitask BERT model
Oct 11, 2022
1751c02
Update Dockerfile
dimakarp1996 Oct 11, 2022
97aca70
I have done my best to speed up the multitask inference.
dimakarp1996 Oct 11, 2022
ef138ce
Update utils.py
dimakarp1996 Oct 26, 2022
68a716b
DeepPavlov version after several fixes. Also, new distil model ( not …
dimakarp1996 Oct 30, 2022
e5a50d0
hh
Oct 30, 2022
058e50d
Tests fixed
Oct 31, 2022
7c1e6c7
Merge branch 'new_multitask_9in1' into new_multitask_9in1_tmp
dimakarp1996 Oct 31, 2022
7f44802
Merge pull request #2 from dimakarp1996/new_multitask_9in1_tmp
dimakarp1996 Oct 31, 2022
8fc2598
Update test.py
dimakarp1996 Oct 31, 2022
7c1517a
codestyle
Oct 31, 2022
2f318bf
Returned cuda cache
dimakarp1996 Oct 31, 2022
9b8268e
integrate new commit
Oct 31, 2022
30987fc
Update Dockerfile
dimakarp1996 Oct 31, 2022
77d70dd
integrate new commit
Oct 31, 2022
486a432
Test change for memory profiling
dimakarp1996 Oct 31, 2022
7568eca
It should work much faster now
Oct 31, 2022
26adf16
It should work much faster now
Oct 31, 2022
f22a81e
It should work much faster now
Oct 31, 2022
a5549f8
It should work much faster now
Oct 31, 2022
d510ff2
Test editings to tackle test_dialog fail
Nov 1, 2022
2240724
Update server.py
dimakarp1996 Nov 2, 2022
621e9a2
Update combined_classifier.json
dimakarp1996 Nov 2, 2022
ad4b8a2
Merge pull request #3 from dimakarp1996/new_multitask_9in1_2
dimakarp1996 Nov 2, 2022
5850bf1
codestyle
Nov 2, 2022
02103ab
Update Dockerfile
dimakarp1996 Nov 2, 2022
5ab3110
Update combined_classifier.json
dimakarp1996 Nov 2, 2022
13f4840
Current test-passing version
Nov 3, 2022
b4d549c
Changed factoid criteria & postprocess for cobot topics and intents
Nov 3, 2022
2dbd218
Changed factoid criteria & postprocess for cobot topics and intents
Nov 3, 2022
579fdeb
Minor test fix - updated "random skills" list
Nov 3, 2022
0d1d0c6
codestyle
Nov 3, 2022
efbefe5
codestyle
Nov 3, 2022
7be7287
Update factoid.py
dimakarp1996 Nov 3, 2022
bebc9f5
Update connector.py
dimakarp1996 Nov 3, 2022
ab73f33
Utilize unified prob threshold in factoid skill.
dimakarp1996 Nov 3, 2022
4e4e9e5
Dilya's suggestion
dimakarp1996 Nov 8, 2022
f1e6f96
Dilya's suggestion
dimakarp1996 Nov 8, 2022
b3c9a35
Dilya's suggestions
dimakarp1996 Nov 8, 2022
80c074a
Dilya's suggestion
dimakarp1996 Nov 8, 2022
d67d998
Dilya's suggestion
dimakarp1996 Nov 8, 2022
917e34e
Dilya's comment
dimakarp1996 Nov 8, 2022
a7a1c04
Update Dockerfile
dimakarp1996 Nov 8, 2022
a6cb4f9
Update Dockerfile
dimakarp1996 Nov 8, 2022
3d26df4
Update combined_classifier.json
dimakarp1996 Nov 8, 2022
70238eb
Update README.md
dimakarp1996 Nov 8, 2022
fc18528
current changes
Nov 8, 2022
294ae09
Merge pull request #4 from dimakarp1996/new_multitask_9in1_tmp2
dimakarp1996 Nov 8, 2022
144527d
Codestyle
dimakarp1996 Nov 8, 2022
6892115
Added dependency to fix bug https://github.com/tiangolo/typer/issues/377
Nov 9, 2022
48cd44c
Merge pull request #5 from deeppavlov/dev
dimakarp1996 Nov 14, 2022
fca08d9
merge dev
dimakarp1996 Nov 14, 2022
c03d064
merge dev
dimakarp1996 Nov 14, 2022
3ac77c4
Update requirements.txt
dimakarp1996 Nov 15, 2022
f362cc7
Update requirements.txt
dimakarp1996 Nov 15, 2022
2ac7858
Update requirements.txt
dimakarp1996 Nov 15, 2022
14db758
Update requirements.txt
dimakarp1996 Nov 15, 2022
02adf0b
Update requirements.txt
dimakarp1996 Nov 15, 2022
41484a8
Update requirements.txt
dimakarp1996 Nov 15, 2022
65fbc98
Update requirements.txt
dimakarp1996 Nov 15, 2022
d7b713e
Update requirements.txt
dimakarp1996 Nov 15, 2022
cd73ef6
Update requirements.txt
dimakarp1996 Nov 15, 2022
8182a8e
Update requirements.txt
dimakarp1996 Nov 15, 2022
1478f0a
Update requirements.txt
dimakarp1996 Nov 15, 2022
c931578
Update requirements.txt
dimakarp1996 Nov 15, 2022
933dce2
Update requirements.txt
dimakarp1996 Nov 15, 2022
8b2e94a
Update requirements.txt
dimakarp1996 Nov 15, 2022
bc74828
Update requirements.txt
dimakarp1996 Nov 15, 2022
eb7981f
Update requirements.txt
dimakarp1996 Nov 15, 2022
4513d3f
Update requirements.txt
dimakarp1996 Nov 15, 2022
d3d1592
Update requirements.txt
dimakarp1996 Nov 15, 2022
b26466c
Update requirements.txt
dimakarp1996 Nov 15, 2022
8b33e45
Update requirements.txt
dimakarp1996 Nov 15, 2022
4367863
Fix broken dependencies
dimakarp1996 Nov 15, 2022
b558434
Update combined_classifier.json
dimakarp1996 Nov 15, 2022
5cdf0eb
Update test.py
dimakarp1996 Nov 15, 2022
5c084df
Update server.py
dimakarp1996 Nov 15, 2022
97423e1
Update combined_classifier.json
dimakarp1996 Nov 15, 2022
d9c1f4f
Update server.py
dimakarp1996 Nov 15, 2022
96fd10c
Update test.py
dimakarp1996 Nov 15, 2022
32bb7ce
Passes skill tests now
Nov 16, 2022
99dfbaf
Merge pull request #7 from deeppavlov/dev
dimakarp1996 Nov 16, 2022
17d78bf
Update Dockerfile
dimakarp1996 Nov 16, 2022
2227716
Update Dockerfile
dimakarp1996 Nov 16, 2022
8635717
Update Dockerfile
dimakarp1996 Nov 16, 2022
86c5c9f
Update Dockerfile
dimakarp1996 Nov 16, 2022
01011df
Update Dockerfile
dimakarp1996 Nov 16, 2022
9adb374
Update server.py
dimakarp1996 Nov 16, 2022
183953a
Fix bug
Nov 16, 2022
449be70
Update Dockerfile
dimakarp1996 Nov 21, 2022
d50472c
Update dev.yml
dimakarp1996 Nov 21, 2022
8bbdab7
Update combined_classifier.json
dimakarp1996 Nov 21, 2022
484defd
Update requirements.txt
dimakarp1996 Nov 21, 2022
d4ade70
Update requirements.txt
dimakarp1996 Nov 21, 2022
a4d9fcb
Update requirements.txt
dimakarp1996 Nov 21, 2022
4be6db4
Update requirements.txt
dimakarp1996 Nov 21, 2022
80d0ee0
Update requirements.txt
dimakarp1996 Nov 21, 2022
824d11d
Update requirements.txt
dimakarp1996 Nov 21, 2022
7a808df
Update requirements.txt
dimakarp1996 Nov 21, 2022
83c7e86
Update requirements.txt
dimakarp1996 Nov 21, 2022
6f880be
Update requirements.txt
dimakarp1996 Nov 21, 2022
a4e00e6
Update requirements.txt
dimakarp1996 Nov 21, 2022
36132bd
Update requirements.txt
dimakarp1996 Nov 21, 2022
966ccc1
Update requirements.txt
dimakarp1996 Nov 21, 2022
553051c
Update requirements.txt
dimakarp1996 Nov 21, 2022
67ac8a3
Update requirements.txt
dimakarp1996 Nov 21, 2022
4f5407d
Update requirements.txt
dimakarp1996 Nov 21, 2022
e0bff37
Added factoid threshold
dimakarp1996 Nov 21, 2022
d553ceb
Update connector.py
dimakarp1996 Nov 21, 2022
4a1b4af
Update server.py
dimakarp1996 Nov 21, 2022
7273818
Addressed Dilya's comments. Not tested yet
dimakarp1996 Nov 21, 2022
0cadb85
Codestyle
dimakarp1996 Nov 21, 2022
24c4db8
Codestyle
dimakarp1996 Nov 21, 2022
9b4bd4a
Suggested changes
Nov 21, 2022
6ffcd46
current version
Nov 22, 2022
9e454a0
Fixed sentence len
Nov 22, 2022
d34a342
Added setuptools dependency while numpy 1.18.0 not to fail on build
Nov 22, 2022
9c90a19
Merge pull request #9 from deeppavlov/dev
dimakarp1996 Nov 22, 2022
765ab40
Added setuptools dependency while numpy 1.18.0 not to fail on build
Nov 22, 2022
ab40e4e
Merge branch 'new_multitask_9in1' of https://github.com/dimakarp1996/…
Nov 22, 2022
ade5e0a
Still facing bug https://github.com/numpy/numpy/issues/22623 - restri…
Nov 22, 2022
3dcd471
h
Nov 22, 2022
b48785e
Try to fix bug in test_dialog in utils/analyze_downloads.py while imp…
dimakarp1996 Nov 22, 2022
6ae06bd
Merge branch 'dev' into new_multitask_9in1
dimakarp1996 Nov 29, 2022
bf0772e
Update utils.py
dimakarp1996 Nov 30, 2022
f60965c
Update utils.py
dimakarp1996 Nov 30, 2022
2803226
Threshold fixes as suggested by Dilya
Nov 30, 2022
0bca42b
Different thresholds for dp topics as suggested by Dilya
dimakarp1996 Nov 30, 2022
caf17fd
Cosmetic change
Nov 30, 2022
8acf2a3
Merge branch 'dev' into new_multitask_9in1
dimakarp1996 Nov 30, 2022
1dbfd5c
Merge branch 'new_multitask_9in1' into new_multitask_9in1_singlelabel
dimakarp1996 Nov 30, 2022
1f21fa5
Update combined_classifier.json
dimakarp1996 Nov 30, 2022
db4275c
fixed tests
Nov 30, 2022
5bd2631
Merge branch 'new_multitask_9in1_singlelabel' of https://github.com/d…
Nov 30, 2022
72ef3ab
codestyle
Nov 30, 2022
6eef89c
Update utils.py
dimakarp1996 Dec 1, 2022
ac94a77
Merge pull request #12 from deeppavlov/dev
dimakarp1996 Dec 5, 2022
8c29808
Suggested phrase fix
dimakarp1996 Dec 5, 2022
a9923c4
Update dev.yml
dimakarp1996 Dec 5, 2022
8435e3b
Remove empty requirements changes
Dec 5, 2022
6f5f344
requirements fix
Dec 5, 2022
5f9fece
Update connector.py
dimakarp1996 Dec 5, 2022
a485721
Revert prev commit
dimakarp1996 Dec 6, 2022
fd078b5
Merge branch 'dev' into new_multitask_9in1_singlelabel
dimakarp1996 Dec 9, 2022
671fdf7
Names as in original
dimakarp1996 Dec 13, 2022
37715d9
added cache
dimakarp1996 Dec 13, 2022
5e4569f
Returned names to the original ones. Also added cache to the combined…
dimakarp1996 Dec 13, 2022
a3c754f
Update utils.py
dimakarp1996 Dec 13, 2022
8b16a22
Update utils.py
dimakarp1996 Dec 13, 2022
e51441e
Update animals.py
dimakarp1996 Dec 13, 2022
8d626bb
Update art.py
dimakarp1996 Dec 13, 2022
07cf492
Update books.py
dimakarp1996 Dec 13, 2022
0cf6b48
Update food.py
dimakarp1996 Dec 13, 2022
6e6bb0d
Update gaming.py
dimakarp1996 Dec 13, 2022
5e8d6c7
Update gaming.py
dimakarp1996 Dec 13, 2022
f397f70
Update music.py
dimakarp1996 Dec 13, 2022
b2e0824
Update art.py
dimakarp1996 Dec 13, 2022
924e7fb
Update travel.py
dimakarp1996 Dec 13, 2022
b0df285
Update science.py
dimakarp1996 Dec 13, 2022
e2a12d1
Update books.py
dimakarp1996 Dec 13, 2022
0224eda
Update food.py
dimakarp1996 Dec 13, 2022
a998777
Update gaming.py
dimakarp1996 Dec 13, 2022
2463025
Update gossip.py
dimakarp1996 Dec 13, 2022
04657b0
Update sport.py
dimakarp1996 Dec 13, 2022
2bc633b
Update sport.py
dimakarp1996 Dec 13, 2022
5f4d576
Update music.py
dimakarp1996 Dec 13, 2022
8567fc6
Update movies.py
dimakarp1996 Dec 13, 2022
a9c61d4
codestyle
Dec 13, 2022
7f377cd
Added functions for different topics to allow their quick finding by …
dimakarp1996 Dec 13, 2022
66547e9
Merge pull request #15 from deeppavlov/dev
dimakarp1996 Dec 13, 2022
4cfd0f7
codestyle
dimakarp1996 Dec 13, 2022
22d10e3
Merge branch 'dev' into new_multitask_9in1_singlelabel
dimakarp1996 Dec 16, 2022
296c18f
Classname fix in tests
dimakarp1996 Dec 19, 2022
0409dbb
suggested changes
Dec 20, 2022
82dbe7c
Merge branch 'new_multitask_9in1_singlelabel' of https://github.com/d…
Dec 20, 2022
bc662a6
Merge branch 'dev' of https://github.com/dimakarp1996/dream into new_…
Dec 20, 2022
68c2fbe
Merge pull request #16 from deeppavlov/dev
dimakarp1996 Dec 20, 2022
471d592
Merge branch 'dev' of https://github.com/dimakarp1996/dream into new_…
Dec 20, 2022
fd58462
Update test.py
dimakarp1996 Dec 20, 2022
c825ebc
Update test.py
dimakarp1996 Dec 20, 2022
bea2b3f
Update Dockerfile
dimakarp1996 Dec 21, 2022
e8b63ac
suggested changes
Dec 21, 2022
da002bc
suggested changes
Dec 21, 2022
372a549
Update Dockerfile
dimakarp1996 Dec 26, 2022
2a3a3c6
Update combined_classifier.json
dimakarp1996 Dec 26, 2022
07f38c7
Update Dockerfile
dimakarp1996 Dec 26, 2022
f5a765e
Now we can infer for all postannotations
dimakarp1996 Dec 26, 2022
9b39fd5
Update Dockerfile
dimakarp1996 Dec 26, 2022
977b1e7
Speed up passing tests of combined classifier(in old configuration) 2…
dimakarp1996 Dec 26, 2022
cd95e81
Codestyle. Tests of combined (in old config) are passed 2x faster tha…
Dec 26, 2022
2c48836
Decreased inference time by another 25% without quality loss
Dec 28, 2022
25bd3f8
Merge pull request #19 from deeppavlov/dev
dimakarp1996 Jan 9, 2023
f4c4a9b
fixed tests
Jan 18, 2023
5f762fa
codestyle
Jan 18, 2023
84b3ac5
New log format
dimakarp1996 Jan 18, 2023
c6c8290
Update combined_classifier.json
dimakarp1996 Jan 18, 2023
2e3d65b
Changed singletask sentence length from 64 to 32 to speed model up by…
dimakarp1996 Jan 18, 2023
5dcbdc7
Update server.py
dimakarp1996 Jan 18, 2023
7ae9dc5
Remove MIDAS from pipeline
dimakarp1996 Jan 18, 2023
33396d4
Remove MIDAS from pipeline
dimakarp1996 Jan 18, 2023
8ff0c80
remove MIDAS from WAIT_HOSTS
dimakarp1996 Jan 18, 2023
7f8538c
MIDAS from pipeline
dimakarp1996 Jan 18, 2023
85f7e2c
Removing MIDAS from WAIT_HOSTS and the pipeline
dimakarp1996 Jan 18, 2023
6cd5c9e
Removing MIDAS from WAIT_HOSTS and the pipeline
dimakarp1996 Jan 18, 2023
715637c
Merge pull request #20 from deeppavlov/dev
dimakarp1996 Jan 19, 2023
d2178bc
Not referring to midas annotations anymore and not waiting for midas
Jan 19, 2023
e9eb773
All combined classification timeouts set to 2 sec
Jan 19, 2023
2708460
Update dp_formatters.py
dimakarp1996 Jan 19, 2023
1f0477f
Update dp_formatters.py
dimakarp1996 Jan 19, 2023
fa7bdda
Midas classification is a dict
dimakarp1996 Jan 19, 2023
e61cf56
Get intents by function from common
dimakarp1996 Jan 20, 2023
4616ac1
Update README.md
dimakarp1996 Jan 20, 2023
cda8494
codestyle
Jan 20, 2023
037962f
Merge pull request #21 from deeppavlov/dev
dimakarp1996 Jan 23, 2023
1092750
Merge dev
Jan 23, 2023
14 changes: 3 additions & 11 deletions annotators/combined_classification/Dockerfile
@@ -1,14 +1,8 @@
FROM deeppavlov/base-gpu:0.12.1
RUN pip install git+https://github.com/deeppavlov/DeepPavlov.git@0.12.1
FROM deeppavlov/base-gpu:0.17.5

#RUN rm DeepPavlov
RUN pip install git+https://github.com/deeppavlov/DeepPavlov.git@1f0bda76c7c3fd6ccd4a1c0c880c0fffb73522d1

#Set up git lfs for your user account: git lfs install
WORKDIR /base
RUN rm -rf DeepPavlov
RUN git clone https://github.com/dimakarp1996/DeepPavlov.git
WORKDIR /base/DeepPavlov
RUN git checkout pal-bert+ner

ARG CONFIG

@@ -24,8 +18,6 @@ RUN pip install -r requirements.txt

COPY annotators/combined_classification/ ./
COPY common/ common/
RUN ls /tmp

ARG DATA_URL=http://files.deeppavlov.ai/alexaprize_data/pal_bert_7in1/model.pth.tar
ADD $DATA_URL /tmp
CMD gunicorn --workers=1 --bind 0.0.0.0:8087 --timeout=1200 server:app

26 changes: 25 additions & 1 deletion annotators/combined_classification/README.md
@@ -1 +1,25 @@
BERT Base model for 6 tasks - cobot topics cobot dialogact topics cobot dialogact intent emotion sentiment toxic
This model is based on the transformer-agnostic multitask neural architecture. It can solve several tasks simultaneously, almost as well as single-task models.

The models were trained on the following datasets:

**Factoid classification**: For the Factoid task, we used the same Yahoo ConversVsInfo dataset that was used to train the Dream socialbot in Alexa Prize. Note that the validation set in this task was equal to the test set.

**Midas classification**: For the Midas task, we used the same Midas classification dataset that was used to train the Dream socialbot in Alexa Prize. Note that the validation set in this task was equal to the test set.

**Emotion classification**: For the Emotion classification task, we used the emo\_go\_emotions dataset, with all 28 classes compressed into seven basic emotions, as in the original paper. Note that these 7 emotions are not exactly the same as the 7 emotions in the original Dream socialbot in Alexa Prize: 1 emotion differs (love vs. disgust), so the scores are not comparable with the original model. Note that this task is multiclass.

**Topic classification**: For the Topic classification task, we used the dataset made by Dilyara Zharikova. The dataset was further filtered and improved for the final model version, to make the model suitable for DREAM. Note that the original topics model does not account for these dataset changes (which also affected the number of classes), so its scores are not comparable with ours.

**Sentiment classification**: For the Sentiment classification task, we used the Dynabench dataset (r1 + r2).

**Toxic classification**: For the Toxic classification task, we used the Kaggle dataset <https://www.kaggle.com/competitions/jigsaw-unintended-bias-in-toxicity-classification/data> with the 7 toxic classes that are of interest to us. Note that this task is multilabel.

The model also contains 3 replacement models for Amazon services.
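
The notes above distinguish a multiclass task (Emotion, one label per utterance) from a multilabel task (Toxic, any number of labels per utterance). The sketch below illustrates that difference in how output logits are usually decoded. It is not the code from this PR: the function names, the example labels, and the 0.5 threshold are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Independent probability for a single label."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_single_label(logits, labels):
    """Multiclass decoding: exactly one label, the most probable one."""
    probs = softmax(logits)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best]


def decode_multilabel(logits, labels, threshold=0.5):
    """Multilabel decoding: every label whose own probability passes the threshold.

    The threshold value is an assumption, not taken from the PR.
    """
    return [label for label, x in zip(labels, logits) if sigmoid(x) >= threshold]


# Hypothetical label names, used only to make the example readable.
print(decode_single_label([0.1, 2.0, -1.0], ["label_a", "label_b", "label_c"]))
print(decode_multilabel([1.2, -0.3, 0.0], ["label_x", "label_y", "label_z"]))
```

A multilabel head can return an empty list when no label passes the threshold, while a single-label head always returns exactly one class.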

The models (multitask and comparative single-task) were trained with an initial learning rate of 2e-5 (with validation patience 2, it could be dropped up to 2 times), batch size 32, the AdamW optimizer (betas (0.9, 0.99)), and early stopping on 3 epochs. The early stopping criterion was the average accuracy over all tasks for multitask models, or the single-task accuracy for single-task models.
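
One possible reading of the schedule above is sketched here in plain Python. It is an illustration, not the training code behind this PR: the function names are hypothetical, the learning-rate drop factor of 0.5 is an assumption (the README does not state it), and so is the choice to reset the learning-rate counter after each drop.

```python
def mean_accuracy(task_accuracies):
    """Stopping criterion for a multitask model: average accuracy over all tasks."""
    return sum(task_accuracies.values()) / len(task_accuracies)


def simulate_schedule(val_scores, lr=2e-5, lr_patience=2, max_lr_drops=2,
                      stop_patience=3, drop_factor=0.5):
    """Replay per-epoch validation scores and return (epoch, score, lr) tuples.

    drop_factor is an assumed value; the README gives only the initial
    learning rate, the patience, and the maximum number of drops.
    """
    best = float("-inf")
    stale_for_lr = 0    # epochs without improvement since the last drop
    stale_for_stop = 0  # epochs without improvement since the best score
    drops = 0
    log = []
    for epoch, score in enumerate(val_scores, start=1):
        if score > best:
            best = score
            stale_for_lr = 0
            stale_for_stop = 0
        else:
            stale_for_lr += 1
            stale_for_stop += 1
        if stale_for_lr >= lr_patience and drops < max_lr_drops:
            lr *= drop_factor
            drops += 1
            stale_for_lr = 0
        log.append((epoch, score, lr))
        if stale_for_stop >= stop_patience:
            break  # early stopping
    return log


# Hypothetical validation scores, one per epoch.
for epoch, score, lr in simulate_schedule([0.70, 0.72, 0.71, 0.715, 0.719, 0.71]):
    print(epoch, score, lr)
```

For a multitask run, each entry of `val_scores` would be the output of `mean_accuracy` for that epoch. For a single-task run, it would be that task's own accuracy.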

This model (with a distilbert-base-uncased backbone) takes only 2439 Mb for 9 tasks, whereas single-task models with the same backbone each take up almost the same memory (~2437 Mb for each of these 9 tasks).
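
The comparison above can be made concrete with a short calculation. It assumes the nine single-task models would be loaded side by side and that their memory use simply adds up, which the README does not state explicitly. The variable names are illustrative.

```python
MULTITASK_MB = 2439    # one multitask model covering all 9 tasks (from the README)
SINGLE_TASK_MB = 2437  # approximate size of one single-task model (from the README)
NUM_TASKS = 9

# Assumption: separate models do not share any memory.
separate_total_mb = SINGLE_TASK_MB * NUM_TASKS
saved_mb = separate_total_mb - MULTITASK_MB
ratio = separate_total_mb / MULTITASK_MB

print(f"Nine single-task models: {separate_total_mb} Mb")
print(f"One multitask model:     {MULTITASK_MB} Mb")
print(f"Saved: {saved_mb} Mb (about {ratio:.2f}x less memory)")
```

Under that assumption, the nine separate models would need about 21933 Mb in total, roughly nine times the footprint of the multitask model.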

CPU memory use of this model is 2909 Mb.

