Releases: explosion/spacy-models

mk_core_news_md-3.7.0

01 Oct 08:59
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: ddffbbf27be644f8ba92a7ab70b35e2447639cfd65d6891a4ca05931ddbf88c6
Checksum .whl: 06ab12bc3a833b7b7562aeb839740211a04f913be7562e1e8520fd440cc2e9de
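
Each checksum above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. Assuming that is the algorithm used, a downloaded archive could be checked along the lines of the sketch below. The helper names and the file name in the usage comment are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with a published checksum, ignoring case and padding."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (hypothetical local file name):
# matches_checksum(
#     "mk_core_news_md-3.7.0.tar.gz",
#     "ddffbbf27be644f8ba92a7ab70b35e2447639cfd65d6891a4ca05931ddbf88c6",
# )
```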

Details: https://spacy.io/models/mk#mk_core_news_md

Macedonian pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.

| Feature | Description |
| --- | --- |
| Name | mk_core_news_md |
| Version | 3.7.0 |
| spaCy | >=3.7.0,<3.8.0 |
| Default Pipeline | morphologizer, parser, attribute_ruler, lemmatizer, ner |
| Components | morphologizer, parser, senter, attribute_ruler, lemmatizer, ner |
| Vectors | 274587 keys, 20000 unique vectors (300 dimensions) |
| Sources | Macedonian Corpus (Damjan Zlatinov, Melanija Gerasimovska, Borijan Georgievski, Marija Todosovska) |
| | spaCy lookups data (Explosion) |
| | Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia) (Explosion) |
| License | CC BY-SA 4.0 |
| Author | Explosion |
| Model size | 42 MB |
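
The spaCy entry above is a version range: the package expects a spaCy release from 3.7.0 up to, but not including, 3.8.0. The sketch below checks a version string against such a range. It is a simplified illustration that only understands fully specified numeric versions such as `3.7.2`; the function names are hypothetical, and real packaging tools handle many more cases.

```python
import operator

_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '3.7.2' into (3, 7, 2); only plain numeric versions are handled."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, spec):
    """Check a version against a comma-separated spec such as '>=3.7.0,<3.8.0'."""
    current = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        # Try two-character operators before one-character ones.
        for symbol in (">=", "<=", "==", ">", "<"):
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                if not _OPS[symbol](current, bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


print(satisfies("3.7.2", ">=3.7.0,<3.8.0"))  # True
print(satisfies("3.8.0", ">=3.7.0,<3.8.0"))  # False
```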

Label Scheme

View label scheme (54 labels for 3 components)
Component Labels
morphologizer POS=PROPN, POS=AUX, POS=ADJ, POS=NOUN, POS=ADP, POS=PUNCT, POS=CONJ, POS=NUM, POS=VERB, POS=PRON, POS=ADV, POS=SCONJ, POS=PART, POS=SYM, _, POS=SPACE, POS=X, POS=INTJ
parser ROOT, advmod, att, aux, cc, dep, det, dobj, iobj, neg, nsubj, pobj, poss, pozm, pozv, prep, punct, relcl
ner CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART
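
The summary line reports 54 labels for 3 components, and counting the entries in each row reproduces that total. The sketch below does the count by splitting each row on `", "`. That is safe for this label set, but it would miscount schemes whose feature values themselves contain commas.

```python
# Label rows copied from the table above.
LABELS = {
    "morphologizer": (
        "POS=PROPN, POS=AUX, POS=ADJ, POS=NOUN, POS=ADP, POS=PUNCT, POS=CONJ, "
        "POS=NUM, POS=VERB, POS=PRON, POS=ADV, POS=SCONJ, POS=PART, POS=SYM, _, "
        "POS=SPACE, POS=X, POS=INTJ"
    ),
    "parser": (
        "ROOT, advmod, att, aux, cc, dep, det, dobj, iobj, neg, nsubj, pobj, "
        "poss, pozm, pozv, prep, punct, relcl"
    ),
    "ner": (
        "CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, "
        "ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART"
    ),
}


def count_labels(rows):
    """Return the number of comma-separated labels per component."""
    return {component: len(labels.split(", ")) for component, labels in rows.items()}


counts = count_labels(LABELS)
print(counts)                # {'morphologizer': 18, 'parser': 18, 'ner': 18}
print(sum(counts.values()))  # 54
```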

Accuracy

| Type | Score |
| --- | --- |
| TOKEN_ACC | 100.00 |
| TOKEN_P | 100.00 |
| TOKEN_R | 100.00 |
| TOKEN_F | 100.00 |
| SENTS_P | 80.00 |
| SENTS_R | 67.53 |
| SENTS_F | 73.24 |
| DEP_UAS | 67.71 |
| DEP_LAS | 52.01 |
| ENTS_P | 74.72 |
| ENTS_R | 74.47 |
| ENTS_F | 74.60 |
| POS_ACC | 92.61 |
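
The `_P`, `_R` and `_F` rows appear to be precision, recall and their harmonic mean (F1). Assuming that reading, the F values can be recomputed from the other two, as in the sketch below. Because the published precision and recall are already rounded, a recomputed value can differ in the last digit: the entity scores above give about 74.59 rather than the listed 74.60.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same 0-100 scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Sentence segmentation scores from the table above.
print(round(f_score(80.00, 67.53), 2))  # 73.24
```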

Installation

pip install spacy
python -m spacy download mk_core_news_md
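
Once the package is installed, it can be loaded by name. The table above lists `senter` under Components but not under Default Pipeline, which suggests it ships disabled. The sketch below assumes that is the case: it excludes the parser and enables `senter` for sentence splitting. The function name is hypothetical, and running it requires both spaCy and the model package.

```python
def split_sentences(text, model_name="mk_core_news_md"):
    """Split text into sentences using the pipeline's senter component."""
    # Imported lazily so the function can be defined without spaCy installed.
    import spacy

    # Leave out the parser, then switch on the senter that is assumed to ship disabled.
    nlp = spacy.load(model_name, exclude=["parser"])
    nlp.enable_pipe("senter")
    return [sent.text for sent in nlp(text).sents]
```

With the parser excluded, sentence boundaries come from `senter` alone, which is typically faster when dependency parses are not needed.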

mk_core_news_lg-3.7.0

01 Oct 08:59
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: 1d537095831c7861e9a3ccddc05da9b95c71b4f37f7627f15a6c4304e4f5503b
Checksum .whl: 3fcd0244d5b2d5b4f74ccff27b495814bfefcfb5e7c96b49a44e86a51d2e0408

Details: https://spacy.io/models/mk#mk_core_news_lg

Macedonian pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.

| Feature | Description |
| --- | --- |
| Name | mk_core_news_lg |
| Version | 3.7.0 |
| spaCy | >=3.7.0,<3.8.0 |
| Default Pipeline | morphologizer, parser, attribute_ruler, lemmatizer, ner |
| Components | morphologizer, parser, senter, attribute_ruler, lemmatizer, ner |
| Vectors | 274587 keys, 274587 unique vectors (300 dimensions) |
| Sources | Macedonian Corpus (Damjan Zlatinov, Melanija Gerasimovska, Borijan Georgievski, Marija Todosovska) |
| | spaCy lookups data (Explosion) |
| | Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia) (Explosion) |
| License | CC BY-SA 4.0 |
| Author | Explosion |
| Model size | 310 MB |
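
The md and lg packages list the same 274587 keys but differ in how many distinct vectors back them: 20000 for md, one per key for lg. The sketch below gives a rough estimate of the raw vector table size for each. It assumes 4-byte float32 values, which is not stated here, so treat the figures as indicative only. They also need not match the listed model sizes, which cover the whole package.

```python
def vector_table_bytes(n_vectors, n_dims, bytes_per_value=4):
    """Raw size of a dense vector table (float32 assumed by default)."""
    return n_vectors * n_dims * bytes_per_value


md_bytes = vector_table_bytes(20_000, 300)
lg_bytes = vector_table_bytes(274_587, 300)

# Sizes in millions of bytes.
print(md_bytes / 1e6)  # 24.0
print(lg_bytes / 1e6)  # 329.5044
```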

Label Scheme

View label scheme (54 labels for 3 components)
Component Labels
morphologizer POS=PROPN, POS=AUX, POS=ADJ, POS=NOUN, POS=ADP, POS=PUNCT, POS=CONJ, POS=NUM, POS=VERB, POS=PRON, POS=ADV, POS=SCONJ, POS=PART, POS=SYM, _, POS=SPACE, POS=X, POS=INTJ
parser ROOT, advmod, att, aux, cc, dep, det, dobj, iobj, neg, nsubj, pobj, poss, pozm, pozv, prep, punct, relcl
ner CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART

Accuracy

| Type | Score |
| --- | --- |
| TOKEN_ACC | 100.00 |
| TOKEN_P | 100.00 |
| TOKEN_R | 100.00 |
| TOKEN_F | 100.00 |
| SENTS_P | 70.42 |
| SENTS_R | 64.94 |
| SENTS_F | 67.57 |
| DEP_UAS | 67.84 |
| DEP_LAS | 52.98 |
| ENTS_P | 75.06 |
| ENTS_R | 75.06 |
| ENTS_F | 75.06 |
| POS_ACC | 93.09 |

Installation

pip install spacy
python -m spacy download mk_core_news_lg

lt_core_news_sm-3.7.0

01 Oct 08:59
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: 4b3af0a6e692325b1368d17a94ab8d68d654215d84c5aaac58820dc3da5504c8
Checksum .whl: 8ff55726043b411b9547824b465487901377ac12e0511b2ed579cec8d111fe69

Details: https://spacy.io/models/lt#lt_core_news_sm

Lithuanian pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, parser, lemmatizer (trainable_lemmatizer), senter, ner.

| Feature | Description |
| --- | --- |
| Name | lt_core_news_sm |
| Version | 3.7.0 |
| spaCy | >=3.7.0,<3.8.0 |
| Default Pipeline | tok2vec, morphologizer, tagger, parser, lemmatizer, attribute_ruler, ner |
| Components | tok2vec, morphologizer, tagger, parser, lemmatizer, senter, attribute_ruler, ner |
| Vectors | 0 keys, 0 unique vectors (0 dimensions) |
| Sources | UD Lithuanian ALKSNIS v2.8 (Utka, Andrius; Rimkutė, Erika; Bielinskienė, Agnė; Kovalevskaitė, Jolanta; Boizou, Loïc; Aleksandravičiūtė, Gabrielė; Brokaitė, Kristina; Zeman, Daniel; Perkova, Natalia; Griciūtė, Bernadeta) |
| | TokenMill NER Corpus (TokenMill) |
| License | CC BY-SA 4.0 |
| Author | Explosion |
| Model size | 12 MB |

Label Scheme

View label scheme (1669 labels for 4 components)
Component Labels
morphologizer Definite=Ind|Gender=Neut|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Pass, POS=VERB|Polarity=Pos|VerbForm=Inf, Case=Gen|Definite=Def|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ, Case=Gen|Gender=Fem|Number=Plur|POS=NOUN, Case=Gen|Gender=Masc|Number=Plur|POS=NOUN, Case=Acc|Gender=Masc|Number=Plur|POS=NOUN, POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Ger, Case=Gen|Gender=Masc|Number=Sing|POS=NOUN, POS=CCONJ, POS=PUNCT, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|PronType=Ind, Case=Acc|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ, Case=Acc|Gender=Fem|Number=Plur|POS=NOUN, Case=Loc|Gender=Fem|Number=Plur|POS=NOUN, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Act, Case=Acc|Gender=Masc|Number=Sing|POS=NOUN, Case=Acc|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind, Case=Acc|Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Dem, Case=Acc|Gender=Fem|Number=Sing|POS=NOUN, Aspect=Perf|Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin, Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Act, Case=Gen|Gender=Fem|Number=Sing|POS=NOUN, Case=Nom|Gender=Masc|Number=Sing|POS=NOUN, Abbr=Yes|POS=X, AdpType=Prep|Case=Gen|POS=ADP, Case=Gen|Gender=Masc|Number=Sing|POS=PROPN, Case=Nom|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ, Case=Nom|Gender=Fem|Number=Plur|POS=NOUN, Case=Ins|Gender=Masc|Number=Sing|POS=NOUN, Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Case=Acc|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Mood=Cnd|POS=VERB|Person=3|Polarity=Pos|VerbForm=Fin, Case=Gen|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind, Case=Nom|Gender=Masc|Number=Plur|POS=NOUN, Mood=Ind|Number=Plur|POS=AUX|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Degree=Pos|POS=ADV, Case=Nom|Definite=Ind|Degree=Pos|Gender=Masc|Number=Plur|POS=ADJ, 
Degree=Pos|Hyph=Yes|POS=ADV, Hyph=Yes|POS=X, Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs, POS=SCONJ, Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Case=Nom|Definite=Ind|POS=PRON|PronType=Ind, Case=Nom|Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Dem, Case=Nom|Gender=Fem|Number=Sing|POS=NOUN, Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Reflex=Yes|Tense=Pres|VerbForm=Fin, Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Case=Acc|Definite=Ind|POS=PRON|PronType=Ind, POS=PART, Case=Gen|Definite=Ind|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ, Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Reflex=Yes|Tense=Pres|VerbForm=Fin, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Dem, Case=Ins|Gender=Masc|NumForm=Word|NumType=Card|POS=NUM, Case=Ins|Gender=Masc|Number=Plur|POS=NOUN, Case=Ins|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Definite=Ind|Gender=Neut|POS=DET|PronType=Dem, Mood=Ind|POS=AUX|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Definite=Ind|Degree=Pos|Gender=Neut|POS=ADJ, Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|PronType=Ind, Case=Nom|Gender=Masc|Number=Sing|POS=PROPN, Case=Loc|Definite=Ind|Gender=Fem|Number=Sing|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Pass, Case=Gen|Definite=Def|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Case=Loc|Gender=Fem|Number=Sing|POS=NOUN, Aspect=Perf|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Ger, Case=Dat|Gender=Masc|Number=Sing|POS=NOUN, Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin, Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin, POS=VERB|Polarity=Pos|Reflex=Yes|VerbForm=Inf, Degree=Cmp|POS=ADV, Case=Gen|Gender=Fem|Number=Sing|POS=PROPN, Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin, Case=Gen|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem, 
Mood=Ind|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin, Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin, Case=Gen|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|PronType=Ind, Definite=Ind|NumForm=Digit|POS=NUM, Case=Gen|Gender=Masc|NumForm=Word|NumType=Card|Number=Plur|POS=NUM, Case=Nom|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Case=Acc|Gender=Masc|NumForm=Word|NumType=Card|Number=Sing|POS=NUM, Case=Dat|Definite=Ind|Number=Sing|POS=PRON|Person=1|PronType=Prs, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Tot, Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Pass, Case=Loc|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind, Case=Nom|Gender=Masc|NumForm=Word|NumType=Card|Number=Plur|POS=NUM, NumForm=Word|NumType=Card|POS=NUM, Case=Nom|Definite=Ind|Gender=Fem|Hyph=Yes|Number=Plur|POS=DET|PronType=Dem, Mood=Ind|Number=Sing|POS=VERB|Person=1|Polarity=Pos|Tense=Pres|VerbForm=Fin, Case=Nom|Definite=Def|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Int,Rel, Case=Acc|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem, Case=Dat|Gender=Masc|Number=Plur|POS=NOUN, Case=Nom|Gender=Fem|Number=Sing|POS=PROPN, Case=Nom|Definite=Ind|Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs, Hyph=Yes|POS=PART, Mood=Cnd|Number=Sing|POS=AUX|Person=3|Polarity=Pos|VerbForm=Fin, Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs, Case=Loc|Gender=Masc|Number=Sing|POS=NOUN, AdpType=Prep|Case=Acc|POS=ADP, Mood=Cnd|Number=Sing|POS=VERB|Person=3|Polarity=Pos|VerbForm=Fin, Case=Gen|Definite=Def|Gender=Fem|NumForm=Combi|NumType=Ord|Number=Sing|POS=NUM, Case=Nom|Definite=Def|Gender=Fem|NumForm=Word|NumType=Ord|Number=Sing|POS=NUM, Aspect=Perf|Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin, Case=Acc|Definite=Ind|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ, 
Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin, Definite=Ind|NumForm=Roman|POS=NUM, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs, Case=Gen|Definite=Ind|Degree=Pos|Gender=Masc|Number=Plur|POS=ADJ, Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Part|Voice=Act, Case=Gen|Definite=Ind|Gender=Fem|NumForm=Word|NumType=Ord|Number=Sing|POS=NUM, Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem, Case=Nom|Definite=Ind|Gender=Masc|Mood=Nec|Number=Sing|POS=VERB|Polarity=Pos|VerbForm=Part, Case=Nom|Definite=Ind|Degree=Cmp|Gender=Fem|Number=Plur|POS=ADJ, Aspect=Perf|Case=Acc|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Part|Voice=Act, Case=Dat|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs, Aspect=Perf|Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin, Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Act, Case=Gen|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ, Case=Nom|Definite=Ind|Gender=Fem|Number=Plur|POS=DET|PronType=Int,Rel, Degree=Sup|POS=ADV, `Case=Nom|Definite=Ind|Gender=Fem|Number=Plur|POS=VE...
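
Each morphologizer label in the list above is a set of `Feature=Value` pairs joined by `|`, with the part of speech included as a `POS` feature. A minimal sketch of splitting one label into a dictionary follows; the function name is hypothetical. Multi-valued features such as `PronType=Int,Rel` are kept as a single string.

```python
def parse_morph_label(label):
    """Split 'Case=Gen|Number=Plur|POS=NOUN' into a feature dictionary."""
    features = {}
    for pair in label.split("|"):
        name, _, value = pair.partition("=")
        features[name] = value
    return features


print(parse_morph_label("Case=Gen|Gender=Fem|Number=Plur|POS=NOUN"))
# {'Case': 'Gen', 'Gender': 'Fem', 'Number': 'Plur', 'POS': 'NOUN'}
```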

lt_core_news_md-3.7.0

01 Oct 08:59
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: e8a5ba3577190133eaa9a9a6764dad4a40a72008021a59f04942135e64d42784
Checksum .whl: bf793c09c47795fad7fe1a4d310a885abf6df662499ffa4be6102649782a4f82

Details: https://spacy.io/models/lt#lt_core_news_md

Lithuanian pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, parser, lemmatizer (trainable_lemmatizer), senter, ner.

| Feature | Description |
| --- | --- |
| Name | lt_core_news_md |
| Version | 3.7.0 |
| spaCy | >=3.7.0,<3.8.0 |
| Default Pipeline | tok2vec, morphologizer, tagger, parser, lemmatizer, attribute_ruler, ner |
| Components | tok2vec, morphologizer, tagger, parser, lemmatizer, senter, attribute_ruler, ner |
| Vectors | 500000 keys, 20000 unique vectors (300 dimensions) |
| Sources | UD Lithuanian ALKSNIS v2.8 (Utka, Andrius; Rimkutė, Erika; Bielinskienė, Agnė; Kovalevskaitė, Jolanta; Boizou, Loïc; Aleksandravičiūtė, Gabrielė; Brokaitė, Kristina; Zeman, Daniel; Perkova, Natalia; Griciūtė, Bernadeta) |
| | TokenMill NER Corpus (TokenMill) |
| | Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia) (Explosion) |
| License | CC BY-SA 4.0 |
| Author | Explosion |
| Model size | 40 MB |

Label Scheme

View label scheme (1669 labels for 4 components)
Component Labels
morphologizer Definite=Ind|Gender=Neut|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Pass, POS=VERB|Polarity=Pos|VerbForm=Inf, Case=Gen|Definite=Def|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ, Case=Gen|Gender=Fem|Number=Plur|POS=NOUN, Case=Gen|Gender=Masc|Number=Plur|POS=NOUN, Case=Acc|Gender=Masc|Number=Plur|POS=NOUN, POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Ger, Case=Gen|Gender=Masc|Number=Sing|POS=NOUN, POS=CCONJ, POS=PUNCT, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|PronType=Ind, Case=Acc|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ, Case=Acc|Gender=Fem|Number=Plur|POS=NOUN, Case=Loc|Gender=Fem|Number=Plur|POS=NOUN, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Act, Case=Acc|Gender=Masc|Number=Sing|POS=NOUN, Case=Acc|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind, Case=Acc|Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Dem, Case=Acc|Gender=Fem|Number=Sing|POS=NOUN, Aspect=Perf|Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin, Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Act, Case=Gen|Gender=Fem|Number=Sing|POS=NOUN, Case=Nom|Gender=Masc|Number=Sing|POS=NOUN, Abbr=Yes|POS=X, AdpType=Prep|Case=Gen|POS=ADP, Case=Gen|Gender=Masc|Number=Sing|POS=PROPN, Case=Nom|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ, Case=Nom|Gender=Fem|Number=Plur|POS=NOUN, Case=Ins|Gender=Masc|Number=Sing|POS=NOUN, Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Case=Acc|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Mood=Cnd|POS=VERB|Person=3|Polarity=Pos|VerbForm=Fin, Case=Gen|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind, Case=Nom|Gender=Masc|Number=Plur|POS=NOUN, Mood=Ind|Number=Plur|POS=AUX|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Degree=Pos|POS=ADV, Case=Nom|Definite=Ind|Degree=Pos|Gender=Masc|Number=Plur|POS=ADJ, 
Degree=Pos|Hyph=Yes|POS=ADV, Hyph=Yes|POS=X, Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs, POS=SCONJ, Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Case=Nom|Definite=Ind|POS=PRON|PronType=Ind, Case=Nom|Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Dem, Case=Nom|Gender=Fem|Number=Sing|POS=NOUN, Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Reflex=Yes|Tense=Pres|VerbForm=Fin, Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Case=Acc|Definite=Ind|POS=PRON|PronType=Ind, POS=PART, Case=Gen|Definite=Ind|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ, Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Reflex=Yes|Tense=Pres|VerbForm=Fin, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Dem, Case=Ins|Gender=Masc|NumForm=Word|NumType=Card|POS=NUM, Case=Ins|Gender=Masc|Number=Plur|POS=NOUN, Case=Ins|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Definite=Ind|Gender=Neut|POS=DET|PronType=Dem, Mood=Ind|POS=AUX|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Definite=Ind|Degree=Pos|Gender=Neut|POS=ADJ, Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|PronType=Ind, Case=Nom|Gender=Masc|Number=Sing|POS=PROPN, Case=Loc|Definite=Ind|Gender=Fem|Number=Sing|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Pass, Case=Gen|Definite=Def|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Case=Loc|Gender=Fem|Number=Sing|POS=NOUN, Aspect=Perf|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Ger, Case=Dat|Gender=Masc|Number=Sing|POS=NOUN, Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin, Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin, POS=VERB|Polarity=Pos|Reflex=Yes|VerbForm=Inf, Degree=Cmp|POS=ADV, Case=Gen|Gender=Fem|Number=Sing|POS=PROPN, Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin, Case=Gen|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem, 
Mood=Ind|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin, Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin, Case=Gen|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|PronType=Ind, Definite=Ind|NumForm=Digit|POS=NUM, Case=Gen|Gender=Masc|NumForm=Word|NumType=Card|Number=Plur|POS=NUM, Case=Nom|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Case=Acc|Gender=Masc|NumForm=Word|NumType=Card|Number=Sing|POS=NUM, Case=Dat|Definite=Ind|Number=Sing|POS=PRON|Person=1|PronType=Prs, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Tot, Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Pass, Case=Loc|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind, Case=Nom|Gender=Masc|NumForm=Word|NumType=Card|Number=Plur|POS=NUM, NumForm=Word|NumType=Card|POS=NUM, Case=Nom|Definite=Ind|Gender=Fem|Hyph=Yes|Number=Plur|POS=DET|PronType=Dem, Mood=Ind|Number=Sing|POS=VERB|Person=1|Polarity=Pos|Tense=Pres|VerbForm=Fin, Case=Nom|Definite=Def|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Int,Rel, Case=Acc|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem, Case=Dat|Gender=Masc|Number=Plur|POS=NOUN, Case=Nom|Gender=Fem|Number=Sing|POS=PROPN, Case=Nom|Definite=Ind|Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs, Hyph=Yes|POS=PART, Mood=Cnd|Number=Sing|POS=AUX|Person=3|Polarity=Pos|VerbForm=Fin, Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs, Case=Loc|Gender=Masc|Number=Sing|POS=NOUN, AdpType=Prep|Case=Acc|POS=ADP, Mood=Cnd|Number=Sing|POS=VERB|Person=3|Polarity=Pos|VerbForm=Fin, Case=Gen|Definite=Def|Gender=Fem|NumForm=Combi|NumType=Ord|Number=Sing|POS=NUM, Case=Nom|Definite=Def|Gender=Fem|NumForm=Word|NumType=Ord|Number=Sing|POS=NUM, Aspect=Perf|Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin, Case=Acc|Definite=Ind|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ, 
Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin, Definite=Ind|NumForm=Roman|POS=NUM, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs, Case=Gen|Definite=Ind|Degree=Pos|Gender=Masc|Number=Plur|POS=ADJ, Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Part|Voice=Act, Case=Gen|Definite=Ind|Gender=Fem|NumForm=Word|NumType=Ord|Number=Sing|POS=NUM, Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem, Case=Nom|Definite=Ind|Gender=Masc|Mood=Nec|Number=Sing|POS=VERB|Polarity=Pos|VerbForm=Part, Case=Nom|Definite=Ind|Degree=Cmp|Gender=Fem|Number=Plur|POS=ADJ, Aspect=Perf|Case=Acc|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Part|Voice=Act, Case=Dat|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs, Aspect=Perf|Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin, Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Act, Case=Gen|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ, `Case=Nom|Definite=Ind|Gender=Fem|Number...

lt_core_news_lg-3.7.0

01 Oct 08:59
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: 8fd6f9fd4b0fce21a8492882e945d8cac723011bba20cd301103734b4544979c
Checksum .whl: ee805642db51d8324a63ed313f2a146357521fb717fe207e818afa06b4e374f1

Details: https://spacy.io/models/lt#lt_core_news_lg

Lithuanian pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, parser, lemmatizer (trainable_lemmatizer), senter, ner.

| Feature | Description |
| --- | --- |
| Name | lt_core_news_lg |
| Version | 3.7.0 |
| spaCy | >=3.7.0,<3.8.0 |
| Default Pipeline | tok2vec, morphologizer, tagger, parser, lemmatizer, attribute_ruler, ner |
| Components | tok2vec, morphologizer, tagger, parser, lemmatizer, senter, attribute_ruler, ner |
| Vectors | 500000 keys, 500000 unique vectors (300 dimensions) |
| Sources | UD Lithuanian ALKSNIS v2.8 (Utka, Andrius; Rimkutė, Erika; Bielinskienė, Agnė; Kovalevskaitė, Jolanta; Boizou, Loïc; Aleksandravičiūtė, Gabrielė; Brokaitė, Kristina; Zeman, Daniel; Perkova, Natalia; Griciūtė, Bernadeta) |
| | TokenMill NER Corpus (TokenMill) |
| | Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia) (Explosion) |
| License | CC BY-SA 4.0 |
| Author | Explosion |
| Model size | 541 MB |

Label Scheme

View label scheme (1669 labels for 4 components)
Component Labels
morphologizer Definite=Ind|Gender=Neut|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Pass, POS=VERB|Polarity=Pos|VerbForm=Inf, Case=Gen|Definite=Def|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ, Case=Gen|Gender=Fem|Number=Plur|POS=NOUN, Case=Gen|Gender=Masc|Number=Plur|POS=NOUN, Case=Acc|Gender=Masc|Number=Plur|POS=NOUN, POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Ger, Case=Gen|Gender=Masc|Number=Sing|POS=NOUN, POS=CCONJ, POS=PUNCT, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|PronType=Ind, Case=Acc|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ, Case=Acc|Gender=Fem|Number=Plur|POS=NOUN, Case=Loc|Gender=Fem|Number=Plur|POS=NOUN, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Act, Case=Acc|Gender=Masc|Number=Sing|POS=NOUN, Case=Acc|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind, Case=Acc|Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Dem, Case=Acc|Gender=Fem|Number=Sing|POS=NOUN, Aspect=Perf|Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin, Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Act, Case=Gen|Gender=Fem|Number=Sing|POS=NOUN, Case=Nom|Gender=Masc|Number=Sing|POS=NOUN, Abbr=Yes|POS=X, AdpType=Prep|Case=Gen|POS=ADP, Case=Gen|Gender=Masc|Number=Sing|POS=PROPN, Case=Nom|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ, Case=Nom|Gender=Fem|Number=Plur|POS=NOUN, Case=Ins|Gender=Masc|Number=Sing|POS=NOUN, Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Case=Acc|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Mood=Cnd|POS=VERB|Person=3|Polarity=Pos|VerbForm=Fin, Case=Gen|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind, Case=Nom|Gender=Masc|Number=Plur|POS=NOUN, Mood=Ind|Number=Plur|POS=AUX|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Degree=Pos|POS=ADV, Case=Nom|Definite=Ind|Degree=Pos|Gender=Masc|Number=Plur|POS=ADJ, 
Degree=Pos|Hyph=Yes|POS=ADV, Hyph=Yes|POS=X, Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs, POS=SCONJ, Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Case=Nom|Definite=Ind|POS=PRON|PronType=Ind, Case=Nom|Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Dem, Case=Nom|Gender=Fem|Number=Sing|POS=NOUN, Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Reflex=Yes|Tense=Pres|VerbForm=Fin, Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Case=Acc|Definite=Ind|POS=PRON|PronType=Ind, POS=PART, Case=Gen|Definite=Ind|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ, Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Reflex=Yes|Tense=Pres|VerbForm=Fin, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Dem, Case=Ins|Gender=Masc|NumForm=Word|NumType=Card|POS=NUM, Case=Ins|Gender=Masc|Number=Plur|POS=NOUN, Case=Ins|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Definite=Ind|Gender=Neut|POS=DET|PronType=Dem, Mood=Ind|POS=AUX|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin, Definite=Ind|Degree=Pos|Gender=Neut|POS=ADJ, Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|PronType=Ind, Case=Nom|Gender=Masc|Number=Sing|POS=PROPN, Case=Loc|Definite=Ind|Gender=Fem|Number=Sing|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Pass, Case=Gen|Definite=Def|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Case=Loc|Gender=Fem|Number=Sing|POS=NOUN, Aspect=Perf|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Ger, Case=Dat|Gender=Masc|Number=Sing|POS=NOUN, Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin, Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin, POS=VERB|Polarity=Pos|Reflex=Yes|VerbForm=Inf, Degree=Cmp|POS=ADV, Case=Gen|Gender=Fem|Number=Sing|POS=PROPN, Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin, Case=Gen|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem, 
Mood=Ind|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin, Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin, Case=Gen|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|PronType=Ind, Definite=Ind|NumForm=Digit|POS=NUM, Case=Gen|Gender=Masc|NumForm=Word|NumType=Card|Number=Plur|POS=NUM, Case=Nom|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Case=Acc|Gender=Masc|NumForm=Word|NumType=Card|Number=Sing|POS=NUM, Case=Dat|Definite=Ind|Number=Sing|POS=PRON|Person=1|PronType=Prs, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Tot, Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Pass, Case=Loc|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind, Case=Nom|Gender=Masc|NumForm=Word|NumType=Card|Number=Plur|POS=NUM, NumForm=Word|NumType=Card|POS=NUM, Case=Nom|Definite=Ind|Gender=Fem|Hyph=Yes|Number=Plur|POS=DET|PronType=Dem, Mood=Ind|Number=Sing|POS=VERB|Person=1|Polarity=Pos|Tense=Pres|VerbForm=Fin, Case=Nom|Definite=Def|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ, Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Int,Rel, Case=Acc|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem, Case=Dat|Gender=Masc|Number=Plur|POS=NOUN, Case=Nom|Gender=Fem|Number=Sing|POS=PROPN, Case=Nom|Definite=Ind|Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs, Hyph=Yes|POS=PART, Mood=Cnd|Number=Sing|POS=AUX|Person=3|Polarity=Pos|VerbForm=Fin, Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs, Case=Loc|Gender=Masc|Number=Sing|POS=NOUN, AdpType=Prep|Case=Acc|POS=ADP, Mood=Cnd|Number=Sing|POS=VERB|Person=3|Polarity=Pos|VerbForm=Fin, Case=Gen|Definite=Def|Gender=Fem|NumForm=Combi|NumType=Ord|Number=Sing|POS=NUM, Case=Nom|Definite=Def|Gender=Fem|NumForm=Word|NumType=Ord|Number=Sing|POS=NUM, Aspect=Perf|Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin, Case=Acc|Definite=Ind|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ, 
Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin, Definite=Ind|NumForm=Roman|POS=NUM, Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs, Case=Gen|Definite=Ind|Degree=Pos|Gender=Masc|Number=Plur|POS=ADJ, Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Part|Voice=Act, Case=Gen|Definite=Ind|Gender=Fem|NumForm=Word|NumType=Ord|Number=Sing|POS=NUM, Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem, Case=Nom|Definite=Ind|Gender=Masc|Mood=Nec|Number=Sing|POS=VERB|Polarity=Pos|VerbForm=Part, Case=Nom|Definite=Ind|Degree=Cmp|Gender=Fem|Number=Plur|POS=ADJ, Aspect=Perf|Case=Acc|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Part|Voice=Act, Case=Dat|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs, Aspect=Perf|Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin, Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Act, Case=Gen|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ, `Case=Nom|Definite=Ind|Gender=Fem|Numb...

ko_core_news_sm-3.7.0

01 Oct 08:59
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: f0560bae70204fbe3d977ec98f9355a80e4b984b2ddf76bfd2f02137a3a6a19a
Checksum .whl: b1a15a4987a8f9835031a6bd2fe57fe158097ab5304221c41df1bd4aab8cf458

Details: https://spacy.io/models/ko#ko_core_news_sm

Korean pipeline optimized for CPU. Components: tok2vec, tagger, morphologizer, parser, lemmatizer (trainable_lemmatizer), senter, ner.

| Feature | Description |
| --- | --- |
| Name | ko_core_news_sm |
| Version | 3.7.0 |
| spaCy | >=3.7.0,<3.8.0 |
| Default Pipeline | tok2vec, tagger, morphologizer, parser, lemmatizer, attribute_ruler, ner |
| Components | tok2vec, tagger, morphologizer, parser, lemmatizer, senter, attribute_ruler, ner |
| Vectors | 0 keys, 0 unique vectors (0 dimensions) |
| Sources | UD Korean Kaist v2.8 (Choi, Jinho; Han, Na-Rae; Hwang, Jena; Chun, Jayeol) |
| | KLUE v1.1.0 (Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Youngsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Ryu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho) |
| License | CC BY-SA 4.0 |
| Author | Explosion |
| Model size | 13 MB |

Label Scheme

View label scheme (2028 labels for 4 components)
Component Labels
tagger _SP, ecs, etm, f, f+f+jcj, f+f+jcs, f+f+jct, f+f+jxt, f+jca, f+jca+jp+ecc, f+jca+jp+ep+ef, f+jca+jxc, f+jca+jxc+jcm, f+jca+jxt, f+jcj, f+jcm, f+jco, f+jcs, f+jct, f+jct+jcm, f+jp+ef, f+jp+ep+ef, f+jp+etm, f+jxc, f+jxt, f+ncn, f+ncn+jcm, f+ncn+jcs, f+ncn+jp+ecc, f+ncn+jxt, f+ncpa+jcm, f+npp+jcs, f+nq, f+xsn, f+xsn+jco, f+xsn+jxt, ii, jca, jca+jcm, jca+jxc, jca+jxt, jcc, jcj, jcm, jco, jcr, jcr+jxc, jcs, jct, jct+jcm, jct+jxt, jp+ecc, jp+ecs, jp+ef, jp+ef+jcr, jp+ef+jcr+jxc, jp+ep+ecs, jp+ep+ef, jp+ep+etm, jp+ep+etn, jp+etm, jp+etn, jp+etn+jco, jp+etn+jxc, jxc, jxc+jca, jxc+jco, jxc+jcs, jxt, mad, mad+jxc, mad+jxt, mag, mag+jca, mag+jcm, mag+jcs, mag+jp+ef+jcr, mag+jxc, mag+jxc+jxc, mag+jxt, mag+xsn, maj, maj+jxc, maj+jxt, mma, mmd, nbn, nbn+jca, nbn+jca+jcj, nbn+jca+jcm, nbn+jca+jp+ef, nbn+jca+jxc, nbn+jca+jxt, nbn+jcc, nbn+jcj, nbn+jcm, nbn+jco, nbn+jcr, nbn+jcs, nbn+jct, nbn+jct+jcm, nbn+jct+jxt, nbn+jp+ecc, nbn+jp+ecs, nbn+jp+ecs+jca, nbn+jp+ecs+jcm, nbn+jp+ecs+jco, nbn+jp+ecs+jxc, nbn+jp+ecs+jxt, nbn+jp+ecx, nbn+jp+ef, nbn+jp+ef+jca, nbn+jp+ef+jco, nbn+jp+ef+jcr, nbn+jp+ef+jcr+jxc, nbn+jp+ef+jcr+jxt, nbn+jp+ef+jcs, nbn+jp+ef+jxc, nbn+jp+ef+jxc+jco, nbn+jp+ef+jxf, nbn+jp+ef+jxt, nbn+jp+ep+ecc, nbn+jp+ep+ecs, nbn+jp+ep+ecs+jxc, nbn+jp+ep+ef, nbn+jp+ep+ef+jcr, nbn+jp+ep+etm, nbn+jp+ep+etn, nbn+jp+ep+etn+jco, nbn+jp+ep+etn+jcs, nbn+jp+etm, nbn+jp+etn, nbn+jp+etn+jca, nbn+jp+etn+jca+jxt, nbn+jp+etn+jco, nbn+jp+etn+jcs, nbn+jp+etn+jxc, nbn+jp+etn+jxt, nbn+jxc, nbn+jxc+jca, nbn+jxc+jca+jxc, nbn+jxc+jca+jxt, nbn+jxc+jcc, nbn+jxc+jcm, nbn+jxc+jco, nbn+jxc+jcs, nbn+jxc+jp+ef, nbn+jxc+jxc, nbn+jxc+jxt, nbn+jxt, nbn+nbn, nbn+nbn+jp+ef, nbn+xsm+ecs, nbn+xsm+ef, nbn+xsm+ep+ef, nbn+xsm+ep+ef+jcr, nbn+xsm+etm, nbn+xsn, nbn+xsn+jca, nbn+xsn+jca+jp+ef+jcr, nbn+xsn+jca+jxc, nbn+xsn+jca+jxt, nbn+xsn+jcm, nbn+xsn+jco, nbn+xsn+jcs, nbn+xsn+jct, nbn+xsn+jp+ecc, nbn+xsn+jp+ecs, nbn+xsn+jp+ef, nbn+xsn+jp+ef+jcr, nbn+xsn+jp+ep+ef, nbn+xsn+jxc, nbn+xsn+jxt, nbn+xsv+etm, nbu, 
nbu+jca, nbu+jca+jxc, nbu+jca+jxt, nbu+jcc, nbu+jcc+jxc, nbu+jcj, nbu+jcm, nbu+jco, nbu+jcs, nbu+jct, nbu+jct+jxc, nbu+jp+ecc, nbu+jp+ecs, nbu+jp+ef, nbu+jp+ef+jcr, nbu+jp+ef+jxc, nbu+jp+ep+ecc, nbu+jp+ep+ecs, nbu+jp+ep+ef, nbu+jp+ep+ef+jcr, nbu+jp+ep+etm, nbu+jp+ep+etn+jco, nbu+jp+etm, nbu+jxc, nbu+jxc+jca, nbu+jxc+jcs, nbu+jxc+jp+ef, nbu+jxc+jp+ep+ef, nbu+jxc+jxt, nbu+jxt, nbu+ncn, nbu+ncn+jca, nbu+ncn+jcm, nbu+xsn, nbu+xsn+jca, nbu+xsn+jca+jxc, nbu+xsn+jca+jxt, nbu+xsn+jcm, nbu+xsn+jco, nbu+xsn+jcs, nbu+xsn+jp+ecs, nbu+xsn+jp+ep+ef, nbu+xsn+jxc, nbu+xsn+jxc+jxt, nbu+xsn+jxt, nbu+xsv+ecc, nbu+xsv+etm, ncn, ncn+f+ncpa+jco, ncn+jca, ncn+jca+jca, ncn+jca+jcc, ncn+jca+jcj, ncn+jca+jcm, ncn+jca+jcs, ncn+jca+jct, ncn+jca+jp+ecc, ncn+jca+jp+ecs, ncn+jca+jp+ef, ncn+jca+jp+ep+ef, ncn+jca+jp+etm, ncn+jca+jp+etn+jxt, ncn+jca+jxc, ncn+jca+jxc+jcc, ncn+jca+jxc+jcm, ncn+jca+jxc+jxc, ncn+jca+jxc+jxt, ncn+jca+jxt, ncn+jcc, ncn+jcc+jxc, ncn+jcj, ncn+jcj+jxt, ncn+jcm, ncn+jco, ncn+jcr, ncn+jcr+jxc, ncn+jcs, ncn+jcs+jxt, ncn+jct, ncn+jct+jcm, ncn+jct+jxc, ncn+jct+jxt, ncn+jcv, ncn+jp+ecc, ncn+jp+ecc+jct, ncn+jp+ecc+jxc, ncn+jp+ecs, ncn+jp+ecs+jcm, ncn+jp+ecs+jco, ncn+jp+ecs+jxc, ncn+jp+ecs+jxt, ncn+jp+ecx, ncn+jp+ef, ncn+jp+ef+jca, ncn+jp+ef+jcm, ncn+jp+ef+jco, ncn+jp+ef+jcr, ncn+jp+ef+jcr+jxc, ncn+jp+ef+jcr+jxt, ncn+jp+ef+jp+etm, ncn+jp+ef+jxc, ncn+jp+ef+jxf, ncn+jp+ef+jxt, ncn+jp+ep+ecc, ncn+jp+ep+ecs, ncn+jp+ep+ecs+jxc, ncn+jp+ep+ecx, ncn+jp+ep+ef, ncn+jp+ep+ef+jcr, ncn+jp+ep+ef+jcr+jxc, ncn+jp+ep+ef+jxc, ncn+jp+ep+ef+jxf, ncn+jp+ep+ef+jxt, ncn+jp+ep+ep+etm, ncn+jp+ep+etm, ncn+jp+ep+etn, ncn+jp+ep+etn+jca, ncn+jp+ep+etn+jca+jxc, ncn+jp+ep+etn+jco, ncn+jp+ep+etn+jcs, ncn+jp+ep+etn+jxt, ncn+jp+etm, ncn+jp+etn, ncn+jp+etn+jca, ncn+jp+etn+jca+jxc, ncn+jp+etn+jca+jxt, ncn+jp+etn+jco, ncn+jp+etn+jcs, ncn+jp+etn+jct, ncn+jp+etn+jxc, ncn+jp+etn+jxt, ncn+jxc, ncn+jxc+jca, ncn+jxc+jca+jxc, ncn+jxc+jca+jxt, ncn+jxc+jcc, ncn+jxc+jcm, ncn+jxc+jco, ncn+jxc+jcs, ncn+jxc+jct+jxt, ncn+jxc+jp+ef, 
ncn+jxc+jp+ef+jcr, ncn+jxc+jp+ep+ecs, ncn+jxc+jp+ep+ef, ncn+jxc+jp+etm, ncn+jxc+jxc, ncn+jxc+jxt, ncn+jxt, ncn+jxt+jcm, ncn+jxt+jxc, ncn+nbn, ncn+nbn+jca, ncn+nbn+jcm, ncn+nbn+jcs, ncn+nbn+jp+ecc, ncn+nbn+jp+ep+ef, ncn+nbn+jxc, ncn+nbn+jxt, ncn+nbu, ncn+nbu+jca, ncn+nbu+jcm, ncn+nbu+jco, ncn+nbu+jp+ef, ncn+nbu+jxc, ncn+nbu+ncn, ncn+ncn, ncn+ncn+jca, ncn+ncn+jca+jcc, ncn+ncn+jca+jcm, ncn+ncn+jca+jxc, ncn+ncn+jca+jxc+jcm, ncn+ncn+jca+jxc+jxc, ncn+ncn+jca+jxt, ncn+ncn+jcc, ncn+ncn+jcj, ncn+ncn+jcm, ncn+ncn+jco, ncn+ncn+jcr, ncn+ncn+jcs, ncn+ncn+jct, ncn+ncn+jct+jcm, ncn+ncn+jct+jxc, ncn+ncn+jct+jxt, ncn+ncn+jp+ecc, ncn+ncn+jp+ecs, ncn+ncn+jp+ef, ncn+ncn+jp+ef+jcm, ncn+ncn+jp+ef+jcr, ncn+ncn+jp+ef+jcs, ncn+ncn+jp+ep+ecc, ncn+ncn+jp+ep+ecs, ncn+ncn+jp+ep+ef, ncn+ncn+jp+ep+ef+jcr, ncn+ncn+jp+ep+ep+etm, ncn+ncn+jp+ep+etm, ncn+ncn+jp+ep+etn, ncn+ncn+jp+etm, ncn+ncn+jp+etn, ncn+ncn+jp+etn+jca, ncn+ncn+jp+etn+jco, ncn+ncn+jp+etn+jxc, ncn+ncn+jxc, ncn+ncn+jxc+jca, ncn+ncn+jxc+jcc, ncn+ncn+jxc+jcm, ncn+ncn+jxc+jco, ncn+ncn+jxc+jcs, ncn+ncn+jxc+jxc, ncn+ncn+jxt, ncn+ncn+nbn, ncn+ncn+ncn, ncn+ncn+ncn+jca, ncn+ncn+ncn+jca+jcm, ncn+ncn+ncn+jca+jxt, ncn+ncn+ncn+jcj, ncn+ncn+ncn+jcm, ncn+ncn+ncn+jco, ncn+ncn+ncn+jcs, ncn+ncn+ncn+jct+jxt, ncn+ncn+ncn+jp+etn+jxc, ncn+ncn+ncn+jxt, ncn+ncn+ncn+ncn+jca, ncn+ncn+ncn+ncn+jca+jxt, ncn+ncn+ncn+ncn+jco, ncn+ncn+ncn+xsn+jp+etm, ncn+ncn+ncpa, ncn+ncn+ncpa+jca, ncn+ncn+ncpa+jcm, ncn+ncn+ncpa+jco, ncn+ncn+ncpa+jcs, ncn+ncn+ncpa+jxc, ncn+ncn+ncpa+jxt, ncn+ncn+ncpa+ncn, ncn+ncn+ncpa+ncn+jca, ncn+ncn+ncpa+ncn+jcj, ncn+ncn+ncpa+ncn+jcm, ncn+ncn+ncpa+ncn+jxt, ncn+ncn+xsn, ncn+ncn+xsn+jca, ncn+ncn+xsn+jca+jxt, ncn+ncn+xsn+jcj, ncn+ncn+xsn+jcm, ncn+ncn+xsn+jco, ncn+ncn+xsn+jcs, ncn+ncn+xsn+jct, ncn+ncn+xsn+jp+ecs, ncn+ncn+xsn+jp+ep+ef, ncn+ncn+xsn+jp+etm, ncn+ncn+xsn+jxc, ncn+ncn+xsn+jxc+jcs, ncn+ncn+xsn+jxt, ncn+ncn+xsv+ecc, ncn+ncn+xsv+etm, ncn+ncpa, ncn+ncpa+jca, ncn+ncpa+jca+jcm, ncn+ncpa+jca+jxc, ncn+ncpa+jca+jxt, ncn+ncpa+jcc, ncn+ncpa+jcj, 
ncn+ncpa+jcm, ncn+ncpa+jco, ncn+ncpa+jcr, ncn+ncpa+jcs, ncn+ncpa+jct, ncn+ncpa+jct+jcm, ncn+ncpa+jct+jxt, ncn+ncpa+jp+ecc, ncn+ncpa+jp+ecc+jxc, ncn+ncpa+jp+ecs, ncn+ncpa+jp+ecs+jxc, ncn+ncpa+jp+ef, ncn+ncpa+jp+ef+jcr, ncn+ncpa+jp+ef+jcr+jxc, ncn+ncpa+jp+ep+ef, ncn+ncpa+jp+ep+etm, ncn+ncpa+jp+ep+etn, ncn+ncpa+jp+etm, ncn+ncpa+jxc, ncn+ncpa+jxc+jca+jxc, ncn+ncpa+jxc+jco, ncn+ncpa+jxc+jcs, ncn+ncpa+jxt, ncn+ncpa+nbn+jcs, ncn+ncpa+ncn, ncn+ncpa+ncn+jca, ncn+ncpa+ncn+jca+jcm, ncn+ncpa+ncn+jca+jxc, ncn+ncpa+ncn+jca+jxt, ncn+ncpa+ncn+jcj, ncn+ncpa+ncn+jcm, ncn+ncpa+ncn+jco, ncn+ncpa+ncn+jcs, ncn+ncpa+ncn+jct, ncn+ncpa+ncn+jct+jcm, ncn+ncpa+ncn+jp+ef+jcr, ncn+ncpa+ncn+jp+ep+etm, ncn+ncpa+ncn+jxc, ncn+ncpa+ncn+jxt, ncn+ncpa+ncn+xsn+jcm, ncn+ncpa+ncn+xsn+jxt, ncn+ncpa+ncpa, ncn+ncpa+ncpa+jca...

ko_core_news_md-3.7.0

01 Oct 08:59
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: 35eb302d8b80d6a0d5cf4a33682f13773a920b0f800ae41b90f93054a66727aa
Checksum .whl: 97565293b1916eb20a47ec9bc96ee58aa8c334787d7a30b0efbea63ba6205165
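Both checksums are 64 hexadecimal characters, which is consistent with SHA-256. That is an assumption; the release notes do not name the algorithm. A minimal sketch of checking a downloaded archive against the published value, using only the Python standard library (the helper names and the local file name are hypothetical):

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with a published checksum, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


# Hypothetical usage; the local file name is an assumption:
# matches_checksum(
#     "ko_core_news_md-3.7.0.tar.gz",
#     "35eb302d8b80d6a0d5cf4a33682f13773a920b0f800ae41b90f93054a66727aa",
# )
```

Reading in chunks keeps memory use flat, which matters for the larger archives on this page.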

Details: https://spacy.io/models/ko#ko_core_news_md

Korean pipeline optimized for CPU. Components: tok2vec, tagger, morphologizer, parser, lemmatizer (trainable_lemmatizer), senter, ner.

Feature Description
Name ko_core_news_md
Version 3.7.0
spaCy >=3.7.0,<3.8.0
Default Pipeline tok2vec, tagger, morphologizer, parser, lemmatizer, attribute_ruler, ner
Components tok2vec, tagger, morphologizer, parser, lemmatizer, senter, attribute_ruler, ner
Vectors floret (50000, 300)
Sources UD Korean Kaist v2.8 (Choi, Jinho; Han, Na-Rae; Hwang, Jena; Chun, Jayeol)
KLUE v1.1.0 (Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Youngsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Ryu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho)
Explosion Vectors (OSCAR 2109 + Wikipedia + OpenSubtitles + WMT News Crawl) (Explosion)
License CC BY-SA 4.0
Author Explosion
Model size 65 MB

Label Scheme

View label scheme (2028 labels for 4 components)
Component Labels
tagger _SP, ecs, etm, f, f+f+jcj, f+f+jcs, f+f+jct, f+f+jxt, f+jca, f+jca+jp+ecc, f+jca+jp+ep+ef, f+jca+jxc, f+jca+jxc+jcm, f+jca+jxt, f+jcj, f+jcm, f+jco, f+jcs, f+jct, f+jct+jcm, f+jp+ef, f+jp+ep+ef, f+jp+etm, f+jxc, f+jxt, f+ncn, f+ncn+jcm, f+ncn+jcs, f+ncn+jp+ecc, f+ncn+jxt, f+ncpa+jcm, f+npp+jcs, f+nq, f+xsn, f+xsn+jco, f+xsn+jxt, ii, jca, jca+jcm, jca+jxc, jca+jxt, jcc, jcj, jcm, jco, jcr, jcr+jxc, jcs, jct, jct+jcm, jct+jxt, jp+ecc, jp+ecs, jp+ef, jp+ef+jcr, jp+ef+jcr+jxc, jp+ep+ecs, jp+ep+ef, jp+ep+etm, jp+ep+etn, jp+etm, jp+etn, jp+etn+jco, jp+etn+jxc, jxc, jxc+jca, jxc+jco, jxc+jcs, jxt, mad, mad+jxc, mad+jxt, mag, mag+jca, mag+jcm, mag+jcs, mag+jp+ef+jcr, mag+jxc, mag+jxc+jxc, mag+jxt, mag+xsn, maj, maj+jxc, maj+jxt, mma, mmd, nbn, nbn+jca, nbn+jca+jcj, nbn+jca+jcm, nbn+jca+jp+ef, nbn+jca+jxc, nbn+jca+jxt, nbn+jcc, nbn+jcj, nbn+jcm, nbn+jco, nbn+jcr, nbn+jcs, nbn+jct, nbn+jct+jcm, nbn+jct+jxt, nbn+jp+ecc, nbn+jp+ecs, nbn+jp+ecs+jca, nbn+jp+ecs+jcm, nbn+jp+ecs+jco, nbn+jp+ecs+jxc, nbn+jp+ecs+jxt, nbn+jp+ecx, nbn+jp+ef, nbn+jp+ef+jca, nbn+jp+ef+jco, nbn+jp+ef+jcr, nbn+jp+ef+jcr+jxc, nbn+jp+ef+jcr+jxt, nbn+jp+ef+jcs, nbn+jp+ef+jxc, nbn+jp+ef+jxc+jco, nbn+jp+ef+jxf, nbn+jp+ef+jxt, nbn+jp+ep+ecc, nbn+jp+ep+ecs, nbn+jp+ep+ecs+jxc, nbn+jp+ep+ef, nbn+jp+ep+ef+jcr, nbn+jp+ep+etm, nbn+jp+ep+etn, nbn+jp+ep+etn+jco, nbn+jp+ep+etn+jcs, nbn+jp+etm, nbn+jp+etn, nbn+jp+etn+jca, nbn+jp+etn+jca+jxt, nbn+jp+etn+jco, nbn+jp+etn+jcs, nbn+jp+etn+jxc, nbn+jp+etn+jxt, nbn+jxc, nbn+jxc+jca, nbn+jxc+jca+jxc, nbn+jxc+jca+jxt, nbn+jxc+jcc, nbn+jxc+jcm, nbn+jxc+jco, nbn+jxc+jcs, nbn+jxc+jp+ef, nbn+jxc+jxc, nbn+jxc+jxt, nbn+jxt, nbn+nbn, nbn+nbn+jp+ef, nbn+xsm+ecs, nbn+xsm+ef, nbn+xsm+ep+ef, nbn+xsm+ep+ef+jcr, nbn+xsm+etm, nbn+xsn, nbn+xsn+jca, nbn+xsn+jca+jp+ef+jcr, nbn+xsn+jca+jxc, nbn+xsn+jca+jxt, nbn+xsn+jcm, nbn+xsn+jco, nbn+xsn+jcs, nbn+xsn+jct, nbn+xsn+jp+ecc, nbn+xsn+jp+ecs, nbn+xsn+jp+ef, nbn+xsn+jp+ef+jcr, nbn+xsn+jp+ep+ef, nbn+xsn+jxc, nbn+xsn+jxt, nbn+xsv+etm, nbu, 
nbu+jca, nbu+jca+jxc, nbu+jca+jxt, nbu+jcc, nbu+jcc+jxc, nbu+jcj, nbu+jcm, nbu+jco, nbu+jcs, nbu+jct, nbu+jct+jxc, nbu+jp+ecc, nbu+jp+ecs, nbu+jp+ef, nbu+jp+ef+jcr, nbu+jp+ef+jxc, nbu+jp+ep+ecc, nbu+jp+ep+ecs, nbu+jp+ep+ef, nbu+jp+ep+ef+jcr, nbu+jp+ep+etm, nbu+jp+ep+etn+jco, nbu+jp+etm, nbu+jxc, nbu+jxc+jca, nbu+jxc+jcs, nbu+jxc+jp+ef, nbu+jxc+jp+ep+ef, nbu+jxc+jxt, nbu+jxt, nbu+ncn, nbu+ncn+jca, nbu+ncn+jcm, nbu+xsn, nbu+xsn+jca, nbu+xsn+jca+jxc, nbu+xsn+jca+jxt, nbu+xsn+jcm, nbu+xsn+jco, nbu+xsn+jcs, nbu+xsn+jp+ecs, nbu+xsn+jp+ep+ef, nbu+xsn+jxc, nbu+xsn+jxc+jxt, nbu+xsn+jxt, nbu+xsv+ecc, nbu+xsv+etm, ncn, ncn+f+ncpa+jco, ncn+jca, ncn+jca+jca, ncn+jca+jcc, ncn+jca+jcj, ncn+jca+jcm, ncn+jca+jcs, ncn+jca+jct, ncn+jca+jp+ecc, ncn+jca+jp+ecs, ncn+jca+jp+ef, ncn+jca+jp+ep+ef, ncn+jca+jp+etm, ncn+jca+jp+etn+jxt, ncn+jca+jxc, ncn+jca+jxc+jcc, ncn+jca+jxc+jcm, ncn+jca+jxc+jxc, ncn+jca+jxc+jxt, ncn+jca+jxt, ncn+jcc, ncn+jcc+jxc, ncn+jcj, ncn+jcj+jxt, ncn+jcm, ncn+jco, ncn+jcr, ncn+jcr+jxc, ncn+jcs, ncn+jcs+jxt, ncn+jct, ncn+jct+jcm, ncn+jct+jxc, ncn+jct+jxt, ncn+jcv, ncn+jp+ecc, ncn+jp+ecc+jct, ncn+jp+ecc+jxc, ncn+jp+ecs, ncn+jp+ecs+jcm, ncn+jp+ecs+jco, ncn+jp+ecs+jxc, ncn+jp+ecs+jxt, ncn+jp+ecx, ncn+jp+ef, ncn+jp+ef+jca, ncn+jp+ef+jcm, ncn+jp+ef+jco, ncn+jp+ef+jcr, ncn+jp+ef+jcr+jxc, ncn+jp+ef+jcr+jxt, ncn+jp+ef+jp+etm, ncn+jp+ef+jxc, ncn+jp+ef+jxf, ncn+jp+ef+jxt, ncn+jp+ep+ecc, ncn+jp+ep+ecs, ncn+jp+ep+ecs+jxc, ncn+jp+ep+ecx, ncn+jp+ep+ef, ncn+jp+ep+ef+jcr, ncn+jp+ep+ef+jcr+jxc, ncn+jp+ep+ef+jxc, ncn+jp+ep+ef+jxf, ncn+jp+ep+ef+jxt, ncn+jp+ep+ep+etm, ncn+jp+ep+etm, ncn+jp+ep+etn, ncn+jp+ep+etn+jca, ncn+jp+ep+etn+jca+jxc, ncn+jp+ep+etn+jco, ncn+jp+ep+etn+jcs, ncn+jp+ep+etn+jxt, ncn+jp+etm, ncn+jp+etn, ncn+jp+etn+jca, ncn+jp+etn+jca+jxc, ncn+jp+etn+jca+jxt, ncn+jp+etn+jco, ncn+jp+etn+jcs, ncn+jp+etn+jct, ncn+jp+etn+jxc, ncn+jp+etn+jxt, ncn+jxc, ncn+jxc+jca, ncn+jxc+jca+jxc, ncn+jxc+jca+jxt, ncn+jxc+jcc, ncn+jxc+jcm, ncn+jxc+jco, ncn+jxc+jcs, ncn+jxc+jct+jxt, ncn+jxc+jp+ef, 
ncn+jxc+jp+ef+jcr, ncn+jxc+jp+ep+ecs, ncn+jxc+jp+ep+ef, ncn+jxc+jp+etm, ncn+jxc+jxc, ncn+jxc+jxt, ncn+jxt, ncn+jxt+jcm, ncn+jxt+jxc, ncn+nbn, ncn+nbn+jca, ncn+nbn+jcm, ncn+nbn+jcs, ncn+nbn+jp+ecc, ncn+nbn+jp+ep+ef, ncn+nbn+jxc, ncn+nbn+jxt, ncn+nbu, ncn+nbu+jca, ncn+nbu+jcm, ncn+nbu+jco, ncn+nbu+jp+ef, ncn+nbu+jxc, ncn+nbu+ncn, ncn+ncn, ncn+ncn+jca, ncn+ncn+jca+jcc, ncn+ncn+jca+jcm, ncn+ncn+jca+jxc, ncn+ncn+jca+jxc+jcm, ncn+ncn+jca+jxc+jxc, ncn+ncn+jca+jxt, ncn+ncn+jcc, ncn+ncn+jcj, ncn+ncn+jcm, ncn+ncn+jco, ncn+ncn+jcr, ncn+ncn+jcs, ncn+ncn+jct, ncn+ncn+jct+jcm, ncn+ncn+jct+jxc, ncn+ncn+jct+jxt, ncn+ncn+jp+ecc, ncn+ncn+jp+ecs, ncn+ncn+jp+ef, ncn+ncn+jp+ef+jcm, ncn+ncn+jp+ef+jcr, ncn+ncn+jp+ef+jcs, ncn+ncn+jp+ep+ecc, ncn+ncn+jp+ep+ecs, ncn+ncn+jp+ep+ef, ncn+ncn+jp+ep+ef+jcr, ncn+ncn+jp+ep+ep+etm, ncn+ncn+jp+ep+etm, ncn+ncn+jp+ep+etn, ncn+ncn+jp+etm, ncn+ncn+jp+etn, ncn+ncn+jp+etn+jca, ncn+ncn+jp+etn+jco, ncn+ncn+jp+etn+jxc, ncn+ncn+jxc, ncn+ncn+jxc+jca, ncn+ncn+jxc+jcc, ncn+ncn+jxc+jcm, ncn+ncn+jxc+jco, ncn+ncn+jxc+jcs, ncn+ncn+jxc+jxc, ncn+ncn+jxt, ncn+ncn+nbn, ncn+ncn+ncn, ncn+ncn+ncn+jca, ncn+ncn+ncn+jca+jcm, ncn+ncn+ncn+jca+jxt, ncn+ncn+ncn+jcj, ncn+ncn+ncn+jcm, ncn+ncn+ncn+jco, ncn+ncn+ncn+jcs, ncn+ncn+ncn+jct+jxt, ncn+ncn+ncn+jp+etn+jxc, ncn+ncn+ncn+jxt, ncn+ncn+ncn+ncn+jca, ncn+ncn+ncn+ncn+jca+jxt, ncn+ncn+ncn+ncn+jco, ncn+ncn+ncn+xsn+jp+etm, ncn+ncn+ncpa, ncn+ncn+ncpa+jca, ncn+ncn+ncpa+jcm, ncn+ncn+ncpa+jco, ncn+ncn+ncpa+jcs, ncn+ncn+ncpa+jxc, ncn+ncn+ncpa+jxt, ncn+ncn+ncpa+ncn, ncn+ncn+ncpa+ncn+jca, ncn+ncn+ncpa+ncn+jcj, ncn+ncn+ncpa+ncn+jcm, ncn+ncn+ncpa+ncn+jxt, ncn+ncn+xsn, ncn+ncn+xsn+jca, ncn+ncn+xsn+jca+jxt, ncn+ncn+xsn+jcj, ncn+ncn+xsn+jcm, ncn+ncn+xsn+jco, ncn+ncn+xsn+jcs, ncn+ncn+xsn+jct, ncn+ncn+xsn+jp+ecs, ncn+ncn+xsn+jp+ep+ef, ncn+ncn+xsn+jp+etm, ncn+ncn+xsn+jxc, ncn+ncn+xsn+jxc+jcs, ncn+ncn+xsn+jxt, ncn+ncn+xsv+ecc, ncn+ncn+xsv+etm, ncn+ncpa, ncn+ncpa+jca, ncn+ncpa+jca+jcm, ncn+ncpa+jca+jxc, ncn+ncpa+jca+jxt, ncn+ncpa+jcc, ncn+ncpa+jcj, 
ncn+ncpa+jcm, ncn+ncpa+jco, ncn+ncpa+jcr, ncn+ncpa+jcs, ncn+ncpa+jct, ncn+ncpa+jct+jcm, ncn+ncpa+jct+jxt, ncn+ncpa+jp+ecc, ncn+ncpa+jp+ecc+jxc, ncn+ncpa+jp+ecs, ncn+ncpa+jp+ecs+jxc, ncn+ncpa+jp+ef, ncn+ncpa+jp+ef+jcr, ncn+ncpa+jp+ef+jcr+jxc, ncn+ncpa+jp+ep+ef, ncn+ncpa+jp+ep+etm, ncn+ncpa+jp+ep+etn, ncn+ncpa+jp+etm, ncn+ncpa+jxc, ncn+ncpa+jxc+jca+jxc, ncn+ncpa+jxc+jco, ncn+ncpa+jxc+jcs, ncn+ncpa+jxt, ncn+ncpa+nbn+jcs, ncn+ncpa+ncn, ncn+ncpa+ncn+jca, ncn+ncpa+ncn+jca+jcm, ncn+ncpa+ncn+jca+jxc, ncn+ncpa+ncn+jca+jxt, ncn+ncpa+ncn+jcj, ncn+ncpa+ncn+jcm, ncn+ncpa+ncn+jco, ncn+ncpa+ncn+jcs, ncn+ncpa+ncn+jct, ncn+ncpa+ncn+jct+jcm, ncn+ncpa+ncn+jp+ef+jcr, ncn+ncpa+ncn+jp+ep+et...
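Most tagger labels in this list are composite: several short tags joined by `+`. This appears to correspond to one tag per morpheme in the UD Korean Kaist annotation, but that reading is an assumption; the release notes do not describe the format. A small sketch of splitting a label into its parts, where `split_tag` is a hypothetical helper and not part of spaCy:

```python
def split_tag(label):
    """Split a composite tagger label such as 'ncn+jca+jxc' into its parts."""
    return label.split("+")


# Labels taken from the list above.
for label in ["ncn", "ncn+jca+jxc", "nbu+xsn+jp+ep+ef"]:
    print(label, "->", split_tag(label))
```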

ko_core_news_lg-3.7.0

01 Oct 08:59
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: 2a3a2257342903b6a9edc658203010d2c26194083e0b13c2236ba3c2c39abc43
Checksum .whl: 125f607b91778c97bd5be65dbcb5b14c0c1231c1f8998d1a652ed396c03c6945

Details: https://spacy.io/models/ko#ko_core_news_lg

Korean pipeline optimized for CPU. Components: tok2vec, tagger, morphologizer, parser, lemmatizer (trainable_lemmatizer), senter, ner.

Feature Description
Name ko_core_news_lg
Version 3.7.0
spaCy >=3.7.0,<3.8.0
Default Pipeline tok2vec, tagger, morphologizer, parser, lemmatizer, attribute_ruler, ner
Components tok2vec, tagger, morphologizer, parser, lemmatizer, senter, attribute_ruler, ner
Vectors floret (200000, 300)
Sources UD Korean Kaist v2.8 (Choi, Jinho; Han, Na-Rae; Hwang, Jena; Chun, Jayeol)
KLUE v1.1.0 (Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Youngsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Ryu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho)
Explosion Vectors (OSCAR 2109 + Wikipedia + OpenSubtitles + WMT News Crawl) (Explosion)
License CC BY-SA 4.0
Author Explosion
Model size 220 MB
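The spaCy row in the table above gives the compatible range as `>=3.7.0,<3.8.0`. A simplified sketch of checking a version string against that range with the standard library only. It assumes plain numeric versions such as `3.7.2` and ignores pre-release suffixes, and `parse_version` and `in_range` are hypothetical helpers rather than anything spaCy ships:

```python
def parse_version(text):
    """Turn '3.7.2' into (3, 7, 2); only plain numeric versions are handled."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, lower="3.7.0", upper="3.8.0"):
    """Check lower <= version < upper, mirroring '>=3.7.0,<3.8.0'."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


for candidate in ["3.6.1", "3.7.0", "3.7.2", "3.8.0"]:
    print(candidate, in_range(candidate))
```

Comparing tuples of integers avoids the string-comparison pitfall where "3.10.0" would sort before "3.7.0".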

Label Scheme

View label scheme (2028 labels for 4 components)
Component Labels
tagger _SP, ecs, etm, f, f+f+jcj, f+f+jcs, f+f+jct, f+f+jxt, f+jca, f+jca+jp+ecc, f+jca+jp+ep+ef, f+jca+jxc, f+jca+jxc+jcm, f+jca+jxt, f+jcj, f+jcm, f+jco, f+jcs, f+jct, f+jct+jcm, f+jp+ef, f+jp+ep+ef, f+jp+etm, f+jxc, f+jxt, f+ncn, f+ncn+jcm, f+ncn+jcs, f+ncn+jp+ecc, f+ncn+jxt, f+ncpa+jcm, f+npp+jcs, f+nq, f+xsn, f+xsn+jco, f+xsn+jxt, ii, jca, jca+jcm, jca+jxc, jca+jxt, jcc, jcj, jcm, jco, jcr, jcr+jxc, jcs, jct, jct+jcm, jct+jxt, jp+ecc, jp+ecs, jp+ef, jp+ef+jcr, jp+ef+jcr+jxc, jp+ep+ecs, jp+ep+ef, jp+ep+etm, jp+ep+etn, jp+etm, jp+etn, jp+etn+jco, jp+etn+jxc, jxc, jxc+jca, jxc+jco, jxc+jcs, jxt, mad, mad+jxc, mad+jxt, mag, mag+jca, mag+jcm, mag+jcs, mag+jp+ef+jcr, mag+jxc, mag+jxc+jxc, mag+jxt, mag+xsn, maj, maj+jxc, maj+jxt, mma, mmd, nbn, nbn+jca, nbn+jca+jcj, nbn+jca+jcm, nbn+jca+jp+ef, nbn+jca+jxc, nbn+jca+jxt, nbn+jcc, nbn+jcj, nbn+jcm, nbn+jco, nbn+jcr, nbn+jcs, nbn+jct, nbn+jct+jcm, nbn+jct+jxt, nbn+jp+ecc, nbn+jp+ecs, nbn+jp+ecs+jca, nbn+jp+ecs+jcm, nbn+jp+ecs+jco, nbn+jp+ecs+jxc, nbn+jp+ecs+jxt, nbn+jp+ecx, nbn+jp+ef, nbn+jp+ef+jca, nbn+jp+ef+jco, nbn+jp+ef+jcr, nbn+jp+ef+jcr+jxc, nbn+jp+ef+jcr+jxt, nbn+jp+ef+jcs, nbn+jp+ef+jxc, nbn+jp+ef+jxc+jco, nbn+jp+ef+jxf, nbn+jp+ef+jxt, nbn+jp+ep+ecc, nbn+jp+ep+ecs, nbn+jp+ep+ecs+jxc, nbn+jp+ep+ef, nbn+jp+ep+ef+jcr, nbn+jp+ep+etm, nbn+jp+ep+etn, nbn+jp+ep+etn+jco, nbn+jp+ep+etn+jcs, nbn+jp+etm, nbn+jp+etn, nbn+jp+etn+jca, nbn+jp+etn+jca+jxt, nbn+jp+etn+jco, nbn+jp+etn+jcs, nbn+jp+etn+jxc, nbn+jp+etn+jxt, nbn+jxc, nbn+jxc+jca, nbn+jxc+jca+jxc, nbn+jxc+jca+jxt, nbn+jxc+jcc, nbn+jxc+jcm, nbn+jxc+jco, nbn+jxc+jcs, nbn+jxc+jp+ef, nbn+jxc+jxc, nbn+jxc+jxt, nbn+jxt, nbn+nbn, nbn+nbn+jp+ef, nbn+xsm+ecs, nbn+xsm+ef, nbn+xsm+ep+ef, nbn+xsm+ep+ef+jcr, nbn+xsm+etm, nbn+xsn, nbn+xsn+jca, nbn+xsn+jca+jp+ef+jcr, nbn+xsn+jca+jxc, nbn+xsn+jca+jxt, nbn+xsn+jcm, nbn+xsn+jco, nbn+xsn+jcs, nbn+xsn+jct, nbn+xsn+jp+ecc, nbn+xsn+jp+ecs, nbn+xsn+jp+ef, nbn+xsn+jp+ef+jcr, nbn+xsn+jp+ep+ef, nbn+xsn+jxc, nbn+xsn+jxt, nbn+xsv+etm, nbu, 
nbu+jca, nbu+jca+jxc, nbu+jca+jxt, nbu+jcc, nbu+jcc+jxc, nbu+jcj, nbu+jcm, nbu+jco, nbu+jcs, nbu+jct, nbu+jct+jxc, nbu+jp+ecc, nbu+jp+ecs, nbu+jp+ef, nbu+jp+ef+jcr, nbu+jp+ef+jxc, nbu+jp+ep+ecc, nbu+jp+ep+ecs, nbu+jp+ep+ef, nbu+jp+ep+ef+jcr, nbu+jp+ep+etm, nbu+jp+ep+etn+jco, nbu+jp+etm, nbu+jxc, nbu+jxc+jca, nbu+jxc+jcs, nbu+jxc+jp+ef, nbu+jxc+jp+ep+ef, nbu+jxc+jxt, nbu+jxt, nbu+ncn, nbu+ncn+jca, nbu+ncn+jcm, nbu+xsn, nbu+xsn+jca, nbu+xsn+jca+jxc, nbu+xsn+jca+jxt, nbu+xsn+jcm, nbu+xsn+jco, nbu+xsn+jcs, nbu+xsn+jp+ecs, nbu+xsn+jp+ep+ef, nbu+xsn+jxc, nbu+xsn+jxc+jxt, nbu+xsn+jxt, nbu+xsv+ecc, nbu+xsv+etm, ncn, ncn+f+ncpa+jco, ncn+jca, ncn+jca+jca, ncn+jca+jcc, ncn+jca+jcj, ncn+jca+jcm, ncn+jca+jcs, ncn+jca+jct, ncn+jca+jp+ecc, ncn+jca+jp+ecs, ncn+jca+jp+ef, ncn+jca+jp+ep+ef, ncn+jca+jp+etm, ncn+jca+jp+etn+jxt, ncn+jca+jxc, ncn+jca+jxc+jcc, ncn+jca+jxc+jcm, ncn+jca+jxc+jxc, ncn+jca+jxc+jxt, ncn+jca+jxt, ncn+jcc, ncn+jcc+jxc, ncn+jcj, ncn+jcj+jxt, ncn+jcm, ncn+jco, ncn+jcr, ncn+jcr+jxc, ncn+jcs, ncn+jcs+jxt, ncn+jct, ncn+jct+jcm, ncn+jct+jxc, ncn+jct+jxt, ncn+jcv, ncn+jp+ecc, ncn+jp+ecc+jct, ncn+jp+ecc+jxc, ncn+jp+ecs, ncn+jp+ecs+jcm, ncn+jp+ecs+jco, ncn+jp+ecs+jxc, ncn+jp+ecs+jxt, ncn+jp+ecx, ncn+jp+ef, ncn+jp+ef+jca, ncn+jp+ef+jcm, ncn+jp+ef+jco, ncn+jp+ef+jcr, ncn+jp+ef+jcr+jxc, ncn+jp+ef+jcr+jxt, ncn+jp+ef+jp+etm, ncn+jp+ef+jxc, ncn+jp+ef+jxf, ncn+jp+ef+jxt, ncn+jp+ep+ecc, ncn+jp+ep+ecs, ncn+jp+ep+ecs+jxc, ncn+jp+ep+ecx, ncn+jp+ep+ef, ncn+jp+ep+ef+jcr, ncn+jp+ep+ef+jcr+jxc, ncn+jp+ep+ef+jxc, ncn+jp+ep+ef+jxf, ncn+jp+ep+ef+jxt, ncn+jp+ep+ep+etm, ncn+jp+ep+etm, ncn+jp+ep+etn, ncn+jp+ep+etn+jca, ncn+jp+ep+etn+jca+jxc, ncn+jp+ep+etn+jco, ncn+jp+ep+etn+jcs, ncn+jp+ep+etn+jxt, ncn+jp+etm, ncn+jp+etn, ncn+jp+etn+jca, ncn+jp+etn+jca+jxc, ncn+jp+etn+jca+jxt, ncn+jp+etn+jco, ncn+jp+etn+jcs, ncn+jp+etn+jct, ncn+jp+etn+jxc, ncn+jp+etn+jxt, ncn+jxc, ncn+jxc+jca, ncn+jxc+jca+jxc, ncn+jxc+jca+jxt, ncn+jxc+jcc, ncn+jxc+jcm, ncn+jxc+jco, ncn+jxc+jcs, ncn+jxc+jct+jxt, ncn+jxc+jp+ef, 
ncn+jxc+jp+ef+jcr, ncn+jxc+jp+ep+ecs, ncn+jxc+jp+ep+ef, ncn+jxc+jp+etm, ncn+jxc+jxc, ncn+jxc+jxt, ncn+jxt, ncn+jxt+jcm, ncn+jxt+jxc, ncn+nbn, ncn+nbn+jca, ncn+nbn+jcm, ncn+nbn+jcs, ncn+nbn+jp+ecc, ncn+nbn+jp+ep+ef, ncn+nbn+jxc, ncn+nbn+jxt, ncn+nbu, ncn+nbu+jca, ncn+nbu+jcm, ncn+nbu+jco, ncn+nbu+jp+ef, ncn+nbu+jxc, ncn+nbu+ncn, ncn+ncn, ncn+ncn+jca, ncn+ncn+jca+jcc, ncn+ncn+jca+jcm, ncn+ncn+jca+jxc, ncn+ncn+jca+jxc+jcm, ncn+ncn+jca+jxc+jxc, ncn+ncn+jca+jxt, ncn+ncn+jcc, ncn+ncn+jcj, ncn+ncn+jcm, ncn+ncn+jco, ncn+ncn+jcr, ncn+ncn+jcs, ncn+ncn+jct, ncn+ncn+jct+jcm, ncn+ncn+jct+jxc, ncn+ncn+jct+jxt, ncn+ncn+jp+ecc, ncn+ncn+jp+ecs, ncn+ncn+jp+ef, ncn+ncn+jp+ef+jcm, ncn+ncn+jp+ef+jcr, ncn+ncn+jp+ef+jcs, ncn+ncn+jp+ep+ecc, ncn+ncn+jp+ep+ecs, ncn+ncn+jp+ep+ef, ncn+ncn+jp+ep+ef+jcr, ncn+ncn+jp+ep+ep+etm, ncn+ncn+jp+ep+etm, ncn+ncn+jp+ep+etn, ncn+ncn+jp+etm, ncn+ncn+jp+etn, ncn+ncn+jp+etn+jca, ncn+ncn+jp+etn+jco, ncn+ncn+jp+etn+jxc, ncn+ncn+jxc, ncn+ncn+jxc+jca, ncn+ncn+jxc+jcc, ncn+ncn+jxc+jcm, ncn+ncn+jxc+jco, ncn+ncn+jxc+jcs, ncn+ncn+jxc+jxc, ncn+ncn+jxt, ncn+ncn+nbn, ncn+ncn+ncn, ncn+ncn+ncn+jca, ncn+ncn+ncn+jca+jcm, ncn+ncn+ncn+jca+jxt, ncn+ncn+ncn+jcj, ncn+ncn+ncn+jcm, ncn+ncn+ncn+jco, ncn+ncn+ncn+jcs, ncn+ncn+ncn+jct+jxt, ncn+ncn+ncn+jp+etn+jxc, ncn+ncn+ncn+jxt, ncn+ncn+ncn+ncn+jca, ncn+ncn+ncn+ncn+jca+jxt, ncn+ncn+ncn+ncn+jco, ncn+ncn+ncn+xsn+jp+etm, ncn+ncn+ncpa, ncn+ncn+ncpa+jca, ncn+ncn+ncpa+jcm, ncn+ncn+ncpa+jco, ncn+ncn+ncpa+jcs, ncn+ncn+ncpa+jxc, ncn+ncn+ncpa+jxt, ncn+ncn+ncpa+ncn, ncn+ncn+ncpa+ncn+jca, ncn+ncn+ncpa+ncn+jcj, ncn+ncn+ncpa+ncn+jcm, ncn+ncn+ncpa+ncn+jxt, ncn+ncn+xsn, ncn+ncn+xsn+jca, ncn+ncn+xsn+jca+jxt, ncn+ncn+xsn+jcj, ncn+ncn+xsn+jcm, ncn+ncn+xsn+jco, ncn+ncn+xsn+jcs, ncn+ncn+xsn+jct, ncn+ncn+xsn+jp+ecs, ncn+ncn+xsn+jp+ep+ef, ncn+ncn+xsn+jp+etm, ncn+ncn+xsn+jxc, ncn+ncn+xsn+jxc+jcs, ncn+ncn+xsn+jxt, ncn+ncn+xsv+ecc, ncn+ncn+xsv+etm, ncn+ncpa, ncn+ncpa+jca, ncn+ncpa+jca+jcm, ncn+ncpa+jca+jxc, ncn+ncpa+jca+jxt, ncn+ncpa+jcc, ncn+ncpa+jcj, 
ncn+ncpa+jcm, ncn+ncpa+jco, ncn+ncpa+jcr, ncn+ncpa+jcs, ncn+ncpa+jct, ncn+ncpa+jct+jcm, ncn+ncpa+jct+jxt, ncn+ncpa+jp+ecc, ncn+ncpa+jp+ecc+jxc, ncn+ncpa+jp+ecs, ncn+ncpa+jp+ecs+jxc, ncn+ncpa+jp+ef, ncn+ncpa+jp+ef+jcr, ncn+ncpa+jp+ef+jcr+jxc, ncn+ncpa+jp+ep+ef, ncn+ncpa+jp+ep+etm, ncn+ncpa+jp+ep+etn, ncn+ncpa+jp+etm, ncn+ncpa+jxc, ncn+ncpa+jxc+jca+jxc, ncn+ncpa+jxc+jco, ncn+ncpa+jxc+jcs, ncn+ncpa+jxt, ncn+ncpa+nbn+jcs, ncn+ncpa+ncn, ncn+ncpa+ncn+jca, ncn+ncpa+ncn+jca+jcm, ncn+ncpa+ncn+jca+jxc, ncn+ncpa+ncn+jca+jxt, ncn+ncpa+ncn+jcj, ncn+ncpa+ncn+jcm, ncn+ncpa+ncn+jco, ncn+ncpa+ncn+jcs, ncn+ncpa+ncn+jct, ncn+ncpa+ncn+jct+jcm, ncn+ncpa+ncn+jp+ef+jcr, ncn+ncpa+ncn+jp+ep+...

ja_core_news_trf-3.7.2

01 Oct 09:30
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: c278a19f126a705584206df5b25c773ffb45509fd0df11e38f86e34206a691f9
Checksum .whl: 85fb7bdb04bb7308ff8b728f6ecbceda198f1def857e6a922bd87ae089933d31

Details: https://spacy.io/models/ja#ja_core_news_trf

Japanese transformer pipeline (Transformer(name='cl-tohoku/bert-base-japanese-char-v2', piece_encoder='char', stride=160, type='bert', width=768, window=216, vocab_size=6144)). Components: transformer, morphologizer, parser, ner.

Feature Description
Name ja_core_news_trf
Version 3.7.2
spaCy >=3.7.0,<3.8.0
Default Pipeline transformer, morphologizer, parser, attribute_ruler, ner
Components transformer, morphologizer, parser, attribute_ruler, ner
Vectors 0 keys, 0 unique vectors (0 dimensions)
Sources UD Japanese GSD v2.8 (Omura, Mai; Miyao, Yusuke; Kanayama, Hiroshi; Matsuda, Hiroshi; Wakasa, Aya; Yamashita, Kayo; Asahara, Masayuki; Tanaka, Takaaki; Murawaki, Yugo; Matsumoto, Yuji; Mori, Shinsuke; Uematsu, Sumire; McDonald, Ryan; Nivre, Joakim; Zeman, Daniel)
UD Japanese GSD v2.8 NER (Megagon Labs Tokyo)
cl-tohoku/bert-base-japanese-char-v2 (Inui Laboratory, Tohoku University)
License CC BY-SA 3.0
Author Explosion
Model size 320 MB

Label Scheme

View label scheme (64 labels for 3 components)
Component Labels
morphologizer POS=NOUN, POS=ADP, POS=VERB, POS=SCONJ, POS=AUX, POS=PUNCT, POS=PART, POS=DET, POS=NUM, POS=ADV, POS=PRON, POS=ADJ, POS=PROPN, POS=CCONJ, POS=SYM, POS=NOUN|Polarity=Neg, POS=AUX|Polarity=Neg, POS=INTJ, POS=SCONJ|Polarity=Neg
parser ROOT, acl, advcl, advmod, amod, aux, case, cc, ccomp, compound, cop, csubj, dep, det, dislocated, fixed, mark, nmod, nsubj, nummod, obj, obl, punct
ner CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, MOVEMENT, NORP, ORDINAL, ORG, PERCENT, PERSON, PET_NAME, PHONE, PRODUCT, QUANTITY, TIME, TITLE_AFFIX, WORK_OF_ART
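The morphologizer labels above combine a coarse part-of-speech tag with optional features, written as `Key=Value` pairs separated by `|`. A minimal sketch of turning one label into a dictionary; `parse_morph_label` is a hypothetical helper written for illustration, not a spaCy function:

```python
def parse_morph_label(label):
    """Parse 'POS=AUX|Polarity=Neg' into {'POS': 'AUX', 'Polarity': 'Neg'}."""
    pairs = (item.split("=", 1) for item in label.split("|") if "=" in item)
    return {key: value for key, value in pairs}


# Labels taken from the morphologizer row above.
for label in ["POS=NOUN", "POS=AUX|Polarity=Neg"]:
    print(label, "->", parse_morph_label(label))
```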

Accuracy

Type Score
TOKEN_ACC 99.37
TOKEN_P 97.64
TOKEN_R 97.88
TOKEN_F 97.76
POS_ACC 97.94
MORPH_ACC 0.00
MORPH_MICRO_P 34.01
MORPH_MICRO_R 98.04
MORPH_MICRO_F 50.51
SENTS_P 93.18
SENTS_R 97.04
SENTS_F 95.07
DEP_UAS 93.05
DEP_LAS 91.78
TAG_ACC 97.13
LEMMA_ACC 96.70
ENTS_P 82.27
ENTS_R 84.65
ENTS_F 83.45

Installation

pip install spacy
python -m spacy download ja_core_news_trf
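Once the download command has run, the pipeline is available as an importable Python package with the same name. A hedged sketch of checking for it and loading it; the helper names are hypothetical, and the script assumes spaCy and the pipeline were installed with the commands above:

```python
import importlib.util


def is_pipeline_installed(name):
    """Return True if a package with this name can be found for import."""
    return importlib.util.find_spec(name) is not None


def load_pipeline(name="ja_core_news_trf"):
    """Load an installed pipeline; spaCy itself is imported lazily."""
    if not is_pipeline_installed(name):
        raise RuntimeError(
            f"{name} is not installed; run: python -m spacy download {name}"
        )
    import spacy  # requires `pip install spacy`

    return spacy.load(name)


if __name__ == "__main__":
    if is_pipeline_installed("ja_core_news_trf"):
        nlp = load_pipeline()
        print(nlp.pipe_names)  # component names in pipeline order
    else:
        print("ja_core_news_trf is not installed")
```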

ja_core_news_sm-3.7.0

01 Oct 08:59
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: 5cb1c87cd0551404a03fd630e824dfcb32e793f2ae5331a196d6e346749bdb2d
Checksum .whl: 1191e5bbffcc90670146616c274a64850e54d12070bc5846e78a094f2f6fcfca

Details: https://spacy.io/models/ja#ja_core_news_sm

Japanese pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler.

Feature Description
Name ja_core_news_sm
Version 3.7.0
spaCy >=3.7.0,<3.8.0
Default Pipeline tok2vec, morphologizer, parser, attribute_ruler, ner
Components tok2vec, morphologizer, parser, senter, attribute_ruler, ner
Vectors 0 keys, 0 unique vectors (0 dimensions)
Sources UD Japanese GSD v2.8 (Omura, Mai; Miyao, Yusuke; Kanayama, Hiroshi; Matsuda, Hiroshi; Wakasa, Aya; Yamashita, Kayo; Asahara, Masayuki; Tanaka, Takaaki; Murawaki, Yugo; Matsumoto, Yuji; Mori, Shinsuke; Uematsu, Sumire; McDonald, Ryan; Nivre, Joakim; Zeman, Daniel)
UD Japanese GSD v2.8 NER (Megagon Labs Tokyo)
License CC BY-SA 4.0
Author Explosion
Model size 11 MB

Label Scheme

View label scheme (65 labels for 3 components)
Component Labels
morphologizer POS=NOUN, POS=ADP, POS=VERB, POS=SCONJ, POS=AUX, POS=PUNCT, POS=PART, POS=DET, POS=NUM, POS=ADV, POS=PRON, POS=ADJ, POS=PROPN, POS=CCONJ, POS=SYM, POS=NOUN|Polarity=Neg, POS=AUX|Polarity=Neg, POS=SPACE, POS=INTJ, POS=SCONJ|Polarity=Neg
parser ROOT, acl, advcl, advmod, amod, aux, case, cc, ccomp, compound, cop, csubj, dep, det, dislocated, fixed, mark, nmod, nsubj, nummod, obj, obl, punct
ner CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, MOVEMENT, NORP, ORDINAL, ORG, PERCENT, PERSON, PET_NAME, PHONE, PRODUCT, QUANTITY, TIME, TITLE_AFFIX, WORK_OF_ART

Accuracy

Type Score
TOKEN_ACC 99.37
TOKEN_P 97.64
TOKEN_R 97.88
TOKEN_F 97.76
POS_ACC 96.13
MORPH_ACC 0.00
MORPH_MICRO_P 34.01
MORPH_MICRO_R 98.04
MORPH_MICRO_F 50.51
SENTS_P 98.04
SENTS_R 98.62
SENTS_F 98.33
DEP_UAS 91.95
DEP_LAS 90.48
TAG_ACC 97.13
LEMMA_ACC 96.70
ENTS_P 71.09
ENTS_R 57.23
ENTS_F 63.41
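The `*_F` rows are consistent with the usual F1 definition, the harmonic mean of the matching `*_P` and `*_R` rows. That is an assumption about how the scores were produced; the release notes do not state the formula. A short sketch that recomputes two of the values from the table above:

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# SENTS_P and SENTS_R for ja_core_news_sm, taken from the table above.
print(round(f_score(98.04, 98.62), 2))
# ENTS_P and ENTS_R from the same table.
print(round(f_score(71.09, 57.23), 2))
```

Because the published precision and recall are already rounded to two decimals, a recomputed F value can differ from the published one in the last digit.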

Installation

pip install spacy
python -m spacy download ja_core_news_sm