Releases: explosion/spacy-models
mk_core_news_md-3.7.0
Checksum .tar.gz: ddffbbf27be644f8ba92a7ab70b35e2447639cfd65d6891a4ca05931ddbf88c6
Checksum .whl: 06ab12bc3a833b7b7562aeb839740211a04f913be7562e1e8520fd440cc2e9de
Details: https://spacy.io/models/mk#mk_core_news_md
Macedonian pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.
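The checksums listed for this release are 64 hexadecimal characters long, which is consistent with SHA-256 digests. The following is a minimal sketch for checking a downloaded archive, assuming SHA-256 and a hypothetical local filename; the helper names `sha256_of` and `matches_checksum` are illustrative and not part of spaCy.

```python
import hashlib
from pathlib import Path

# Expected digest copied from the "Checksum .tar.gz" entry of mk_core_news_md-3.7.0.
EXPECTED_TAR_GZ = "ddffbbf27be644f8ba92a7ab70b35e2447639cfd65d6891a4ca05931ddbf88c6"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Hypothetical local path; adjust to wherever the archive was saved.
    archive = Path("mk_core_news_md-3.7.0.tar.gz")
    if archive.exists():
        print("OK" if matches_checksum(archive, EXPECTED_TAR_GZ) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

The same helper can be pointed at the `.whl` file with the corresponding digest.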
Feature | Description |
---|---|
Name | mk_core_news_md |
Version | 3.7.0 |
spaCy | >=3.7.0,<3.8.0 |
Default Pipeline | morphologizer , parser , attribute_ruler , lemmatizer , ner |
Components | morphologizer , parser , senter , attribute_ruler , lemmatizer , ner |
Vectors | 274587 keys, 20000 unique vectors (300 dimensions) |
Sources | Macedonian Corpus (Damjan Zlatinov, Melanija Gerasimovska, Borijan Georgievski, Marija Todosovska) spaCy lookups data (Explosion) Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia) (Explosion) |
License | CC BY-SA 4.0 |
Author | Explosion |
Model size | 42 MB |
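The `spaCy` row gives the compatible version range `>=3.7.0,<3.8.0`. As a rough illustration of what that range admits, the sketch below checks a version string with plain tuple comparison. The helper names are hypothetical, it only handles simple `X.Y.Z` release versions (no pre-release suffixes), and it is not how spaCy itself validates compatibility.

```python
def parse_version(text):
    """Turn 'X.Y.Z' into a tuple of ints, e.g. '3.7.2' -> (3, 7, 2)."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, spec):
    """Check a version against a comma-separated spec such as '>=3.7.0,<3.8.0'."""
    ops = {
        ">=": lambda a, b: a >= b,
        "<=": lambda a, b: a <= b,
        "==": lambda a, b: a == b,
        ">": lambda a, b: a > b,
        "<": lambda a, b: a < b,
    }
    current = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        # Try two-character operators before their one-character prefixes.
        for symbol in (">=", "<=", "==", ">", "<"):
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                if not ops[symbol](current, bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


SPEC = ">=3.7.0,<3.8.0"  # copied from the spaCy row of the table
for candidate in ("3.6.1", "3.7.0", "3.7.4", "3.8.0"):
    print(candidate, satisfies(candidate, SPEC))
```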
Label Scheme
View label scheme (54 labels for 3 components)
Component | Labels |
---|---|
morphologizer | POS=PROPN , POS=AUX , POS=ADJ , POS=NOUN , POS=ADP , POS=PUNCT , POS=CONJ , POS=NUM , POS=VERB , POS=PRON , POS=ADV , POS=SCONJ , POS=PART , POS=SYM , _ , POS=SPACE , POS=X , POS=INTJ |
parser | ROOT , advmod , att , aux , cc , dep , det , dobj , iobj , neg , nsubj , pobj , poss , pozm , pozv , prep , punct , relcl |
ner | CARDINAL , DATE , EVENT , FAC , GPE , LANGUAGE , LAW , LOC , MONEY , NORP , ORDINAL , ORG , PERCENT , PERSON , PRODUCT , QUANTITY , TIME , WORK_OF_ART |
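The summary line for this label scheme reports 54 labels for 3 components. As a quick consistency check, the sketch below splits the label strings copied from the table and counts them per component; the variable and function names are illustrative only.

```python
# Label strings copied from the mk_core_news_md label scheme table.
LABELS = {
    "morphologizer": (
        "POS=PROPN , POS=AUX , POS=ADJ , POS=NOUN , POS=ADP , POS=PUNCT , "
        "POS=CONJ , POS=NUM , POS=VERB , POS=PRON , POS=ADV , POS=SCONJ , "
        "POS=PART , POS=SYM , _ , POS=SPACE , POS=X , POS=INTJ"
    ),
    "parser": (
        "ROOT , advmod , att , aux , cc , dep , det , dobj , iobj , neg , "
        "nsubj , pobj , poss , pozm , pozv , prep , punct , relcl"
    ),
    "ner": (
        "CARDINAL , DATE , EVENT , FAC , GPE , LANGUAGE , LAW , LOC , MONEY , "
        "NORP , ORDINAL , ORG , PERCENT , PERSON , PRODUCT , QUANTITY , TIME , "
        "WORK_OF_ART"
    ),
}


def split_labels(text):
    """Split a comma-separated label string and drop surrounding whitespace."""
    return [label.strip() for label in text.split(",") if label.strip()]


counts = {name: len(split_labels(text)) for name, text in LABELS.items()}
total = sum(counts.values())
print(counts, total)
```

Each component contributes 18 labels, which adds up to the 54 stated in the summary.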
Accuracy
Type | Score |
---|---|
TOKEN_ACC | 100.00 |
TOKEN_P | 100.00 |
TOKEN_R | 100.00 |
TOKEN_F | 100.00 |
SENTS_P | 80.00 |
SENTS_R | 67.53 |
SENTS_F | 73.24 |
DEP_UAS | 67.71 |
DEP_LAS | 52.01 |
ENTS_P | 74.72 |
ENTS_R | 74.47 |
ENTS_F | 74.60 |
POS_ACC | 92.61 |
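The `_P`, `_R` and `_F` rows report precision, recall and F-score. Assuming the F-score is the usual harmonic mean of precision and recall (F1), the reported `SENTS_F` for `mk_core_news_md` can be reproduced from the rounded table values. Other rows may differ in the last digit, because the published precision and recall are themselves already rounded.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# SENTS_P and SENTS_R taken from the accuracy table for mk_core_news_md.
sents_f = f_score(80.00, 67.53)
print(round(sents_f, 2))  # 73.24, matching the SENTS_F row
```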
Installation
pip install spacy
python -m spacy download mk_core_news_md
mk_core_news_lg-3.7.0
Checksum .tar.gz: 1d537095831c7861e9a3ccddc05da9b95c71b4f37f7627f15a6c4304e4f5503b
Checksum .whl: 3fcd0244d5b2d5b4f74ccff27b495814bfefcfb5e7c96b49a44e86a51d2e0408
Details: https://spacy.io/models/mk#mk_core_news_lg
Macedonian pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.
Feature | Description |
---|---|
Name | mk_core_news_lg |
Version | 3.7.0 |
spaCy | >=3.7.0,<3.8.0 |
Default Pipeline | morphologizer , parser , attribute_ruler , lemmatizer , ner |
Components | morphologizer , parser , senter , attribute_ruler , lemmatizer , ner |
Vectors | 274587 keys, 274587 unique vectors (300 dimensions) |
Sources | Macedonian Corpus (Damjan Zlatinov, Melanija Gerasimovska, Borijan Georgievski, Marija Todosovska) spaCy lookups data (Explosion) Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia) (Explosion) |
License | CC BY-SA 4.0 |
Author | Explosion |
Model size | 310 MB |
Label Scheme
View label scheme (54 labels for 3 components)
Component | Labels |
---|---|
morphologizer | POS=PROPN , POS=AUX , POS=ADJ , POS=NOUN , POS=ADP , POS=PUNCT , POS=CONJ , POS=NUM , POS=VERB , POS=PRON , POS=ADV , POS=SCONJ , POS=PART , POS=SYM , _ , POS=SPACE , POS=X , POS=INTJ |
parser | ROOT , advmod , att , aux , cc , dep , det , dobj , iobj , neg , nsubj , pobj , poss , pozm , pozv , prep , punct , relcl |
ner | CARDINAL , DATE , EVENT , FAC , GPE , LANGUAGE , LAW , LOC , MONEY , NORP , ORDINAL , ORG , PERCENT , PERSON , PRODUCT , QUANTITY , TIME , WORK_OF_ART |
Accuracy
Type | Score |
---|---|
TOKEN_ACC | 100.00 |
TOKEN_P | 100.00 |
TOKEN_R | 100.00 |
TOKEN_F | 100.00 |
SENTS_P | 70.42 |
SENTS_R | 64.94 |
SENTS_F | 67.57 |
DEP_UAS | 67.84 |
DEP_LAS | 52.98 |
ENTS_P | 75.06 |
ENTS_R | 75.06 |
ENTS_F | 75.06 |
POS_ACC | 93.09 |
Installation
pip install spacy
python -m spacy download mk_core_news_lg
lt_core_news_sm-3.7.0
Checksum .tar.gz: 4b3af0a6e692325b1368d17a94ab8d68d654215d84c5aaac58820dc3da5504c8
Checksum .whl: 8ff55726043b411b9547824b465487901377ac12e0511b2ed579cec8d111fe69
Details: https://spacy.io/models/lt#lt_core_news_sm
Lithuanian pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, parser, lemmatizer (trainable_lemmatizer), senter, ner.
Feature | Description |
---|---|
Name | lt_core_news_sm |
Version | 3.7.0 |
spaCy | >=3.7.0,<3.8.0 |
Default Pipeline | tok2vec , morphologizer , tagger , parser , lemmatizer , attribute_ruler , ner |
Components | tok2vec , morphologizer , tagger , parser , lemmatizer , senter , attribute_ruler , ner |
Vectors | 0 keys, 0 unique vectors (0 dimensions) |
Sources | UD Lithuanian ALKSNIS v2.8 (Utka, Andrius; Rimkutė, Erika; Bielinskienė, Agnė; Kovalevskaitė, Jolanta; Boizou, Loïc; Aleksandravičiūtė, Gabrielė; Brokaitė, Kristina; Zeman, Daniel; Perkova, Natalia; Griciūtė, Bernadeta) TokenMill NER Corpus (TokenMill) |
License | CC BY-SA 4.0 |
Author | Explosion |
Model size | 12 MB |
Label Scheme
View label scheme (1669 labels for 4 components)
Component | Labels |
---|---|
morphologizer |
Definite=Ind|Gender=Neut|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Pass , POS=VERB|Polarity=Pos|VerbForm=Inf , Case=Gen|Definite=Def|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ , Case=Gen|Gender=Fem|Number=Plur|POS=NOUN , Case=Gen|Gender=Masc|Number=Plur|POS=NOUN , Case=Acc|Gender=Masc|Number=Plur|POS=NOUN , POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Ger , Case=Gen|Gender=Masc|Number=Sing|POS=NOUN , POS=CCONJ , POS=PUNCT , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|PronType=Ind , Case=Acc|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ , Case=Acc|Gender=Fem|Number=Plur|POS=NOUN , Case=Loc|Gender=Fem|Number=Plur|POS=NOUN , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Act , Case=Acc|Gender=Masc|Number=Sing|POS=NOUN , Case=Acc|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind , Case=Acc|Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Dem , Case=Acc|Gender=Fem|Number=Sing|POS=NOUN , Aspect=Perf|Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin , Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Act , Case=Gen|Gender=Fem|Number=Sing|POS=NOUN , Case=Nom|Gender=Masc|Number=Sing|POS=NOUN , Abbr=Yes|POS=X , AdpType=Prep|Case=Gen|POS=ADP , Case=Gen|Gender=Masc|Number=Sing|POS=PROPN , Case=Nom|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ , Case=Nom|Gender=Fem|Number=Plur|POS=NOUN , Case=Ins|Gender=Masc|Number=Sing|POS=NOUN , Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Case=Acc|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Mood=Cnd|POS=VERB|Person=3|Polarity=Pos|VerbForm=Fin , Case=Gen|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind , Case=Nom|Gender=Masc|Number=Plur|POS=NOUN , Mood=Ind|Number=Plur|POS=AUX|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Degree=Pos|POS=ADV , 
Case=Nom|Definite=Ind|Degree=Pos|Gender=Masc|Number=Plur|POS=ADJ , Degree=Pos|Hyph=Yes|POS=ADV , Hyph=Yes|POS=X , Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs , POS=SCONJ , Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Case=Nom|Definite=Ind|POS=PRON|PronType=Ind , Case=Nom|Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Dem , Case=Nom|Gender=Fem|Number=Sing|POS=NOUN , Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Reflex=Yes|Tense=Pres|VerbForm=Fin , Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Case=Acc|Definite=Ind|POS=PRON|PronType=Ind , POS=PART , Case=Gen|Definite=Ind|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ , Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Reflex=Yes|Tense=Pres|VerbForm=Fin , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Dem , Case=Ins|Gender=Masc|NumForm=Word|NumType=Card|POS=NUM , Case=Ins|Gender=Masc|Number=Plur|POS=NOUN , Case=Ins|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Definite=Ind|Gender=Neut|POS=DET|PronType=Dem , Mood=Ind|POS=AUX|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Definite=Ind|Degree=Pos|Gender=Neut|POS=ADJ , Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|PronType=Ind , Case=Nom|Gender=Masc|Number=Sing|POS=PROPN , Case=Loc|Definite=Ind|Gender=Fem|Number=Sing|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Pass , Case=Gen|Definite=Def|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Case=Loc|Gender=Fem|Number=Sing|POS=NOUN , Aspect=Perf|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Ger , Case=Dat|Gender=Masc|Number=Sing|POS=NOUN , Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin , Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin , POS=VERB|Polarity=Pos|Reflex=Yes|VerbForm=Inf , Degree=Cmp|POS=ADV , Case=Gen|Gender=Fem|Number=Sing|POS=PROPN , Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin , 
Case=Gen|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem , Mood=Ind|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin , Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin , Case=Gen|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|PronType=Ind , Definite=Ind|NumForm=Digit|POS=NUM , Case=Gen|Gender=Masc|NumForm=Word|NumType=Card|Number=Plur|POS=NUM , Case=Nom|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Case=Acc|Gender=Masc|NumForm=Word|NumType=Card|Number=Sing|POS=NUM , Case=Dat|Definite=Ind|Number=Sing|POS=PRON|Person=1|PronType=Prs , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Tot , Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Pass , Case=Loc|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind , Case=Nom|Gender=Masc|NumForm=Word|NumType=Card|Number=Plur|POS=NUM , NumForm=Word|NumType=Card|POS=NUM , Case=Nom|Definite=Ind|Gender=Fem|Hyph=Yes|Number=Plur|POS=DET|PronType=Dem , Mood=Ind|Number=Sing|POS=VERB|Person=1|Polarity=Pos|Tense=Pres|VerbForm=Fin , Case=Nom|Definite=Def|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Int,Rel , Case=Acc|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem , Case=Dat|Gender=Masc|Number=Plur|POS=NOUN , Case=Nom|Gender=Fem|Number=Sing|POS=PROPN , Case=Nom|Definite=Ind|Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs , Hyph=Yes|POS=PART , Mood=Cnd|Number=Sing|POS=AUX|Person=3|Polarity=Pos|VerbForm=Fin , Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs , Case=Loc|Gender=Masc|Number=Sing|POS=NOUN , AdpType=Prep|Case=Acc|POS=ADP , Mood=Cnd|Number=Sing|POS=VERB|Person=3|Polarity=Pos|VerbForm=Fin , Case=Gen|Definite=Def|Gender=Fem|NumForm=Combi|NumType=Ord|Number=Sing|POS=NUM , Case=Nom|Definite=Def|Gender=Fem|NumForm=Word|NumType=Ord|Number=Sing|POS=NUM , 
Aspect=Perf|Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin , Case=Acc|Definite=Ind|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ , Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin , Definite=Ind|NumForm=Roman|POS=NUM , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs , Case=Gen|Definite=Ind|Degree=Pos|Gender=Masc|Number=Plur|POS=ADJ , Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Part|Voice=Act , Case=Gen|Definite=Ind|Gender=Fem|NumForm=Word|NumType=Ord|Number=Sing|POS=NUM , Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem , Case=Nom|Definite=Ind|Gender=Masc|Mood=Nec|Number=Sing|POS=VERB|Polarity=Pos|VerbForm=Part , Case=Nom|Definite=Ind|Degree=Cmp|Gender=Fem|Number=Plur|POS=ADJ , Aspect=Perf|Case=Acc|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Part|Voice=Act , Case=Dat|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs , Aspect=Perf|Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin , Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Act , Case=Gen|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ , Case=Nom|Definite=Ind|Gender=Fem|Number=Plur|POS=DET|PronType=Int,Rel , Degree=Sup|POS=ADV , `Case=Nom|Definite=Ind|Gender=Fem|Number=Plur|POS=VE... |
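Each morphologizer label listed for this pipeline is a `|`-separated list of `Key=Value` pairs, with the coarse part-of-speech tag stored under `POS`. The following is a minimal sketch of splitting one such label into its parts, using plain string handling rather than spaCy's own API; the function name `parse_morph_label` is hypothetical.

```python
def parse_morph_label(label):
    """Split 'Case=Gen|Gender=Fem|POS=NOUN' into (pos, {feature: value})."""
    features = {}
    for pair in label.strip().split("|"):
        if "=" not in pair:
            continue  # skip fragments without a key, such as a bare '_'
        key, value = pair.split("=", 1)
        features[key] = value
    # The POS entry is the coarse tag; the rest are morphological features.
    pos = features.pop("POS", None)
    return pos, features


# One label copied verbatim from the lt_core_news_sm morphologizer list.
pos, feats = parse_morph_label("Case=Gen|Gender=Fem|Number=Plur|POS=NOUN")
print(pos, feats)
```

Values that contain commas, such as `PronType=Int,Rel`, are kept as a single string in this sketch.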
lt_core_news_md-3.7.0
Checksum .tar.gz: e8a5ba3577190133eaa9a9a6764dad4a40a72008021a59f04942135e64d42784
Checksum .whl: bf793c09c47795fad7fe1a4d310a885abf6df662499ffa4be6102649782a4f82
Details: https://spacy.io/models/lt#lt_core_news_md
Lithuanian pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, parser, lemmatizer (trainable_lemmatizer), senter, ner.
Feature | Description |
---|---|
Name | lt_core_news_md |
Version | 3.7.0 |
spaCy | >=3.7.0,<3.8.0 |
Default Pipeline | tok2vec , morphologizer , tagger , parser , lemmatizer , attribute_ruler , ner |
Components | tok2vec , morphologizer , tagger , parser , lemmatizer , senter , attribute_ruler , ner |
Vectors | 500000 keys, 20000 unique vectors (300 dimensions) |
Sources | UD Lithuanian ALKSNIS v2.8 (Utka, Andrius; Rimkutė, Erika; Bielinskienė, Agnė; Kovalevskaitė, Jolanta; Boizou, Loïc; Aleksandravičiūtė, Gabrielė; Brokaitė, Kristina; Zeman, Daniel; Perkova, Natalia; Griciūtė, Bernadeta) TokenMill NER Corpus (TokenMill) Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia) (Explosion) |
License | CC BY-SA 4.0 |
Author | Explosion |
Model size | 40 MB |
Label Scheme
View label scheme (1669 labels for 4 components)
Component | Labels |
---|---|
morphologizer |
Definite=Ind|Gender=Neut|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Pass , POS=VERB|Polarity=Pos|VerbForm=Inf , Case=Gen|Definite=Def|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ , Case=Gen|Gender=Fem|Number=Plur|POS=NOUN , Case=Gen|Gender=Masc|Number=Plur|POS=NOUN , Case=Acc|Gender=Masc|Number=Plur|POS=NOUN , POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Ger , Case=Gen|Gender=Masc|Number=Sing|POS=NOUN , POS=CCONJ , POS=PUNCT , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|PronType=Ind , Case=Acc|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ , Case=Acc|Gender=Fem|Number=Plur|POS=NOUN , Case=Loc|Gender=Fem|Number=Plur|POS=NOUN , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Act , Case=Acc|Gender=Masc|Number=Sing|POS=NOUN , Case=Acc|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind , Case=Acc|Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Dem , Case=Acc|Gender=Fem|Number=Sing|POS=NOUN , Aspect=Perf|Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin , Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Act , Case=Gen|Gender=Fem|Number=Sing|POS=NOUN , Case=Nom|Gender=Masc|Number=Sing|POS=NOUN , Abbr=Yes|POS=X , AdpType=Prep|Case=Gen|POS=ADP , Case=Gen|Gender=Masc|Number=Sing|POS=PROPN , Case=Nom|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ , Case=Nom|Gender=Fem|Number=Plur|POS=NOUN , Case=Ins|Gender=Masc|Number=Sing|POS=NOUN , Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Case=Acc|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Mood=Cnd|POS=VERB|Person=3|Polarity=Pos|VerbForm=Fin , Case=Gen|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind , Case=Nom|Gender=Masc|Number=Plur|POS=NOUN , Mood=Ind|Number=Plur|POS=AUX|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Degree=Pos|POS=ADV , 
Case=Nom|Definite=Ind|Degree=Pos|Gender=Masc|Number=Plur|POS=ADJ , Degree=Pos|Hyph=Yes|POS=ADV , Hyph=Yes|POS=X , Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs , POS=SCONJ , Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Case=Nom|Definite=Ind|POS=PRON|PronType=Ind , Case=Nom|Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Dem , Case=Nom|Gender=Fem|Number=Sing|POS=NOUN , Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Reflex=Yes|Tense=Pres|VerbForm=Fin , Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Case=Acc|Definite=Ind|POS=PRON|PronType=Ind , POS=PART , Case=Gen|Definite=Ind|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ , Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Reflex=Yes|Tense=Pres|VerbForm=Fin , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Dem , Case=Ins|Gender=Masc|NumForm=Word|NumType=Card|POS=NUM , Case=Ins|Gender=Masc|Number=Plur|POS=NOUN , Case=Ins|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Definite=Ind|Gender=Neut|POS=DET|PronType=Dem , Mood=Ind|POS=AUX|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Definite=Ind|Degree=Pos|Gender=Neut|POS=ADJ , Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|PronType=Ind , Case=Nom|Gender=Masc|Number=Sing|POS=PROPN , Case=Loc|Definite=Ind|Gender=Fem|Number=Sing|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Pass , Case=Gen|Definite=Def|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Case=Loc|Gender=Fem|Number=Sing|POS=NOUN , Aspect=Perf|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Ger , Case=Dat|Gender=Masc|Number=Sing|POS=NOUN , Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin , Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin , POS=VERB|Polarity=Pos|Reflex=Yes|VerbForm=Inf , Degree=Cmp|POS=ADV , Case=Gen|Gender=Fem|Number=Sing|POS=PROPN , Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin , 
Case=Gen|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem , Mood=Ind|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin , Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin , Case=Gen|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|PronType=Ind , Definite=Ind|NumForm=Digit|POS=NUM , Case=Gen|Gender=Masc|NumForm=Word|NumType=Card|Number=Plur|POS=NUM , Case=Nom|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Case=Acc|Gender=Masc|NumForm=Word|NumType=Card|Number=Sing|POS=NUM , Case=Dat|Definite=Ind|Number=Sing|POS=PRON|Person=1|PronType=Prs , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Tot , Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Pass , Case=Loc|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind , Case=Nom|Gender=Masc|NumForm=Word|NumType=Card|Number=Plur|POS=NUM , NumForm=Word|NumType=Card|POS=NUM , Case=Nom|Definite=Ind|Gender=Fem|Hyph=Yes|Number=Plur|POS=DET|PronType=Dem , Mood=Ind|Number=Sing|POS=VERB|Person=1|Polarity=Pos|Tense=Pres|VerbForm=Fin , Case=Nom|Definite=Def|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Int,Rel , Case=Acc|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem , Case=Dat|Gender=Masc|Number=Plur|POS=NOUN , Case=Nom|Gender=Fem|Number=Sing|POS=PROPN , Case=Nom|Definite=Ind|Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs , Hyph=Yes|POS=PART , Mood=Cnd|Number=Sing|POS=AUX|Person=3|Polarity=Pos|VerbForm=Fin , Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs , Case=Loc|Gender=Masc|Number=Sing|POS=NOUN , AdpType=Prep|Case=Acc|POS=ADP , Mood=Cnd|Number=Sing|POS=VERB|Person=3|Polarity=Pos|VerbForm=Fin , Case=Gen|Definite=Def|Gender=Fem|NumForm=Combi|NumType=Ord|Number=Sing|POS=NUM , Case=Nom|Definite=Def|Gender=Fem|NumForm=Word|NumType=Ord|Number=Sing|POS=NUM , 
Aspect=Perf|Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin , Case=Acc|Definite=Ind|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ , Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin , Definite=Ind|NumForm=Roman|POS=NUM , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs , Case=Gen|Definite=Ind|Degree=Pos|Gender=Masc|Number=Plur|POS=ADJ , Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Part|Voice=Act , Case=Gen|Definite=Ind|Gender=Fem|NumForm=Word|NumType=Ord|Number=Sing|POS=NUM , Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem , Case=Nom|Definite=Ind|Gender=Masc|Mood=Nec|Number=Sing|POS=VERB|Polarity=Pos|VerbForm=Part , Case=Nom|Definite=Ind|Degree=Cmp|Gender=Fem|Number=Plur|POS=ADJ , Aspect=Perf|Case=Acc|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Part|Voice=Act , Case=Dat|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs , Aspect=Perf|Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin , Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Act , Case=Gen|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ , `Case=Nom|Definite=Ind|Gender=Fem|Number... |
lt_core_news_lg-3.7.0
Checksum .tar.gz: 8fd6f9fd4b0fce21a8492882e945d8cac723011bba20cd301103734b4544979c
Checksum .whl: ee805642db51d8324a63ed313f2a146357521fb717fe207e818afa06b4e374f1
Details: https://spacy.io/models/lt#lt_core_news_lg
Lithuanian pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, parser, lemmatizer (trainable_lemmatizer), senter, ner.
Feature | Description |
---|---|
Name | lt_core_news_lg |
Version | 3.7.0 |
spaCy | >=3.7.0,<3.8.0 |
Default Pipeline | tok2vec , morphologizer , tagger , parser , lemmatizer , attribute_ruler , ner |
Components | tok2vec , morphologizer , tagger , parser , lemmatizer , senter , attribute_ruler , ner |
Vectors | 500000 keys, 500000 unique vectors (300 dimensions) |
Sources | UD Lithuanian ALKSNIS v2.8 (Utka, Andrius; Rimkutė, Erika; Bielinskienė, Agnė; Kovalevskaitė, Jolanta; Boizou, Loïc; Aleksandravičiūtė, Gabrielė; Brokaitė, Kristina; Zeman, Daniel; Perkova, Natalia; Griciūtė, Bernadeta) TokenMill NER Corpus (TokenMill) Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia) (Explosion) |
License | CC BY-SA 4.0 |
Author | Explosion |
Model size | 541 MB |
Label Scheme
View label scheme (1669 labels for 4 components)
Component | Labels |
---|---|
morphologizer |
Definite=Ind|Gender=Neut|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Pass , POS=VERB|Polarity=Pos|VerbForm=Inf , Case=Gen|Definite=Def|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ , Case=Gen|Gender=Fem|Number=Plur|POS=NOUN , Case=Gen|Gender=Masc|Number=Plur|POS=NOUN , Case=Acc|Gender=Masc|Number=Plur|POS=NOUN , POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Ger , Case=Gen|Gender=Masc|Number=Sing|POS=NOUN , POS=CCONJ , POS=PUNCT , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|PronType=Ind , Case=Acc|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ , Case=Acc|Gender=Fem|Number=Plur|POS=NOUN , Case=Loc|Gender=Fem|Number=Plur|POS=NOUN , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Act , Case=Acc|Gender=Masc|Number=Sing|POS=NOUN , Case=Acc|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind , Case=Acc|Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Dem , Case=Acc|Gender=Fem|Number=Sing|POS=NOUN , Aspect=Perf|Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin , Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Act , Case=Gen|Gender=Fem|Number=Sing|POS=NOUN , Case=Nom|Gender=Masc|Number=Sing|POS=NOUN , Abbr=Yes|POS=X , AdpType=Prep|Case=Gen|POS=ADP , Case=Gen|Gender=Masc|Number=Sing|POS=PROPN , Case=Nom|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ , Case=Nom|Gender=Fem|Number=Plur|POS=NOUN , Case=Ins|Gender=Masc|Number=Sing|POS=NOUN , Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Case=Acc|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Mood=Cnd|POS=VERB|Person=3|Polarity=Pos|VerbForm=Fin , Case=Gen|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind , Case=Nom|Gender=Masc|Number=Plur|POS=NOUN , Mood=Ind|Number=Plur|POS=AUX|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Degree=Pos|POS=ADV , 
Case=Nom|Definite=Ind|Degree=Pos|Gender=Masc|Number=Plur|POS=ADJ , Degree=Pos|Hyph=Yes|POS=ADV , Hyph=Yes|POS=X , Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs , POS=SCONJ , Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Case=Nom|Definite=Ind|POS=PRON|PronType=Ind , Case=Nom|Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Dem , Case=Nom|Gender=Fem|Number=Sing|POS=NOUN , Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Reflex=Yes|Tense=Pres|VerbForm=Fin , Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Case=Acc|Definite=Ind|POS=PRON|PronType=Ind , POS=PART , Case=Gen|Definite=Ind|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ , Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Reflex=Yes|Tense=Pres|VerbForm=Fin , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Dem , Case=Ins|Gender=Masc|NumForm=Word|NumType=Card|POS=NUM , Case=Ins|Gender=Masc|Number=Plur|POS=NOUN , Case=Ins|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Definite=Ind|Gender=Neut|POS=DET|PronType=Dem , Mood=Ind|POS=AUX|Person=3|Polarity=Pos|Tense=Pres|VerbForm=Fin , Definite=Ind|Degree=Pos|Gender=Neut|POS=ADJ , Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|PronType=Ind , Case=Nom|Gender=Masc|Number=Sing|POS=PROPN , Case=Loc|Definite=Ind|Gender=Fem|Number=Sing|POS=VERB|Polarity=Pos|Tense=Pres|VerbForm=Part|Voice=Pass , Case=Gen|Definite=Def|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Case=Loc|Gender=Fem|Number=Sing|POS=NOUN , Aspect=Perf|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Ger , Case=Dat|Gender=Masc|Number=Sing|POS=NOUN , Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin , Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin , POS=VERB|Polarity=Pos|Reflex=Yes|VerbForm=Inf , Degree=Cmp|POS=ADV , Case=Gen|Gender=Fem|Number=Sing|POS=PROPN , Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Pos|Tense=Fut|VerbForm=Fin , 
Case=Gen|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem , Mood=Ind|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin , Mood=Ind|Number=Sing|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin , Case=Gen|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|PronType=Ind , Definite=Ind|NumForm=Digit|POS=NUM , Case=Gen|Gender=Masc|NumForm=Word|NumType=Card|Number=Plur|POS=NUM , Case=Nom|Definite=Ind|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Case=Acc|Gender=Masc|NumForm=Word|NumType=Card|Number=Sing|POS=NUM , Case=Dat|Definite=Ind|Number=Sing|POS=PRON|Person=1|PronType=Prs , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=DET|PronType=Tot , Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Pass , Case=Loc|Definite=Ind|Gender=Fem|Number=Plur|POS=PRON|PronType=Ind , Case=Nom|Gender=Masc|NumForm=Word|NumType=Card|Number=Plur|POS=NUM , NumForm=Word|NumType=Card|POS=NUM , Case=Nom|Definite=Ind|Gender=Fem|Hyph=Yes|Number=Plur|POS=DET|PronType=Dem , Mood=Ind|Number=Sing|POS=VERB|Person=1|Polarity=Pos|Tense=Pres|VerbForm=Fin , Case=Nom|Definite=Def|Degree=Pos|Gender=Masc|Number=Sing|POS=ADJ , Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Int,Rel , Case=Acc|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem , Case=Dat|Gender=Masc|Number=Plur|POS=NOUN , Case=Nom|Gender=Fem|Number=Sing|POS=PROPN , Case=Nom|Definite=Ind|Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs , Hyph=Yes|POS=PART , Mood=Cnd|Number=Sing|POS=AUX|Person=3|Polarity=Pos|VerbForm=Fin , Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs , Case=Loc|Gender=Masc|Number=Sing|POS=NOUN , AdpType=Prep|Case=Acc|POS=ADP , Mood=Cnd|Number=Sing|POS=VERB|Person=3|Polarity=Pos|VerbForm=Fin , Case=Gen|Definite=Def|Gender=Fem|NumForm=Combi|NumType=Ord|Number=Sing|POS=NUM , Case=Nom|Definite=Def|Gender=Fem|NumForm=Word|NumType=Ord|Number=Sing|POS=NUM , 
Aspect=Perf|Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin , Case=Acc|Definite=Ind|Degree=Pos|Gender=Fem|Number=Sing|POS=ADJ , Mood=Ind|Number=Plur|POS=VERB|Person=3|Polarity=Neg|Tense=Pres|VerbForm=Fin , Definite=Ind|NumForm=Roman|POS=NUM , Case=Gen|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs , Case=Gen|Definite=Ind|Degree=Pos|Gender=Masc|Number=Plur|POS=ADJ , Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Part|Voice=Act , Case=Gen|Definite=Ind|Gender=Fem|NumForm=Word|NumType=Ord|Number=Sing|POS=NUM , Case=Nom|Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Dem , Case=Nom|Definite=Ind|Gender=Masc|Mood=Nec|Number=Sing|POS=VERB|Polarity=Pos|VerbForm=Part , Case=Nom|Definite=Ind|Degree=Cmp|Gender=Fem|Number=Plur|POS=ADJ , Aspect=Perf|Case=Acc|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Reflex=Yes|Tense=Past|VerbForm=Part|Voice=Act , Case=Dat|Definite=Ind|Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs , Aspect=Perf|Mood=Ind|POS=VERB|Person=3|Polarity=Pos|Tense=Past|VerbForm=Fin , Aspect=Perf|Case=Nom|Definite=Ind|Gender=Masc|Number=Plur|POS=VERB|Polarity=Pos|Tense=Past|VerbForm=Part|Voice=Act , Case=Gen|Definite=Ind|Degree=Pos|Gender=Fem|Number=Plur|POS=ADJ , `Case=Nom|Definite=Ind|Gender=Fem|Numb... |
ko_core_news_sm-3.7.0
Checksum .tar.gz: f0560bae70204fbe3d977ec98f9355a80e4b984b2ddf76bfd2f02137a3a6a19a
Checksum .whl: b1a15a4987a8f9835031a6bd2fe57fe158097ab5304221c41df1bd4aab8cf458
Details: https://spacy.io/models/ko#ko_core_news_sm
Korean pipeline optimized for CPU. Components: tok2vec, tagger, morphologizer, parser, lemmatizer (trainable_lemmatizer), senter, ner.
Feature | Description |
---|---|
Name | ko_core_news_sm |
Version | 3.7.0 |
spaCy | >=3.7.0,<3.8.0 |
Default Pipeline | tok2vec , tagger , morphologizer , parser , lemmatizer , attribute_ruler , ner |
Components | tok2vec , tagger , morphologizer , parser , lemmatizer , senter , attribute_ruler , ner |
Vectors | 0 keys, 0 unique vectors (0 dimensions) |
Sources | UD Korean Kaist v2.8 (Choi, Jinho; Han, Na-Rae; Hwang, Jena; Chun, Jayeol) KLUE v1.1.0 (Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Youngsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Ryu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho) |
License | CC BY-SA 4.0 |
Author | Explosion |
Model size | 13 MB |
Label Scheme
View label scheme (2028 labels for 4 components)
Component | Labels |
---|---|
tagger |
_SP , ecs , etm , f , f+f+jcj , f+f+jcs , f+f+jct , f+f+jxt , f+jca , f+jca+jp+ecc , f+jca+jp+ep+ef , f+jca+jxc , f+jca+jxc+jcm , f+jca+jxt , f+jcj , f+jcm , f+jco , f+jcs , f+jct , f+jct+jcm , f+jp+ef , f+jp+ep+ef , f+jp+etm , f+jxc , f+jxt , f+ncn , f+ncn+jcm , f+ncn+jcs , f+ncn+jp+ecc , f+ncn+jxt , f+ncpa+jcm , f+npp+jcs , f+nq , f+xsn , f+xsn+jco , f+xsn+jxt , ii , jca , jca+jcm , jca+jxc , jca+jxt , jcc , jcj , jcm , jco , jcr , jcr+jxc , jcs , jct , jct+jcm , jct+jxt , jp+ecc , jp+ecs , jp+ef , jp+ef+jcr , jp+ef+jcr+jxc , jp+ep+ecs , jp+ep+ef , jp+ep+etm , jp+ep+etn , jp+etm , jp+etn , jp+etn+jco , jp+etn+jxc , jxc , jxc+jca , jxc+jco , jxc+jcs , jxt , mad , mad+jxc , mad+jxt , mag , mag+jca , mag+jcm , mag+jcs , mag+jp+ef+jcr , mag+jxc , mag+jxc+jxc , mag+jxt , mag+xsn , maj , maj+jxc , maj+jxt , mma , mmd , nbn , nbn+jca , nbn+jca+jcj , nbn+jca+jcm , nbn+jca+jp+ef , nbn+jca+jxc , nbn+jca+jxt , nbn+jcc , nbn+jcj , nbn+jcm , nbn+jco , nbn+jcr , nbn+jcs , nbn+jct , nbn+jct+jcm , nbn+jct+jxt , nbn+jp+ecc , nbn+jp+ecs , nbn+jp+ecs+jca , nbn+jp+ecs+jcm , nbn+jp+ecs+jco , nbn+jp+ecs+jxc , nbn+jp+ecs+jxt , nbn+jp+ecx , nbn+jp+ef , nbn+jp+ef+jca , nbn+jp+ef+jco , nbn+jp+ef+jcr , nbn+jp+ef+jcr+jxc , nbn+jp+ef+jcr+jxt , nbn+jp+ef+jcs , nbn+jp+ef+jxc , nbn+jp+ef+jxc+jco , nbn+jp+ef+jxf , nbn+jp+ef+jxt , nbn+jp+ep+ecc , nbn+jp+ep+ecs , nbn+jp+ep+ecs+jxc , nbn+jp+ep+ef , nbn+jp+ep+ef+jcr , nbn+jp+ep+etm , nbn+jp+ep+etn , nbn+jp+ep+etn+jco , nbn+jp+ep+etn+jcs , nbn+jp+etm , nbn+jp+etn , nbn+jp+etn+jca , nbn+jp+etn+jca+jxt , nbn+jp+etn+jco , nbn+jp+etn+jcs , nbn+jp+etn+jxc , nbn+jp+etn+jxt , nbn+jxc , nbn+jxc+jca , nbn+jxc+jca+jxc , nbn+jxc+jca+jxt , nbn+jxc+jcc , nbn+jxc+jcm , nbn+jxc+jco , nbn+jxc+jcs , nbn+jxc+jp+ef , nbn+jxc+jxc , nbn+jxc+jxt , nbn+jxt , nbn+nbn , nbn+nbn+jp+ef , nbn+xsm+ecs , nbn+xsm+ef , nbn+xsm+ep+ef , nbn+xsm+ep+ef+jcr , nbn+xsm+etm , nbn+xsn , nbn+xsn+jca , nbn+xsn+jca+jp+ef+jcr , nbn+xsn+jca+jxc , nbn+xsn+jca+jxt , nbn+xsn+jcm , nbn+xsn+jco , 
nbn+xsn+jcs , nbn+xsn+jct , nbn+xsn+jp+ecc , nbn+xsn+jp+ecs , nbn+xsn+jp+ef , nbn+xsn+jp+ef+jcr , nbn+xsn+jp+ep+ef , nbn+xsn+jxc , nbn+xsn+jxt , nbn+xsv+etm , nbu , nbu+jca , nbu+jca+jxc , nbu+jca+jxt , nbu+jcc , nbu+jcc+jxc , nbu+jcj , nbu+jcm , nbu+jco , nbu+jcs , nbu+jct , nbu+jct+jxc , nbu+jp+ecc , nbu+jp+ecs , nbu+jp+ef , nbu+jp+ef+jcr , nbu+jp+ef+jxc , nbu+jp+ep+ecc , nbu+jp+ep+ecs , nbu+jp+ep+ef , nbu+jp+ep+ef+jcr , nbu+jp+ep+etm , nbu+jp+ep+etn+jco , nbu+jp+etm , nbu+jxc , nbu+jxc+jca , nbu+jxc+jcs , nbu+jxc+jp+ef , nbu+jxc+jp+ep+ef , nbu+jxc+jxt , nbu+jxt , nbu+ncn , nbu+ncn+jca , nbu+ncn+jcm , nbu+xsn , nbu+xsn+jca , nbu+xsn+jca+jxc , nbu+xsn+jca+jxt , nbu+xsn+jcm , nbu+xsn+jco , nbu+xsn+jcs , nbu+xsn+jp+ecs , nbu+xsn+jp+ep+ef , nbu+xsn+jxc , nbu+xsn+jxc+jxt , nbu+xsn+jxt , nbu+xsv+ecc , nbu+xsv+etm , ncn , ncn+f+ncpa+jco , ncn+jca , ncn+jca+jca , ncn+jca+jcc , ncn+jca+jcj , ncn+jca+jcm , ncn+jca+jcs , ncn+jca+jct , ncn+jca+jp+ecc , ncn+jca+jp+ecs , ncn+jca+jp+ef , ncn+jca+jp+ep+ef , ncn+jca+jp+etm , ncn+jca+jp+etn+jxt , ncn+jca+jxc , ncn+jca+jxc+jcc , ncn+jca+jxc+jcm , ncn+jca+jxc+jxc , ncn+jca+jxc+jxt , ncn+jca+jxt , ncn+jcc , ncn+jcc+jxc , ncn+jcj , ncn+jcj+jxt , ncn+jcm , ncn+jco , ncn+jcr , ncn+jcr+jxc , ncn+jcs , ncn+jcs+jxt , ncn+jct , ncn+jct+jcm , ncn+jct+jxc , ncn+jct+jxt , ncn+jcv , ncn+jp+ecc , ncn+jp+ecc+jct , ncn+jp+ecc+jxc , ncn+jp+ecs , ncn+jp+ecs+jcm , ncn+jp+ecs+jco , ncn+jp+ecs+jxc , ncn+jp+ecs+jxt , ncn+jp+ecx , ncn+jp+ef , ncn+jp+ef+jca , ncn+jp+ef+jcm , ncn+jp+ef+jco , ncn+jp+ef+jcr , ncn+jp+ef+jcr+jxc , ncn+jp+ef+jcr+jxt , ncn+jp+ef+jp+etm , ncn+jp+ef+jxc , ncn+jp+ef+jxf , ncn+jp+ef+jxt , ncn+jp+ep+ecc , ncn+jp+ep+ecs , ncn+jp+ep+ecs+jxc , ncn+jp+ep+ecx , ncn+jp+ep+ef , ncn+jp+ep+ef+jcr , ncn+jp+ep+ef+jcr+jxc , ncn+jp+ep+ef+jxc , ncn+jp+ep+ef+jxf , ncn+jp+ep+ef+jxt , ncn+jp+ep+ep+etm , ncn+jp+ep+etm , ncn+jp+ep+etn , ncn+jp+ep+etn+jca , ncn+jp+ep+etn+jca+jxc , ncn+jp+ep+etn+jco , ncn+jp+ep+etn+jcs , ncn+jp+ep+etn+jxt , ncn+jp+etm , 
ncn+jp+etn , ncn+jp+etn+jca , ncn+jp+etn+jca+jxc , ncn+jp+etn+jca+jxt , ncn+jp+etn+jco , ncn+jp+etn+jcs , ncn+jp+etn+jct , ncn+jp+etn+jxc , ncn+jp+etn+jxt , ncn+jxc , ncn+jxc+jca , ncn+jxc+jca+jxc , ncn+jxc+jca+jxt , ncn+jxc+jcc , ncn+jxc+jcm , ncn+jxc+jco , ncn+jxc+jcs , ncn+jxc+jct+jxt , ncn+jxc+jp+ef , ncn+jxc+jp+ef+jcr , ncn+jxc+jp+ep+ecs , ncn+jxc+jp+ep+ef , ncn+jxc+jp+etm , ncn+jxc+jxc , ncn+jxc+jxt , ncn+jxt , ncn+jxt+jcm , ncn+jxt+jxc , ncn+nbn , ncn+nbn+jca , ncn+nbn+jcm , ncn+nbn+jcs , ncn+nbn+jp+ecc , ncn+nbn+jp+ep+ef , ncn+nbn+jxc , ncn+nbn+jxt , ncn+nbu , ncn+nbu+jca , ncn+nbu+jcm , ncn+nbu+jco , ncn+nbu+jp+ef , ncn+nbu+jxc , ncn+nbu+ncn , ncn+ncn , ncn+ncn+jca , ncn+ncn+jca+jcc , ncn+ncn+jca+jcm , ncn+ncn+jca+jxc , ncn+ncn+jca+jxc+jcm , ncn+ncn+jca+jxc+jxc , ncn+ncn+jca+jxt , ncn+ncn+jcc , ncn+ncn+jcj , ncn+ncn+jcm , ncn+ncn+jco , ncn+ncn+jcr , ncn+ncn+jcs , ncn+ncn+jct , ncn+ncn+jct+jcm , ncn+ncn+jct+jxc , ncn+ncn+jct+jxt , ncn+ncn+jp+ecc , ncn+ncn+jp+ecs , ncn+ncn+jp+ef , ncn+ncn+jp+ef+jcm , ncn+ncn+jp+ef+jcr , ncn+ncn+jp+ef+jcs , ncn+ncn+jp+ep+ecc , ncn+ncn+jp+ep+ecs , ncn+ncn+jp+ep+ef , ncn+ncn+jp+ep+ef+jcr , ncn+ncn+jp+ep+ep+etm , ncn+ncn+jp+ep+etm , ncn+ncn+jp+ep+etn , ncn+ncn+jp+etm , ncn+ncn+jp+etn , ncn+ncn+jp+etn+jca , ncn+ncn+jp+etn+jco , ncn+ncn+jp+etn+jxc , ncn+ncn+jxc , ncn+ncn+jxc+jca , ncn+ncn+jxc+jcc , ncn+ncn+jxc+jcm , ncn+ncn+jxc+jco , ncn+ncn+jxc+jcs , ncn+ncn+jxc+jxc , ncn+ncn+jxt , ncn+ncn+nbn , ncn+ncn+ncn , ncn+ncn+ncn+jca , ncn+ncn+ncn+jca+jcm , ncn+ncn+ncn+jca+jxt , ncn+ncn+ncn+jcj , ncn+ncn+ncn+jcm , ncn+ncn+ncn+jco , ncn+ncn+ncn+jcs , ncn+ncn+ncn+jct+jxt , ncn+ncn+ncn+jp+etn+jxc , ncn+ncn+ncn+jxt , ncn+ncn+ncn+ncn+jca , ncn+ncn+ncn+ncn+jca+jxt , ncn+ncn+ncn+ncn+jco , ncn+ncn+ncn+xsn+jp+etm , ncn+ncn+ncpa , ncn+ncn+ncpa+jca , ncn+ncn+ncpa+jcm , ncn+ncn+ncpa+jco , ncn+ncn+ncpa+jcs , ncn+ncn+ncpa+jxc , ncn+ncn+ncpa+jxt , ncn+ncn+ncpa+ncn , ncn+ncn+ncpa+ncn+jca , ncn+ncn+ncpa+ncn+jcj , ncn+ncn+ncpa+ncn+jcm , 
ncn+ncn+ncpa+ncn+jxt , ncn+ncn+xsn , ncn+ncn+xsn+jca , ncn+ncn+xsn+jca+jxt , ncn+ncn+xsn+jcj , ncn+ncn+xsn+jcm , ncn+ncn+xsn+jco , ncn+ncn+xsn+jcs , ncn+ncn+xsn+jct , ncn+ncn+xsn+jp+ecs , ncn+ncn+xsn+jp+ep+ef , ncn+ncn+xsn+jp+etm , ncn+ncn+xsn+jxc , ncn+ncn+xsn+jxc+jcs , ncn+ncn+xsn+jxt , ncn+ncn+xsv+ecc , ncn+ncn+xsv+etm , ncn+ncpa , ncn+ncpa+jca , ncn+ncpa+jca+jcm , ncn+ncpa+jca+jxc , ncn+ncpa+jca+jxt , ncn+ncpa+jcc , ncn+ncpa+jcj , ncn+ncpa+jcm , ncn+ncpa+jco , ncn+ncpa+jcr , ncn+ncpa+jcs , ncn+ncpa+jct , ncn+ncpa+jct+jcm , ncn+ncpa+jct+jxt , ncn+ncpa+jp+ecc , ncn+ncpa+jp+ecc+jxc , ncn+ncpa+jp+ecs , ncn+ncpa+jp+ecs+jxc , ncn+ncpa+jp+ef , ncn+ncpa+jp+ef+jcr , ncn+ncpa+jp+ef+jcr+jxc , ncn+ncpa+jp+ep+ef , ncn+ncpa+jp+ep+etm , ncn+ncpa+jp+ep+etn , ncn+ncpa+jp+etm , ncn+ncpa+jxc , ncn+ncpa+jxc+jca+jxc , ncn+ncpa+jxc+jco , ncn+ncpa+jxc+jcs , ncn+ncpa+jxt , ncn+ncpa+nbn+jcs , ncn+ncpa+ncn , ncn+ncpa+ncn+jca , ncn+ncpa+ncn+jca+jcm , ncn+ncpa+ncn+jca+jxc , ncn+ncpa+ncn+jca+jxt , ncn+ncpa+ncn+jcj , ncn+ncpa+ncn+jcm , ncn+ncpa+ncn+jco , ncn+ncpa+ncn+jcs , ncn+ncpa+ncn+jct , ncn+ncpa+ncn+jct+jcm , ncn+ncpa+ncn+jp+ef+jcr , ncn+ncpa+ncn+jp+ep+etm , ncn+ncpa+ncn+jxc , ncn+ncpa+ncn+jxt , ncn+ncpa+ncn+xsn+jcm , ncn+ncpa+ncn+xsn+jxt , ncn+ncpa+ncpa , `ncn+ncpa+ncpa+jca... |
ko_core_news_md-3.7.0
Checksum .tar.gz:
35eb302d8b80d6a0d5cf4a33682f13773a920b0f800ae41b90f93054a66727aa
Checksum .whl:
97565293b1916eb20a47ec9bc96ee58aa8c334787d7a30b0efbea63ba6205165
Details: https://spacy.io/models/ko#ko_core_news_md
Korean pipeline optimized for CPU. Components: tok2vec, tagger, morphologizer, parser, lemmatizer (trainable_lemmatizer), senter, ner.
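The checksums listed above can be compared against a downloaded archive before installing it. The following is a minimal sketch using only the Python standard library. It assumes the published values are SHA-256 hex digests (they are 64 hex characters long), and the local file name is a placeholder rather than a path taken from this release.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with a published value, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Hypothetical local path: point it at the archive you actually downloaded.
    archive = Path("ko_core_news_md-3.7.0.tar.gz")
    # Value copied from the "Checksum .tar.gz" line of this release.
    expected = "35eb302d8b80d6a0d5cf4a33682f13773a920b0f800ae41b90f93054a66727aa"
    if archive.exists():
        print("checksum OK" if matches_checksum(archive, expected) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Reading in chunks keeps memory use flat, which matters for the larger archives on this page.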
Feature | Description |
---|---|
Name | ko_core_news_md |
Version | 3.7.0 |
spaCy | >=3.7.0,<3.8.0 |
Default Pipeline | tok2vec , tagger , morphologizer , parser , lemmatizer , attribute_ruler , ner |
Components | tok2vec , tagger , morphologizer , parser , lemmatizer , senter , attribute_ruler , ner |
Vectors | floret (50000, 300) |
Sources | UD Korean Kaist v2.8 (Choi, Jinho; Han, Na-Rae; Hwang, Jena; Chun, Jayeol) KLUE v1.1.0 (Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Youngsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Ryu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho) Explosion Vectors (OSCAR 2109 + Wikipedia + OpenSubtitles + WMT News Crawl) (Explosion) |
License | CC BY-SA 4.0 |
Author | Explosion |
Model size | 65 MB |
Label Scheme
View label scheme (2028 labels for 4 components)
Component | Labels |
---|---|
tagger |
_SP , ecs , etm , f , f+f+jcj , f+f+jcs , f+f+jct , f+f+jxt , f+jca , f+jca+jp+ecc , f+jca+jp+ep+ef , f+jca+jxc , f+jca+jxc+jcm , f+jca+jxt , f+jcj , f+jcm , f+jco , f+jcs , f+jct , f+jct+jcm , f+jp+ef , f+jp+ep+ef , f+jp+etm , f+jxc , f+jxt , f+ncn , f+ncn+jcm , f+ncn+jcs , f+ncn+jp+ecc , f+ncn+jxt , f+ncpa+jcm , f+npp+jcs , f+nq , f+xsn , f+xsn+jco , f+xsn+jxt , ii , jca , jca+jcm , jca+jxc , jca+jxt , jcc , jcj , jcm , jco , jcr , jcr+jxc , jcs , jct , jct+jcm , jct+jxt , jp+ecc , jp+ecs , jp+ef , jp+ef+jcr , jp+ef+jcr+jxc , jp+ep+ecs , jp+ep+ef , jp+ep+etm , jp+ep+etn , jp+etm , jp+etn , jp+etn+jco , jp+etn+jxc , jxc , jxc+jca , jxc+jco , jxc+jcs , jxt , mad , mad+jxc , mad+jxt , mag , mag+jca , mag+jcm , mag+jcs , mag+jp+ef+jcr , mag+jxc , mag+jxc+jxc , mag+jxt , mag+xsn , maj , maj+jxc , maj+jxt , mma , mmd , nbn , nbn+jca , nbn+jca+jcj , nbn+jca+jcm , nbn+jca+jp+ef , nbn+jca+jxc , nbn+jca+jxt , nbn+jcc , nbn+jcj , nbn+jcm , nbn+jco , nbn+jcr , nbn+jcs , nbn+jct , nbn+jct+jcm , nbn+jct+jxt , nbn+jp+ecc , nbn+jp+ecs , nbn+jp+ecs+jca , nbn+jp+ecs+jcm , nbn+jp+ecs+jco , nbn+jp+ecs+jxc , nbn+jp+ecs+jxt , nbn+jp+ecx , nbn+jp+ef , nbn+jp+ef+jca , nbn+jp+ef+jco , nbn+jp+ef+jcr , nbn+jp+ef+jcr+jxc , nbn+jp+ef+jcr+jxt , nbn+jp+ef+jcs , nbn+jp+ef+jxc , nbn+jp+ef+jxc+jco , nbn+jp+ef+jxf , nbn+jp+ef+jxt , nbn+jp+ep+ecc , nbn+jp+ep+ecs , nbn+jp+ep+ecs+jxc , nbn+jp+ep+ef , nbn+jp+ep+ef+jcr , nbn+jp+ep+etm , nbn+jp+ep+etn , nbn+jp+ep+etn+jco , nbn+jp+ep+etn+jcs , nbn+jp+etm , nbn+jp+etn , nbn+jp+etn+jca , nbn+jp+etn+jca+jxt , nbn+jp+etn+jco , nbn+jp+etn+jcs , nbn+jp+etn+jxc , nbn+jp+etn+jxt , nbn+jxc , nbn+jxc+jca , nbn+jxc+jca+jxc , nbn+jxc+jca+jxt , nbn+jxc+jcc , nbn+jxc+jcm , nbn+jxc+jco , nbn+jxc+jcs , nbn+jxc+jp+ef , nbn+jxc+jxc , nbn+jxc+jxt , nbn+jxt , nbn+nbn , nbn+nbn+jp+ef , nbn+xsm+ecs , nbn+xsm+ef , nbn+xsm+ep+ef , nbn+xsm+ep+ef+jcr , nbn+xsm+etm , nbn+xsn , nbn+xsn+jca , nbn+xsn+jca+jp+ef+jcr , nbn+xsn+jca+jxc , nbn+xsn+jca+jxt , nbn+xsn+jcm , nbn+xsn+jco , 
nbn+xsn+jcs , nbn+xsn+jct , nbn+xsn+jp+ecc , nbn+xsn+jp+ecs , nbn+xsn+jp+ef , nbn+xsn+jp+ef+jcr , nbn+xsn+jp+ep+ef , nbn+xsn+jxc , nbn+xsn+jxt , nbn+xsv+etm , nbu , nbu+jca , nbu+jca+jxc , nbu+jca+jxt , nbu+jcc , nbu+jcc+jxc , nbu+jcj , nbu+jcm , nbu+jco , nbu+jcs , nbu+jct , nbu+jct+jxc , nbu+jp+ecc , nbu+jp+ecs , nbu+jp+ef , nbu+jp+ef+jcr , nbu+jp+ef+jxc , nbu+jp+ep+ecc , nbu+jp+ep+ecs , nbu+jp+ep+ef , nbu+jp+ep+ef+jcr , nbu+jp+ep+etm , nbu+jp+ep+etn+jco , nbu+jp+etm , nbu+jxc , nbu+jxc+jca , nbu+jxc+jcs , nbu+jxc+jp+ef , nbu+jxc+jp+ep+ef , nbu+jxc+jxt , nbu+jxt , nbu+ncn , nbu+ncn+jca , nbu+ncn+jcm , nbu+xsn , nbu+xsn+jca , nbu+xsn+jca+jxc , nbu+xsn+jca+jxt , nbu+xsn+jcm , nbu+xsn+jco , nbu+xsn+jcs , nbu+xsn+jp+ecs , nbu+xsn+jp+ep+ef , nbu+xsn+jxc , nbu+xsn+jxc+jxt , nbu+xsn+jxt , nbu+xsv+ecc , nbu+xsv+etm , ncn , ncn+f+ncpa+jco , ncn+jca , ncn+jca+jca , ncn+jca+jcc , ncn+jca+jcj , ncn+jca+jcm , ncn+jca+jcs , ncn+jca+jct , ncn+jca+jp+ecc , ncn+jca+jp+ecs , ncn+jca+jp+ef , ncn+jca+jp+ep+ef , ncn+jca+jp+etm , ncn+jca+jp+etn+jxt , ncn+jca+jxc , ncn+jca+jxc+jcc , ncn+jca+jxc+jcm , ncn+jca+jxc+jxc , ncn+jca+jxc+jxt , ncn+jca+jxt , ncn+jcc , ncn+jcc+jxc , ncn+jcj , ncn+jcj+jxt , ncn+jcm , ncn+jco , ncn+jcr , ncn+jcr+jxc , ncn+jcs , ncn+jcs+jxt , ncn+jct , ncn+jct+jcm , ncn+jct+jxc , ncn+jct+jxt , ncn+jcv , ncn+jp+ecc , ncn+jp+ecc+jct , ncn+jp+ecc+jxc , ncn+jp+ecs , ncn+jp+ecs+jcm , ncn+jp+ecs+jco , ncn+jp+ecs+jxc , ncn+jp+ecs+jxt , ncn+jp+ecx , ncn+jp+ef , ncn+jp+ef+jca , ncn+jp+ef+jcm , ncn+jp+ef+jco , ncn+jp+ef+jcr , ncn+jp+ef+jcr+jxc , ncn+jp+ef+jcr+jxt , ncn+jp+ef+jp+etm , ncn+jp+ef+jxc , ncn+jp+ef+jxf , ncn+jp+ef+jxt , ncn+jp+ep+ecc , ncn+jp+ep+ecs , ncn+jp+ep+ecs+jxc , ncn+jp+ep+ecx , ncn+jp+ep+ef , ncn+jp+ep+ef+jcr , ncn+jp+ep+ef+jcr+jxc , ncn+jp+ep+ef+jxc , ncn+jp+ep+ef+jxf , ncn+jp+ep+ef+jxt , ncn+jp+ep+ep+etm , ncn+jp+ep+etm , ncn+jp+ep+etn , ncn+jp+ep+etn+jca , ncn+jp+ep+etn+jca+jxc , ncn+jp+ep+etn+jco , ncn+jp+ep+etn+jcs , ncn+jp+ep+etn+jxt , ncn+jp+etm , 
ncn+jp+etn , ncn+jp+etn+jca , ncn+jp+etn+jca+jxc , ncn+jp+etn+jca+jxt , ncn+jp+etn+jco , ncn+jp+etn+jcs , ncn+jp+etn+jct , ncn+jp+etn+jxc , ncn+jp+etn+jxt , ncn+jxc , ncn+jxc+jca , ncn+jxc+jca+jxc , ncn+jxc+jca+jxt , ncn+jxc+jcc , ncn+jxc+jcm , ncn+jxc+jco , ncn+jxc+jcs , ncn+jxc+jct+jxt , ncn+jxc+jp+ef , ncn+jxc+jp+ef+jcr , ncn+jxc+jp+ep+ecs , ncn+jxc+jp+ep+ef , ncn+jxc+jp+etm , ncn+jxc+jxc , ncn+jxc+jxt , ncn+jxt , ncn+jxt+jcm , ncn+jxt+jxc , ncn+nbn , ncn+nbn+jca , ncn+nbn+jcm , ncn+nbn+jcs , ncn+nbn+jp+ecc , ncn+nbn+jp+ep+ef , ncn+nbn+jxc , ncn+nbn+jxt , ncn+nbu , ncn+nbu+jca , ncn+nbu+jcm , ncn+nbu+jco , ncn+nbu+jp+ef , ncn+nbu+jxc , ncn+nbu+ncn , ncn+ncn , ncn+ncn+jca , ncn+ncn+jca+jcc , ncn+ncn+jca+jcm , ncn+ncn+jca+jxc , ncn+ncn+jca+jxc+jcm , ncn+ncn+jca+jxc+jxc , ncn+ncn+jca+jxt , ncn+ncn+jcc , ncn+ncn+jcj , ncn+ncn+jcm , ncn+ncn+jco , ncn+ncn+jcr , ncn+ncn+jcs , ncn+ncn+jct , ncn+ncn+jct+jcm , ncn+ncn+jct+jxc , ncn+ncn+jct+jxt , ncn+ncn+jp+ecc , ncn+ncn+jp+ecs , ncn+ncn+jp+ef , ncn+ncn+jp+ef+jcm , ncn+ncn+jp+ef+jcr , ncn+ncn+jp+ef+jcs , ncn+ncn+jp+ep+ecc , ncn+ncn+jp+ep+ecs , ncn+ncn+jp+ep+ef , ncn+ncn+jp+ep+ef+jcr , ncn+ncn+jp+ep+ep+etm , ncn+ncn+jp+ep+etm , ncn+ncn+jp+ep+etn , ncn+ncn+jp+etm , ncn+ncn+jp+etn , ncn+ncn+jp+etn+jca , ncn+ncn+jp+etn+jco , ncn+ncn+jp+etn+jxc , ncn+ncn+jxc , ncn+ncn+jxc+jca , ncn+ncn+jxc+jcc , ncn+ncn+jxc+jcm , ncn+ncn+jxc+jco , ncn+ncn+jxc+jcs , ncn+ncn+jxc+jxc , ncn+ncn+jxt , ncn+ncn+nbn , ncn+ncn+ncn , ncn+ncn+ncn+jca , ncn+ncn+ncn+jca+jcm , ncn+ncn+ncn+jca+jxt , ncn+ncn+ncn+jcj , ncn+ncn+ncn+jcm , ncn+ncn+ncn+jco , ncn+ncn+ncn+jcs , ncn+ncn+ncn+jct+jxt , ncn+ncn+ncn+jp+etn+jxc , ncn+ncn+ncn+jxt , ncn+ncn+ncn+ncn+jca , ncn+ncn+ncn+ncn+jca+jxt , ncn+ncn+ncn+ncn+jco , ncn+ncn+ncn+xsn+jp+etm , ncn+ncn+ncpa , ncn+ncn+ncpa+jca , ncn+ncn+ncpa+jcm , ncn+ncn+ncpa+jco , ncn+ncn+ncpa+jcs , ncn+ncn+ncpa+jxc , ncn+ncn+ncpa+jxt , ncn+ncn+ncpa+ncn , ncn+ncn+ncpa+ncn+jca , ncn+ncn+ncpa+ncn+jcj , ncn+ncn+ncpa+ncn+jcm , 
ncn+ncn+ncpa+ncn+jxt , ncn+ncn+xsn , ncn+ncn+xsn+jca , ncn+ncn+xsn+jca+jxt , ncn+ncn+xsn+jcj , ncn+ncn+xsn+jcm , ncn+ncn+xsn+jco , ncn+ncn+xsn+jcs , ncn+ncn+xsn+jct , ncn+ncn+xsn+jp+ecs , ncn+ncn+xsn+jp+ep+ef , ncn+ncn+xsn+jp+etm , ncn+ncn+xsn+jxc , ncn+ncn+xsn+jxc+jcs , ncn+ncn+xsn+jxt , ncn+ncn+xsv+ecc , ncn+ncn+xsv+etm , ncn+ncpa , ncn+ncpa+jca , ncn+ncpa+jca+jcm , ncn+ncpa+jca+jxc , ncn+ncpa+jca+jxt , ncn+ncpa+jcc , ncn+ncpa+jcj , ncn+ncpa+jcm , ncn+ncpa+jco , ncn+ncpa+jcr , ncn+ncpa+jcs , ncn+ncpa+jct , ncn+ncpa+jct+jcm , ncn+ncpa+jct+jxt , ncn+ncpa+jp+ecc , ncn+ncpa+jp+ecc+jxc , ncn+ncpa+jp+ecs , ncn+ncpa+jp+ecs+jxc , ncn+ncpa+jp+ef , ncn+ncpa+jp+ef+jcr , ncn+ncpa+jp+ef+jcr+jxc , ncn+ncpa+jp+ep+ef , ncn+ncpa+jp+ep+etm , ncn+ncpa+jp+ep+etn , ncn+ncpa+jp+etm , ncn+ncpa+jxc , ncn+ncpa+jxc+jca+jxc , ncn+ncpa+jxc+jco , ncn+ncpa+jxc+jcs , ncn+ncpa+jxt , ncn+ncpa+nbn+jcs , ncn+ncpa+ncn , ncn+ncpa+ncn+jca , ncn+ncpa+ncn+jca+jcm , ncn+ncpa+ncn+jca+jxc , ncn+ncpa+ncn+jca+jxt , ncn+ncpa+ncn+jcj , ncn+ncpa+ncn+jcm , ncn+ncpa+ncn+jco , ncn+ncpa+ncn+jcs , ncn+ncpa+ncn+jct , ncn+ncpa+ncn+jct+jcm , ncn+ncpa+ncn+jp+ef+jcr , `ncn+ncpa+ncn+jp+ep+et... |
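Most tagger labels above are composites joined with `+`, such as `ncn+jca+jxt`. spaCy's tagger writes the whole label to `token.tag_`, so splitting on `+` is one way to work with the individual parts. This is a small sketch over a handful of labels copied from the list. The helper name `split_tag` is hypothetical, and no meaning is assumed for the individual tags.

```python
from collections import Counter


def split_tag(label):
    """Split a composite tagger label such as 'ncn+jca+jxt' into its parts."""
    return label.split("+")


# A few labels copied from the tagger row above (not the full label set).
sample_labels = ["ncn", "ncn+jca", "ncn+jca+jxt", "nbn+jp+ef", "mag", "jp+etm"]

# Count how often each tag appears as the first part of a label.
leading_parts = Counter(split_tag(label)[0] for label in sample_labels)

if __name__ == "__main__":
    for label in sample_labels:
        print(label, "->", split_tag(label))
    print(leading_parts.most_common())
```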
ko_core_news_lg-3.7.0
Checksum .tar.gz:
2a3a2257342903b6a9edc658203010d2c26194083e0b13c2236ba3c2c39abc43
Checksum .whl:
125f607b91778c97bd5be65dbcb5b14c0c1231c1f8998d1a652ed396c03c6945
Details: https://spacy.io/models/ko#ko_core_news_lg
Korean pipeline optimized for CPU. Components: tok2vec, tagger, morphologizer, parser, lemmatizer (trainable_lemmatizer), senter, ner.
Feature | Description |
---|---|
Name | ko_core_news_lg |
Version | 3.7.0 |
spaCy | >=3.7.0,<3.8.0 |
Default Pipeline | tok2vec , tagger , morphologizer , parser , lemmatizer , attribute_ruler , ner |
Components | tok2vec , tagger , morphologizer , parser , lemmatizer , senter , attribute_ruler , ner |
Vectors | floret (200000, 300) |
Sources | UD Korean Kaist v2.8 (Choi, Jinho; Han, Na-Rae; Hwang, Jena; Chun, Jayeol) KLUE v1.1.0 (Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Youngsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Ryu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho) Explosion Vectors (OSCAR 2109 + Wikipedia + OpenSubtitles + WMT News Crawl) (Explosion) |
License | CC BY-SA 4.0 |
Author | Explosion |
Model size | 220 MB |
Label Scheme
View label scheme (2028 labels for 4 components)
Component | Labels |
---|---|
tagger |
_SP , ecs , etm , f , f+f+jcj , f+f+jcs , f+f+jct , f+f+jxt , f+jca , f+jca+jp+ecc , f+jca+jp+ep+ef , f+jca+jxc , f+jca+jxc+jcm , f+jca+jxt , f+jcj , f+jcm , f+jco , f+jcs , f+jct , f+jct+jcm , f+jp+ef , f+jp+ep+ef , f+jp+etm , f+jxc , f+jxt , f+ncn , f+ncn+jcm , f+ncn+jcs , f+ncn+jp+ecc , f+ncn+jxt , f+ncpa+jcm , f+npp+jcs , f+nq , f+xsn , f+xsn+jco , f+xsn+jxt , ii , jca , jca+jcm , jca+jxc , jca+jxt , jcc , jcj , jcm , jco , jcr , jcr+jxc , jcs , jct , jct+jcm , jct+jxt , jp+ecc , jp+ecs , jp+ef , jp+ef+jcr , jp+ef+jcr+jxc , jp+ep+ecs , jp+ep+ef , jp+ep+etm , jp+ep+etn , jp+etm , jp+etn , jp+etn+jco , jp+etn+jxc , jxc , jxc+jca , jxc+jco , jxc+jcs , jxt , mad , mad+jxc , mad+jxt , mag , mag+jca , mag+jcm , mag+jcs , mag+jp+ef+jcr , mag+jxc , mag+jxc+jxc , mag+jxt , mag+xsn , maj , maj+jxc , maj+jxt , mma , mmd , nbn , nbn+jca , nbn+jca+jcj , nbn+jca+jcm , nbn+jca+jp+ef , nbn+jca+jxc , nbn+jca+jxt , nbn+jcc , nbn+jcj , nbn+jcm , nbn+jco , nbn+jcr , nbn+jcs , nbn+jct , nbn+jct+jcm , nbn+jct+jxt , nbn+jp+ecc , nbn+jp+ecs , nbn+jp+ecs+jca , nbn+jp+ecs+jcm , nbn+jp+ecs+jco , nbn+jp+ecs+jxc , nbn+jp+ecs+jxt , nbn+jp+ecx , nbn+jp+ef , nbn+jp+ef+jca , nbn+jp+ef+jco , nbn+jp+ef+jcr , nbn+jp+ef+jcr+jxc , nbn+jp+ef+jcr+jxt , nbn+jp+ef+jcs , nbn+jp+ef+jxc , nbn+jp+ef+jxc+jco , nbn+jp+ef+jxf , nbn+jp+ef+jxt , nbn+jp+ep+ecc , nbn+jp+ep+ecs , nbn+jp+ep+ecs+jxc , nbn+jp+ep+ef , nbn+jp+ep+ef+jcr , nbn+jp+ep+etm , nbn+jp+ep+etn , nbn+jp+ep+etn+jco , nbn+jp+ep+etn+jcs , nbn+jp+etm , nbn+jp+etn , nbn+jp+etn+jca , nbn+jp+etn+jca+jxt , nbn+jp+etn+jco , nbn+jp+etn+jcs , nbn+jp+etn+jxc , nbn+jp+etn+jxt , nbn+jxc , nbn+jxc+jca , nbn+jxc+jca+jxc , nbn+jxc+jca+jxt , nbn+jxc+jcc , nbn+jxc+jcm , nbn+jxc+jco , nbn+jxc+jcs , nbn+jxc+jp+ef , nbn+jxc+jxc , nbn+jxc+jxt , nbn+jxt , nbn+nbn , nbn+nbn+jp+ef , nbn+xsm+ecs , nbn+xsm+ef , nbn+xsm+ep+ef , nbn+xsm+ep+ef+jcr , nbn+xsm+etm , nbn+xsn , nbn+xsn+jca , nbn+xsn+jca+jp+ef+jcr , nbn+xsn+jca+jxc , nbn+xsn+jca+jxt , nbn+xsn+jcm , nbn+xsn+jco , 
nbn+xsn+jcs , nbn+xsn+jct , nbn+xsn+jp+ecc , nbn+xsn+jp+ecs , nbn+xsn+jp+ef , nbn+xsn+jp+ef+jcr , nbn+xsn+jp+ep+ef , nbn+xsn+jxc , nbn+xsn+jxt , nbn+xsv+etm , nbu , nbu+jca , nbu+jca+jxc , nbu+jca+jxt , nbu+jcc , nbu+jcc+jxc , nbu+jcj , nbu+jcm , nbu+jco , nbu+jcs , nbu+jct , nbu+jct+jxc , nbu+jp+ecc , nbu+jp+ecs , nbu+jp+ef , nbu+jp+ef+jcr , nbu+jp+ef+jxc , nbu+jp+ep+ecc , nbu+jp+ep+ecs , nbu+jp+ep+ef , nbu+jp+ep+ef+jcr , nbu+jp+ep+etm , nbu+jp+ep+etn+jco , nbu+jp+etm , nbu+jxc , nbu+jxc+jca , nbu+jxc+jcs , nbu+jxc+jp+ef , nbu+jxc+jp+ep+ef , nbu+jxc+jxt , nbu+jxt , nbu+ncn , nbu+ncn+jca , nbu+ncn+jcm , nbu+xsn , nbu+xsn+jca , nbu+xsn+jca+jxc , nbu+xsn+jca+jxt , nbu+xsn+jcm , nbu+xsn+jco , nbu+xsn+jcs , nbu+xsn+jp+ecs , nbu+xsn+jp+ep+ef , nbu+xsn+jxc , nbu+xsn+jxc+jxt , nbu+xsn+jxt , nbu+xsv+ecc , nbu+xsv+etm , ncn , ncn+f+ncpa+jco , ncn+jca , ncn+jca+jca , ncn+jca+jcc , ncn+jca+jcj , ncn+jca+jcm , ncn+jca+jcs , ncn+jca+jct , ncn+jca+jp+ecc , ncn+jca+jp+ecs , ncn+jca+jp+ef , ncn+jca+jp+ep+ef , ncn+jca+jp+etm , ncn+jca+jp+etn+jxt , ncn+jca+jxc , ncn+jca+jxc+jcc , ncn+jca+jxc+jcm , ncn+jca+jxc+jxc , ncn+jca+jxc+jxt , ncn+jca+jxt , ncn+jcc , ncn+jcc+jxc , ncn+jcj , ncn+jcj+jxt , ncn+jcm , ncn+jco , ncn+jcr , ncn+jcr+jxc , ncn+jcs , ncn+jcs+jxt , ncn+jct , ncn+jct+jcm , ncn+jct+jxc , ncn+jct+jxt , ncn+jcv , ncn+jp+ecc , ncn+jp+ecc+jct , ncn+jp+ecc+jxc , ncn+jp+ecs , ncn+jp+ecs+jcm , ncn+jp+ecs+jco , ncn+jp+ecs+jxc , ncn+jp+ecs+jxt , ncn+jp+ecx , ncn+jp+ef , ncn+jp+ef+jca , ncn+jp+ef+jcm , ncn+jp+ef+jco , ncn+jp+ef+jcr , ncn+jp+ef+jcr+jxc , ncn+jp+ef+jcr+jxt , ncn+jp+ef+jp+etm , ncn+jp+ef+jxc , ncn+jp+ef+jxf , ncn+jp+ef+jxt , ncn+jp+ep+ecc , ncn+jp+ep+ecs , ncn+jp+ep+ecs+jxc , ncn+jp+ep+ecx , ncn+jp+ep+ef , ncn+jp+ep+ef+jcr , ncn+jp+ep+ef+jcr+jxc , ncn+jp+ep+ef+jxc , ncn+jp+ep+ef+jxf , ncn+jp+ep+ef+jxt , ncn+jp+ep+ep+etm , ncn+jp+ep+etm , ncn+jp+ep+etn , ncn+jp+ep+etn+jca , ncn+jp+ep+etn+jca+jxc , ncn+jp+ep+etn+jco , ncn+jp+ep+etn+jcs , ncn+jp+ep+etn+jxt , ncn+jp+etm , 
ncn+jp+etn , ncn+jp+etn+jca , ncn+jp+etn+jca+jxc , ncn+jp+etn+jca+jxt , ncn+jp+etn+jco , ncn+jp+etn+jcs , ncn+jp+etn+jct , ncn+jp+etn+jxc , ncn+jp+etn+jxt , ncn+jxc , ncn+jxc+jca , ncn+jxc+jca+jxc , ncn+jxc+jca+jxt , ncn+jxc+jcc , ncn+jxc+jcm , ncn+jxc+jco , ncn+jxc+jcs , ncn+jxc+jct+jxt , ncn+jxc+jp+ef , ncn+jxc+jp+ef+jcr , ncn+jxc+jp+ep+ecs , ncn+jxc+jp+ep+ef , ncn+jxc+jp+etm , ncn+jxc+jxc , ncn+jxc+jxt , ncn+jxt , ncn+jxt+jcm , ncn+jxt+jxc , ncn+nbn , ncn+nbn+jca , ncn+nbn+jcm , ncn+nbn+jcs , ncn+nbn+jp+ecc , ncn+nbn+jp+ep+ef , ncn+nbn+jxc , ncn+nbn+jxt , ncn+nbu , ncn+nbu+jca , ncn+nbu+jcm , ncn+nbu+jco , ncn+nbu+jp+ef , ncn+nbu+jxc , ncn+nbu+ncn , ncn+ncn , ncn+ncn+jca , ncn+ncn+jca+jcc , ncn+ncn+jca+jcm , ncn+ncn+jca+jxc , ncn+ncn+jca+jxc+jcm , ncn+ncn+jca+jxc+jxc , ncn+ncn+jca+jxt , ncn+ncn+jcc , ncn+ncn+jcj , ncn+ncn+jcm , ncn+ncn+jco , ncn+ncn+jcr , ncn+ncn+jcs , ncn+ncn+jct , ncn+ncn+jct+jcm , ncn+ncn+jct+jxc , ncn+ncn+jct+jxt , ncn+ncn+jp+ecc , ncn+ncn+jp+ecs , ncn+ncn+jp+ef , ncn+ncn+jp+ef+jcm , ncn+ncn+jp+ef+jcr , ncn+ncn+jp+ef+jcs , ncn+ncn+jp+ep+ecc , ncn+ncn+jp+ep+ecs , ncn+ncn+jp+ep+ef , ncn+ncn+jp+ep+ef+jcr , ncn+ncn+jp+ep+ep+etm , ncn+ncn+jp+ep+etm , ncn+ncn+jp+ep+etn , ncn+ncn+jp+etm , ncn+ncn+jp+etn , ncn+ncn+jp+etn+jca , ncn+ncn+jp+etn+jco , ncn+ncn+jp+etn+jxc , ncn+ncn+jxc , ncn+ncn+jxc+jca , ncn+ncn+jxc+jcc , ncn+ncn+jxc+jcm , ncn+ncn+jxc+jco , ncn+ncn+jxc+jcs , ncn+ncn+jxc+jxc , ncn+ncn+jxt , ncn+ncn+nbn , ncn+ncn+ncn , ncn+ncn+ncn+jca , ncn+ncn+ncn+jca+jcm , ncn+ncn+ncn+jca+jxt , ncn+ncn+ncn+jcj , ncn+ncn+ncn+jcm , ncn+ncn+ncn+jco , ncn+ncn+ncn+jcs , ncn+ncn+ncn+jct+jxt , ncn+ncn+ncn+jp+etn+jxc , ncn+ncn+ncn+jxt , ncn+ncn+ncn+ncn+jca , ncn+ncn+ncn+ncn+jca+jxt , ncn+ncn+ncn+ncn+jco , ncn+ncn+ncn+xsn+jp+etm , ncn+ncn+ncpa , ncn+ncn+ncpa+jca , ncn+ncn+ncpa+jcm , ncn+ncn+ncpa+jco , ncn+ncn+ncpa+jcs , ncn+ncn+ncpa+jxc , ncn+ncn+ncpa+jxt , ncn+ncn+ncpa+ncn , ncn+ncn+ncpa+ncn+jca , ncn+ncn+ncpa+ncn+jcj , ncn+ncn+ncpa+ncn+jcm , 
ncn+ncn+ncpa+ncn+jxt , ncn+ncn+xsn , ncn+ncn+xsn+jca , ncn+ncn+xsn+jca+jxt , ncn+ncn+xsn+jcj , ncn+ncn+xsn+jcm , ncn+ncn+xsn+jco , ncn+ncn+xsn+jcs , ncn+ncn+xsn+jct , ncn+ncn+xsn+jp+ecs , ncn+ncn+xsn+jp+ep+ef , ncn+ncn+xsn+jp+etm , ncn+ncn+xsn+jxc , ncn+ncn+xsn+jxc+jcs , ncn+ncn+xsn+jxt , ncn+ncn+xsv+ecc , ncn+ncn+xsv+etm , ncn+ncpa , ncn+ncpa+jca , ncn+ncpa+jca+jcm , ncn+ncpa+jca+jxc , ncn+ncpa+jca+jxt , ncn+ncpa+jcc , ncn+ncpa+jcj , ncn+ncpa+jcm , ncn+ncpa+jco , ncn+ncpa+jcr , ncn+ncpa+jcs , ncn+ncpa+jct , ncn+ncpa+jct+jcm , ncn+ncpa+jct+jxt , ncn+ncpa+jp+ecc , ncn+ncpa+jp+ecc+jxc , ncn+ncpa+jp+ecs , ncn+ncpa+jp+ecs+jxc , ncn+ncpa+jp+ef , ncn+ncpa+jp+ef+jcr , ncn+ncpa+jp+ef+jcr+jxc , ncn+ncpa+jp+ep+ef , ncn+ncpa+jp+ep+etm , ncn+ncpa+jp+ep+etn , ncn+ncpa+jp+etm , ncn+ncpa+jxc , ncn+ncpa+jxc+jca+jxc , ncn+ncpa+jxc+jco , ncn+ncpa+jxc+jcs , ncn+ncpa+jxt , ncn+ncpa+nbn+jcs , ncn+ncpa+ncn , ncn+ncpa+ncn+jca , ncn+ncpa+ncn+jca+jcm , ncn+ncpa+ncn+jca+jxc , ncn+ncpa+ncn+jca+jxt , ncn+ncpa+ncn+jcj , ncn+ncpa+ncn+jcm , ncn+ncpa+ncn+jco , ncn+ncpa+ncn+jcs , ncn+ncpa+ncn+jct , ncn+ncpa+ncn+jct+jcm , ncn+ncpa+ncn+jp+ef+jcr , `ncn+ncpa+ncn+jp+ep+... |
ja_core_news_trf-3.7.2
Checksum .tar.gz:
c278a19f126a705584206df5b25c773ffb45509fd0df11e38f86e34206a691f9
Checksum .whl:
85fb7bdb04bb7308ff8b728f6ecbceda198f1def857e6a922bd87ae089933d31
Details: https://spacy.io/models/ja#ja_core_news_trf
Japanese transformer pipeline (Transformer(name='cl-tohoku/bert-base-japanese-char-v2', piece_encoder='char', stride=160, type='bert', width=768, window=216, vocab_size=6144)). Components: transformer, morphologizer, parser, ner.
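The description above lists `window=216` and `stride=160`. These are read here as overlapping spans of up to 216 units, with a new span starting every 160 units, so that long inputs fit the transformer. The sketch below illustrates that reading only. The function name, the unit being counted, and the handling of the final span are assumptions, not the library's implementation.

```python
def strided_spans(length, window=216, stride=160):
    """Return (start, end) offsets of overlapping spans covering `length` units.

    Assumption: spans start every `stride` units and hold at most `window`
    units; the last span is clipped to the end of the input.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    spans = []
    start = 0
    while start < length:
        end = min(start + window, length)
        spans.append((start, end))
        if end == length:
            break  # the input is fully covered
        start += stride
    return spans


if __name__ == "__main__":
    for start, end in strided_spans(500):
        print(start, end)
```

With these values, consecutive spans overlap by 56 units (216 minus 160), so units near a span boundary are seen with context on both sides.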
Feature | Description |
---|---|
Name | ja_core_news_trf |
Version | 3.7.2 |
spaCy | >=3.7.0,<3.8.0 |
Default Pipeline | transformer , morphologizer , parser , attribute_ruler , ner |
Components | transformer , morphologizer , parser , attribute_ruler , ner |
Vectors | 0 keys, 0 unique vectors (0 dimensions) |
Sources | UD Japanese GSD v2.8 (Omura, Mai; Miyao, Yusuke; Kanayama, Hiroshi; Matsuda, Hiroshi; Wakasa, Aya; Yamashita, Kayo; Asahara, Masayuki; Tanaka, Takaaki; Murawaki, Yugo; Matsumoto, Yuji; Mori, Shinsuke; Uematsu, Sumire; McDonald, Ryan; Nivre, Joakim; Zeman, Daniel) UD Japanese GSD v2.8 NER (Megagon Labs Tokyo) cl-tohoku/bert-base-japanese-char-v2 (Inui Laboratory, Tohoku University) |
License | CC BY-SA 3.0 |
Author | Explosion |
Model size | 320 MB |
Label Scheme
View label scheme (64 labels for 3 components)
Component | Labels |
---|---|
morphologizer | POS=NOUN , POS=ADP , POS=VERB , POS=SCONJ , POS=AUX , POS=PUNCT , POS=PART , POS=DET , POS=NUM , POS=ADV , POS=PRON , POS=ADJ , POS=PROPN , POS=CCONJ , POS=SYM , POS=NOUN\|Polarity=Neg , POS=AUX\|Polarity=Neg , POS=INTJ , POS=SCONJ\|Polarity=Neg |
parser | ROOT , acl , advcl , advmod , amod , aux , case , cc , ccomp , compound , cop , csubj , dep , det , dislocated , fixed , mark , nmod , nsubj , nummod , obj , obl , punct |
ner | CARDINAL , DATE , EVENT , FAC , GPE , LANGUAGE , LAW , LOC , MONEY , MOVEMENT , NORP , ORDINAL , ORG , PERCENT , PERSON , PET_NAME , PHONE , PRODUCT , QUANTITY , TIME , TITLE_AFFIX , WORK_OF_ART |
Accuracy
Type | Score |
---|---|
TOKEN_ACC | 99.37 |
TOKEN_P | 97.64 |
TOKEN_R | 97.88 |
TOKEN_F | 97.76 |
POS_ACC | 97.94 |
MORPH_ACC | 0.00 |
MORPH_MICRO_P | 34.01 |
MORPH_MICRO_R | 98.04 |
MORPH_MICRO_F | 50.51 |
SENTS_P | 93.18 |
SENTS_R | 97.04 |
SENTS_F | 95.07 |
DEP_UAS | 93.05 |
DEP_LAS | 91.78 |
TAG_ACC | 97.13 |
LEMMA_ACC | 96.70 |
ENTS_P | 82.27 |
ENTS_R | 84.65 |
ENTS_F | 83.45 |
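The `*_F` rows above are consistent with the usual F1 definition, the harmonic mean of the matching `*_P` and `*_R` rows. The sketch below recomputes them from the published precision and recall values. It assumes that definition, and because the inputs are already rounded to two decimals, the result can differ from the published score in the last digit.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, published F) copied from the ja_core_news_trf table above.
published = {
    "TOKEN": (97.64, 97.88, 97.76),
    "SENTS": (93.18, 97.04, 95.07),
    "ENTS": (82.27, 84.65, 83.45),
}

if __name__ == "__main__":
    for name, (p, r, f) in published.items():
        print(f"{name}: recomputed {f_score(p, r):.2f}, published {f:.2f}")
```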
Installation
pip install spacy
python -m spacy download ja_core_news_trf
ja_core_news_sm-3.7.0
Checksum .tar.gz:
5cb1c87cd0551404a03fd630e824dfcb32e793f2ae5331a196d6e346749bdb2d
Checksum .whl:
1191e5bbffcc90670146616c274a64850e54d12070bc5846e78a094f2f6fcfca
Details: https://spacy.io/models/ja#ja_core_news_sm
Japanese pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler.
Feature | Description |
---|---|
Name | ja_core_news_sm |
Version | 3.7.0 |
spaCy | >=3.7.0,<3.8.0 |
Default Pipeline | tok2vec , morphologizer , parser , attribute_ruler , ner |
Components | tok2vec , morphologizer , parser , senter , attribute_ruler , ner |
Vectors | 0 keys, 0 unique vectors (0 dimensions) |
Sources | UD Japanese GSD v2.8 (Omura, Mai; Miyao, Yusuke; Kanayama, Hiroshi; Matsuda, Hiroshi; Wakasa, Aya; Yamashita, Kayo; Asahara, Masayuki; Tanaka, Takaaki; Murawaki, Yugo; Matsumoto, Yuji; Mori, Shinsuke; Uematsu, Sumire; McDonald, Ryan; Nivre, Joakim; Zeman, Daniel) UD Japanese GSD v2.8 NER (Megagon Labs Tokyo) |
License | CC BY-SA 4.0 |
Author | Explosion |
Model size | 11 MB |
Label Scheme
View label scheme (65 labels for 3 components)
Component | Labels |
---|---|
morphologizer | POS=NOUN , POS=ADP , POS=VERB , POS=SCONJ , POS=AUX , POS=PUNCT , POS=PART , POS=DET , POS=NUM , POS=ADV , POS=PRON , POS=ADJ , POS=PROPN , POS=CCONJ , POS=SYM , POS=NOUN\|Polarity=Neg , POS=AUX\|Polarity=Neg , POS=SPACE , POS=INTJ , POS=SCONJ\|Polarity=Neg |
parser | ROOT , acl , advcl , advmod , amod , aux , case , cc , ccomp , compound , cop , csubj , dep , det , dislocated , fixed , mark , nmod , nsubj , nummod , obj , obl , punct |
ner | CARDINAL , DATE , EVENT , FAC , GPE , LANGUAGE , LAW , LOC , MONEY , MOVEMENT , NORP , ORDINAL , ORG , PERCENT , PERSON , PET_NAME , PHONE , PRODUCT , QUANTITY , TIME , TITLE_AFFIX , WORK_OF_ART |
Accuracy
Type | Score |
---|---|
TOKEN_ACC | 99.37 |
TOKEN_P | 97.64 |
TOKEN_R | 97.88 |
TOKEN_F | 97.76 |
POS_ACC | 96.13 |
MORPH_ACC | 0.00 |
MORPH_MICRO_P | 34.01 |
MORPH_MICRO_R | 98.04 |
MORPH_MICRO_F | 50.51 |
SENTS_P | 98.04 |
SENTS_R | 98.62 |
SENTS_F | 98.33 |
DEP_UAS | 91.95 |
DEP_LAS | 90.48 |
TAG_ACC | 97.13 |
LEMMA_ACC | 96.70 |
ENTS_P | 71.09 |
ENTS_R | 57.23 |
ENTS_F | 63.41 |
Installation
pip install spacy
python -m spacy download ja_core_news_sm
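Once the package is installed, the pipeline can be loaded by name with `spacy.load`. The following is a minimal usage sketch. The helper names `load_pipeline` and `describe` and the sample sentence are illustrative and not part of the release. The helper returns `None` instead of raising when spaCy or the package is missing.

```python
def load_pipeline(name="ja_core_news_sm"):
    """Return the loaded pipeline, or None if spaCy or the package is not installed."""
    try:
        import spacy
    except ImportError:
        return None
    try:
        return spacy.load(name)
    except OSError:
        # spacy.load raises OSError when the named package cannot be found.
        return None


def describe(nlp, text):
    """Run the pipeline and collect per-token annotations and named entities."""
    doc = nlp(text)
    tokens = [(token.text, token.pos_, token.dep_) for token in doc]
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return tokens, entities


if __name__ == "__main__":
    nlp = load_pipeline()
    if nlp is None:
        print("Pipeline not available; run the installation commands first.")
    else:
        print(nlp.pipe_names)
        # Illustrative sentence ("Tokyo is the capital of Japan.").
        tokens, entities = describe(nlp, "東京は日本の首都です。")
        for row in tokens:
            print(row)
        print(entities)
```

The `pos_`, `dep_`, and `label_` values should fall within the morphologizer, parser, and ner label sets listed in the Label Scheme table above.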