fasttext-wiki-news-subwords-300

@menshikh-iv released this 16 Mar 12:50 · 1 commit to master since this release

1 million pre-trained fastText word vectors, trained on Wikipedia 2017, the UMBC webbase corpus, and the statmt.org news dataset (16B tokens in total).

| Feature | Description |
|---|---|
| File size | 959MB |
| Number of vectors | 999999 |
| Dimension | 300 |
| License | https://creativecommons.org/licenses/by-sa/3.0/ |
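
The vector count and dimension listed above give a rough idea of how much memory the loaded vectors need. The sketch below is only an estimate: it assumes each value is stored as a 32-bit float (an assumption, not something stated in this release) and it ignores the vocabulary and any other overhead. The function name `matrix_bytes` is made up for illustration.

```python
# Rough in-memory size of the raw vector matrix.
NUM_VECTORS = 999_999   # "Number of vectors" from the release description
DIMENSION = 300         # "Dimension" from the release description
BYTES_PER_FLOAT32 = 4   # assumption: values are held as 32-bit floats


def matrix_bytes(num_vectors, dimension, bytes_per_value=BYTES_PER_FLOAT32):
    """Return the size in bytes of a dense num_vectors x dimension matrix."""
    return num_vectors * dimension * bytes_per_value


size = matrix_bytes(NUM_VECTORS, DIMENSION)
print(f"{size} bytes, about {size / 1024**3:.2f} GiB")
```

Under that assumption the matrix alone is larger than the 959MB download, so it is worth checking available RAM before loading.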

Read more:

Example

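import gensim.downloader as api

# Download the vectors on first use (about 959MB) and load them.
model = api.load("fasttext-wiki-news-subwords-300")
# Find the words closest to the combination of "russia" and "river".
model.most_similar(positive=["russia", "river"])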

"""
Output:

[(u'russias', 0.6939424276351929),
 (u'danube', 0.6881916522979736),
 (u'river.', 0.6683923006057739),
 (u'crimea', 0.6638611555099487),
 (u'rhine', 0.6632323861122131),
 (u'rivermouth', 0.6602864265441895),
 (u'wester', 0.6586191058158875),
 (u'finland', 0.6585439443588257),
 (u'volga', 0.6576792001724243),
 (u'ukraine', 0.6569074392318726)]

"""
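
To make the query above less opaque, here is a minimal sketch of the idea behind `most_similar(positive=[...])` as commonly understood: average the unit-normalized vectors of the query words, then rank the other words by cosine similarity to that average. This is not gensim's implementation. The 3-dimensional `toy_vectors` and the helper names `unit` and `toy_most_similar` are hypothetical, chosen so the example runs without downloading the model.

```python
import numpy as np

# Hypothetical 3-dimensional toy vectors; the real model has 300 dimensions.
toy_vectors = {
    "russia": np.array([0.9, 0.1, 0.0]),
    "river": np.array([0.1, 0.9, 0.0]),
    "volga": np.array([0.6, 0.7, 0.1]),
    "banana": np.array([0.0, 0.1, 0.9]),
}


def unit(v):
    """Scale a vector to length 1."""
    return v / np.linalg.norm(v)


def toy_most_similar(vectors, positive, topn=10):
    """Rank words by cosine similarity to the mean of the query vectors."""
    # Average the unit-normalized query vectors, then renormalize.
    query = unit(np.mean([unit(vectors[w]) for w in positive], axis=0))
    scores = [
        (word, float(np.dot(query, unit(vec))))
        for word, vec in vectors.items()
        if word not in positive  # query words are left out of the results
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:topn]


result = toy_most_similar(toy_vectors, positive=["russia", "river"])
print(result)
```

In this toy setup "volga" ranks first because its vector points between "russia" and "river", which mirrors why river names and neighbouring countries appear in the real output.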