Building word2vec

The instructions below describe how to build word2vec version 0.1c on Linux on IBM Z for the following distributions:

RHEL (7.8, 7.9, 8.4, 8.6, 9.0)
SLES (12 SP5, 15 SP3, 15 SP4)
Ubuntu (18.04, 20.04, 22.04)

General notes:

When following the steps below, use a standard-permission user unless otherwise specified.
These instructions refer to a directory /<source_root>/, a temporary writable directory that you can place anywhere. The commands below reference it as $SOURCE_ROOT, so set that variable to the directory's path before you begin.
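
One way to prepare this directory is sketched below. The location under the temporary directory and the name word2vec-build are hypothetical choices for illustration, not part of the original instructions; any writable path works.

```shell
# Hypothetical location; substitute any writable directory you prefer.
# An existing SOURCE_ROOT value is kept as-is.
export SOURCE_ROOT="${SOURCE_ROOT:-${TMPDIR:-/tmp}/word2vec-build}"

# Create the directory if it does not exist yet.
mkdir -p "$SOURCE_ROOT"
echo "Using SOURCE_ROOT=$SOURCE_ROOT"
```

Exporting the variable keeps it available to the later steps as long as they run in the same shell session.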

Build word2vec

Install standard utilities, packages and platform specific dependencies

RHEL (7.8, 7.9, 8.4, 8.6, 9.0)

 sudo yum install -y gcc make wget tar unzip

SLES (12 SP5, 15 SP3, 15 SP4)

 sudo zypper install -y gcc make wget tar unzip

Ubuntu (18.04, 20.04, 22.04)

 sudo apt-get update
 sudo apt-get install -y gcc make wget tar unzip
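
After installing the packages, it can help to confirm that each tool is actually on the PATH before continuing. The following is a minimal sketch; the helper name missing_tools is hypothetical and not part of any distribution.

```shell
# Print the name of every listed tool that cannot be found on PATH.
# Usage: missing_tools NAME...
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the tools installed in the step above.
missing=$(missing_tools gcc make wget tar unzip)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All build tools found"
fi
```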

Create a working directory and download word2vec source code

 cd $SOURCE_ROOT
 wget https://storage.googleapis.com/google-code-archive-source/v2/code.google.com/word2vec/source-archive.zip
 unzip source-archive.zip

Build word2vec

 cd word2vec/trunk
 make CFLAGS="-lm -pthread -O3 -Wall -funroll-loops"
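
To confirm that the build produced its binaries, a check along the following lines may be useful. This is a sketch under assumptions: the helper missing_binaries is hypothetical, word2vec is the binary named in this guide, and distance is assumed to be one of the other programs the makefile builds.

```shell
# Print each expected binary that is missing or not executable in DIR.
# Usage: missing_binaries DIR NAME...
missing_binaries() {
  dir=$1
  shift
  for name in "$@"; do
    [ -x "$dir/$name" ] || printf '%s\n' "$name"
  done
}

# Assumes SOURCE_ROOT was set earlier; falls back to the current directory.
build_dir="${SOURCE_ROOT:-.}/word2vec/trunk"

# "distance" is an assumption about the makefile's outputs; adjust as needed.
missing=$(missing_binaries "$build_dir" word2vec distance)
if [ -n "$missing" ]; then
  echo "Not built in $build_dir:" $missing
else
  echo "Build outputs found in $build_dir"
fi
```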

Set environment variables

 export PATH=$PATH:$SOURCE_ROOT/word2vec/trunk
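
If this export is placed in a shell profile that may be sourced more than once, the same directory can end up on the PATH repeatedly. The sketch below appends the directory only when it is not already listed; the helper name append_path is hypothetical.

```shell
# Append DIR to PATH only if it is not already listed.
# Usage: append_path DIR
append_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                         # already present, nothing to do
    *) PATH="$PATH:$1"; export PATH ;;   # otherwise add it at the end
  esac
}

# Only modify PATH when SOURCE_ROOT has been set as described earlier.
if [ -n "${SOURCE_ROOT:-}" ]; then
  append_path "$SOURCE_ROOT/word2vec/trunk"
fi

# Report whether the binary can now be resolved.
if command -v word2vec >/dev/null 2>&1; then
  echo "word2vec found at $(command -v word2vec)"
else
  echo "word2vec not found on PATH"
fi
```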

Test word2vec using demo scripts
```
 ./demo-word.sh
 ./demo-phrases.sh
```
Note: When a demo script finishes training and prompts for input, enter a word (e.g. france) to see the words closest to it.

Run word2vec binary
```
 word2vec
```
Note: The word2vec tool takes a text corpus as input and produces the word vectors as output.
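
A training run could look like the sketch below. The file names corpus.txt and vectors.txt are hypothetical, and the parameter values are illustrative rather than recommended. The option names are assumed to match those listed when word2vec is run without arguments; check that list for your build.

```shell
# Hypothetical file names; replace with your own corpus and output paths.
CORPUS="${CORPUS:-corpus.txt}"
VECTORS="${VECTORS:-vectors.txt}"

if command -v word2vec >/dev/null 2>&1 && [ -f "$CORPUS" ]; then
  # -train:   input text corpus
  # -output:  file that receives the word vectors
  # -size:    vector dimensionality; -window: context window size
  # -cbow 1:  use the continuous bag-of-words model
  # -binary 0: write the vectors as text
  word2vec -train "$CORPUS" -output "$VECTORS" \
    -size 100 -window 5 -cbow 1 -binary 0 -threads 4
else
  echo "word2vec or $CORPUS not found; skipping training"
fi
```

The guard keeps the snippet harmless when the binary is not yet on the PATH or the corpus file does not exist.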

References:

https://code.google.com/archive/p/word2vec/

The information provided in this article is accurate at the time of writing, but ongoing development in the open-source projects involved may make the information incorrect or obsolete. Please open an issue or contact us on IBM Z Community if you have any questions or feedback.
