Replace README and allreduce test code.
Showing 3 changed files with 643 additions and 49 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,66 +1,95 @@ | ||
<div align="center"> | ||
<img src="https://www.tensorflow.org/images/tf_logo_transp.png"><br><br> | ||
<img src="https://www.tensorflow.org/images/tf_logo_transp.png"> | ||
<img src="https://www.open-mpi.org/images/open-mpi-logo.png"> | ||
<br><br> | ||
</div> | ||
----------------- | ||
|
||
| **`Linux CPU`** | **`Linux GPU PIP`** | **`Mac OS CPU`** | **`Windows CPU`** | **`Android`** | | ||
|-----------------|---------------------|------------------|-------------------|---------------| | ||
| [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-cpu)](https://ci.tensorflow.org/job/tensorflow-master-cpu) | [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-gpu_pip)](https://ci.tensorflow.org/job/tensorflow-master-gpu_pip) | [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-mac)](https://ci.tensorflow.org/job/tensorflow-master-mac) | [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-win-cmake-py)](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-win-cmake-py) | [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-android)](https://ci.tensorflow.org/job/tensorflow-master-android) | | ||
# TensorFlow with MPI | ||
|
||
**TensorFlow** is an open source software library for numerical computation using | ||
data flow graphs. Nodes in the graph represent mathematical operations, while | ||
the graph edges represent the multidimensional data arrays (tensors) that flow | ||
between them. This flexible architecture lets you deploy computation to one | ||
or more CPUs or GPUs in a desktop, server, or mobile device without rewriting | ||
code. TensorFlow also includes TensorBoard, a data visualization toolkit. | ||
This repository contains a patched version of TensorFlow 0.12.1 which includes | ||
the `tensorflow.contrib.mpi` namespace with MPI operations, including a | ||
potentially CUDA-aware ring allreduce. | ||
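
As a rough illustration of what a ring allreduce computes, here is a minimal single-process sketch in plain Python. It is not the implementation in this patch: the function name `ring_allreduce` is hypothetical, the "ranks" are just lists, and MPI sends are simulated by copying chunks between those lists. It shows the usual two phases (scatter-reduce, then allgather) for a sum reduction.

```python
def ring_allreduce(buffers):
    """Simulate a sum ring allreduce over equal-length per-rank lists.

    `buffers[r]` is the data held by simulated rank r. Returns a new list
    of per-rank buffers, each holding the elementwise sum of all inputs.
    """
    n = len(buffers)
    length = len(buffers[0])
    # Split the index range into n contiguous chunks (some may be empty).
    bounds = [(i * length) // n for i in range(n + 1)]
    data = [list(b) for b in buffers]

    def chunk(c):
        return range(bounds[c], bounds[c + 1])

    # Phase 1: scatter-reduce. At each step, rank r sends chunk (r - step) % n
    # to its right neighbour, which adds it to its own copy of that chunk.
    for step in range(n - 1):
        # Snapshot payloads first so all transfers in a step are simultaneous.
        sends = []
        for r in range(n):
            c = (r - step) % n
            sends.append((c, [data[r][i] for i in chunk(c)]))
        for r in range(n):
            c, payload = sends[r]
            dst = (r + 1) % n
            for i, value in zip(chunk(c), payload):
                data[dst][i] += value

    # Now rank r holds the fully reduced chunk (r + 1) % n.
    # Phase 2: allgather. Each rank forwards the reduced chunk it most
    # recently completed or received; the neighbour overwrites its copy.
    for step in range(n - 1):
        sends = []
        for r in range(n):
            c = (r + 1 - step) % n
            sends.append((c, [data[r][i] for i in chunk(c)]))
        for r in range(n):
            c, payload = sends[r]
            dst = (r + 1) % n
            for i, value in zip(chunk(c), payload):
                data[dst][i] = value

    return data


if __name__ == "__main__":
    ranks = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50], [100, 200, 300, 400, 500]]
    for rank, result in enumerate(ring_allreduce(ranks)):
        print(rank, result)
```

In this scheme each rank exchanges data only with its two neighbours, and every step moves one chunk per rank, which is why the pattern is commonly used for bandwidth-bound gradient reductions.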
|
||
TensorFlow was originally developed by researchers and engineers | ||
working on the Google Brain team within Google's Machine Intelligence research | ||
organization for the purposes of conducting machine learning and deep neural | ||
networks research. The system is general enough to be applicable in a wide | ||
variety of other domains, as well. | ||
## Installation | ||
|
||
**If you'd like to contribute to TensorFlow, be sure to review the [contribution | ||
guidelines](CONTRIBUTING.md).** | ||
Using this patch requires building TensorFlow from source against a CUDA-aware MPI | ||
of your choice. It has been tested with [OpenMPI](https://www.open-mpi.org/) | ||
integrated with [SLURM](https://slurm.schedmd.com/). | ||
|
||
**We use [GitHub issues](https://github.com/tensorflow/tensorflow/issues) for | ||
tracking requests and bugs, but please see | ||
[Community](tensorflow/g3doc/resources/index.md#community) for general questions | ||
and discussion.** | ||
Install by following the [TensorFlow source installation instructions](https://www.tensorflow.org/install/install_sources). | ||
When you run `configure`, you will be asked whether to build TensorFlow with | ||
MPI and, if so, for the path to your MPI installation. | ||
|
||
## Installation | ||
*See [Download and Setup](tensorflow/g3doc/get_started/os_setup.md) for instructions on how to install our release binaries or how to build from source.* | ||
Although it has only been tested with SLURM-integrated OpenMPI, it should also | ||
work with any other CUDA-aware MPI implementation. | ||
|
||
## Usage | ||
|
||
People who are a little more adventurous can also try our nightly binaries: | ||
The auto-generated documentation for TensorFlow includes usage examples. In | ||
addition, we include a TensorFlow language model that we use for benchmarking | ||
the allreduce in a real-world situation. Before running the language model | ||
training, run `pip install -r allreduce-requirements.txt` to install | ||
all Python dependencies. | ||
|
||
* Linux CPU-only: [Python 2](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.12.1-cp27-none-linux_x86_64.whl) ([build history](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave)) / [Python 3.4](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.12.1-cp34-cp34m-linux_x86_64.whl) ([build history](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=cpu-slave/)) / [Python 3.5](https://ci.tensorflow.org/view/Nightly/job/nightly-python35-linux-cpu/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.12.1-cp35-cp35m-linux_x86_64.whl) ([build history](https://ci.tensorflow.org/view/Nightly/job/nightly-python35-linux-cpu/)) | ||
* Linux GPU: [Python 2](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-linux-gpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=gpu-linux/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow_gpu-0.12.1-cp27-none-linux_x86_64.whl) ([build history](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-linux-gpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=gpu-linux/)) / [Python 3.4](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-linux-gpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=gpu-linux/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow_gpu-0.12.1-cp34-cp34m-linux_x86_64.whl) ([build history](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-linux-gpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=gpu-linux/)) / [Python 3.5](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-linux-gpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3.5,label=gpu-linux/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow_gpu-0.12.1-cp35-cp35m-linux_x86_64.whl) ([build history](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-linux-gpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3.5,label=gpu-linux/)) | ||
* Mac CPU-only: [Python 2](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=mac-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.12.1-py2-none-any.whl) ([build history](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=mac-slave/)) / [Python 3](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=mac-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.12.1-py3-none-any.whl) ([build history](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=mac-slave/)) | ||
* Mac GPU: [Python 2](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-mac-gpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=gpu-mac/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow_gpu-0.12.1-py2-none-any.whl) ([build history](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-mac-gpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=gpu-mac/)) / [Python 3](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-mac-gpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=gpu-mac/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow_gpu-0.12.1-py3-none-any.whl) ([build history](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-mac-gpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=gpu-mac/)) | ||
* [Android](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-android/TF_BUILD_CONTAINER_TYPE=ANDROID,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=NO_PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=android-slave/lastSuccessfulBuild/artifact/bazel-out/local_linux/bin/tensorflow/examples/android/tensorflow_demo.apk) ([build history](https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-android/TF_BUILD_CONTAINER_TYPE=ANDROID,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=NO_PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=android-slave/)) | ||
After that, you should be able to run `allreduce-test.py` with the appropriate | ||
training and validation datasets and vocabulary. We train on the Billion Words dataset, which | ||
is a text file with one sentence per line, as follows: | ||
|
||
#### *Try your first TensorFlow program* | ||
```shell | ||
$ python | ||
``` | ||
```python | ||
>>> import tensorflow as tf | ||
>>> hello = tf.constant('Hello, TensorFlow!') | ||
>>> sess = tf.Session() | ||
>>> sess.run(hello) | ||
Hello, TensorFlow! | ||
>>> a = tf.constant(10) | ||
>>> b = tf.constant(32) | ||
>>> sess.run(a+b) | ||
42 | ||
>>> | ||
... | ||
To Mo concerning the food log you kept -- Dr. Buchholz recommends the same thing . | ||
The CBO estimates that only 23 percent of that would be spent in 2009 and 2010 . | ||
Even so , Democrats slammed Bush as out of touch . | ||
An information campaign will be launched later to raise awareness of employment rights and how to enforce them . | ||
... | ||
``` | ||
|
||
##For more information | ||
The vocabulary file is a list of the most common words, one per line: | ||
|
||
``` | ||
<unk> | ||
the | ||
, | ||
. | ||
to | ||
of | ||
and | ||
a | ||
in | ||
" | ||
's | ||
that | ||
for | ||
on | ||
is | ||
The | ||
was | ||
with | ||
said | ||
as | ||
at | ||
... | ||
``` | ||
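
If you need to produce a vocabulary file in this shape for your own corpus, one possible approach is sketched below. This is an assumption about how such a file could be built, not a description of how the file used here was generated; the helper names `build_vocab` and `encode` are hypothetical, and tokens are assumed to be whitespace-separated as in the sample sentences.

```python
from collections import Counter


def build_vocab(lines, vocab_size):
    """Return a list of `vocab_size` entries: "<unk>" plus the most common tokens."""
    counts = Counter(token for line in lines for token in line.split())
    # "<unk>" is reserved for out-of-vocabulary tokens, so never count it as a word.
    counts.pop("<unk>", None)
    words = [word for word, _ in counts.most_common(vocab_size - 1)]
    return ["<unk>"] + words


def encode(line, vocab):
    """Map a whitespace-tokenized sentence to vocabulary indices (0 means <unk>)."""
    index = {word: i for i, word in enumerate(vocab)}
    return [index.get(token, 0) for token in line.split()]


if __name__ == "__main__":
    corpus = ["a a a b b c"]
    vocab = build_vocab(corpus, vocab_size=3)
    print("\n".join(vocab))        # one entry per line, as in the vocabulary file
    print(encode("a c b", vocab))  # "c" falls outside the vocabulary
```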
|
||
You should be able to run training with a command such as the following: | ||
|
||
```bash | ||
# If you have SLURM with a CUDA-aware MPI integrated, you can use `srun` to | ||
# launch your job. Otherwise, you will need to use `mpirun` and appropriately | ||
# set `CUDA_VISIBLE_DEVICES` to choose which GPUs to use. | ||
srun --partition=K40x4 --ntasks=4 --gres=gpu:4 \ | ||
python allreduce-test.py \ | ||
--train-data train.txt \ | ||
--validation-data train.txt \ | ||
--vocab vocab.txt \ | ||
--vocab-size 10000 \ | ||
--batch-size 32 \ | ||
--max-iterations 10000 | ||
``` | ||
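
For the `mpirun` case mentioned in the comment above, one way to set `CUDA_VISIBLE_DEVICES` is to derive it from the process's node-local rank before TensorFlow is imported. The sketch below assumes Open MPI, which exports `OMPI_COMM_WORLD_LOCAL_RANK` to each launched process; other MPI implementations use different variable names. The helper `pick_gpu` is hypothetical, and the value of 4 GPUs per node is taken from the `--gres=gpu:4` example rather than from anything the script requires.

```python
import os


def pick_gpu(environ, gpus_per_node):
    """Choose one GPU index for this process from its node-local MPI rank.

    Falls back to GPU 0 when the Open MPI variable is absent, for example
    when the script is started without `mpirun`.
    """
    local_rank = int(environ.get("OMPI_COMM_WORLD_LOCAL_RANK", "0"))
    return str(local_rank % gpus_per_node)


# This must run before TensorFlow is imported, so that the process only
# ever sees the single GPU assigned to it.
os.environ["CUDA_VISIBLE_DEVICES"] = pick_gpu(os.environ, gpus_per_node=4)
```

Placing these lines at the top of the training script (or in a small wrapper around it) gives each rank on a node its own device without per-rank shell scripting.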
|
||
* [TensorFlow website](http://tensorflow.org) | ||
* [TensorFlow whitepaper](http://download.tensorflow.org/paper/whitepaper2015.pdf) | ||
* [TensorFlow Model Zoo](https://github.com/tensorflow/models) | ||
* [TensorFlow MOOC on Udacity](https://www.udacity.com/course/deep-learning--ud730) | ||
## Support | ||
|
||
The TensorFlow community has created amazing things with TensorFlow, please see the [resources section of tensorflow.org](https://www.tensorflow.org/versions/master/resources#community) for an incomplete list. | ||
We do not offer any sort of official support or maintenance for this patch. | ||
However, if you would like to use it and run into trouble, feel free to [file a GitHub issue](https://github.com/baidu-research/tensorflow-allreduce/issues) | ||
and we may be able to help. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
numpy | ||
click |