Merge remote-tracking branch 'origin/develop' into distributed-post-verification
poszu committed Jan 25, 2024
2 parents e7753a6 + a09cf87 commit 2c926a6
Showing 4 changed files with 185 additions and 62 deletions.
32 changes: 32 additions & 0 deletions .github/workflows/docker.yml
@@ -0,0 +1,32 @@
name: Docker Build and Push

on:
  workflow_dispatch:
  release:
    types: [published]

jobs:
  docker-build:
    runs-on: ubuntu-latest
    steps:
      - name: Git Checkout
        uses: actions/checkout@v4
        with:
          submodules: true

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to DockerHub
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          file: ./Dockerfile
          push: true
          tags: spacemeshos/postcli:latest, spacemeshos/postcli:${{ GITHUB.SHA }}, spacemeshos/postcli:${{ github.ref_name }}
46 changes: 46 additions & 0 deletions Dockerfile
@@ -0,0 +1,46 @@
FROM golang:1.21 as builder
RUN set -ex \
    && apt-get update --fix-missing \
    && apt-get install -qy --no-install-recommends \
        unzip sudo \
        ocl-icd-opencl-dev

WORKDIR /src
COPY Makefile* .
RUN make get-postrs-lib

# We want to populate the module cache based on the go.{mod,sum} files.
COPY go.mod .
COPY go.sum .

RUN go mod download

# Here we copy the rest of the source code
COPY . .

# And compile the project
RUN --mount=type=cache,id=build,target=/root/.cache/go-build make build

FROM ubuntu:22.04 AS postcli
ENV DEBIAN_FRONTEND noninteractive
ENV SHELL /bin/bash
USER root
RUN set -ex \
    && apt-get update --fix-missing \
    && apt-get install -qy --no-install-recommends \
        locales \
        ocl-icd-libopencl1 clinfo \
        pocl-opencl-icd libpocl2 \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/* \
    && locale-gen en_US.UTF-8 \
    && update-locale LANG=en_US.UTF-8
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US.UTF-8
ENV LC_ALL en_US.UTF-8

# Finally we copy the statically compiled Go binary.
COPY --from=builder /src/build/postcli /bin/
COPY --from=builder /src/build/libpost.so /bin/

ENTRYPOINT ["/bin/postcli"]
73 changes: 51 additions & 22 deletions cmd/postcli/README.md
@@ -4,14 +4,25 @@ CLI tool for PoST initialization

## Getting it

Go to <https://github.com/spacemeshos/post/releases> and download the most recent release for your platform.
If you want to build it from source, follow the instructions below.

```bash
git clone https://github.com/spacemeshos/post
cd post
make postcli
```

Alternatively you can use the `postcli` docker image:

```bash
docker pull spacemeshos/postcli:<version>
```

Replace `<version>` with the latest release version, which you can find on the
[releases page](https://github.com/spacemeshos/post/releases). Avoid using the `latest` tag since it can be
unstable/untested.
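
One way to make sure a pinned tag is always used is to build the image reference in a small helper that refuses `latest`. This is only a minimal sketch: the function name `postcli_image_ref` is hypothetical and not part of `postcli` or Docker.

```bash
# Build the image reference for a pinned postcli release.
# Hypothetical helper: the function name is made up for illustration.
postcli_image_ref() {
  local version=${1:-}
  # Refuse an empty version or the moving `latest` tag.
  if [ -z "$version" ] || [ "$version" = "latest" ]; then
    echo "error: pass an explicit release version, not 'latest'" >&2
    return 1
  fi
  printf 'spacemeshos/postcli:%s\n' "$version"
}
```

With `VERSION` set to a tag from the releases page, `docker pull "$(postcli_image_ref "$VERSION")"` would then fetch that specific image.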

## Usage

```bash
@@ -20,15 +31,17 @@ make postcli

## Get OpenCL working

You need to have OpenCL support on your system. OpenCL usually comes with your graphics drivers. On Windows it should
work out of the box; on Linux you will need to install it separately.

You can always list the providers by using

```bash
clinfo -l
```

That is a separate command that is NOT shipped with the post implementation. Please refer to your system's
documentation for instructions on installing `clinfo`.
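
Because `clinfo` may be missing, a script that lists providers can check for it first. The following is a small sketch; the helper names `require_cmd` and `list_opencl_providers` are hypothetical.

```bash
# Check that a command is on PATH before relying on it.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "error: '$1' not found; install it with your system's package manager" >&2
    return 1
  fi
}

# List OpenCL platforms and devices only when clinfo is actually installed.
list_opencl_providers() {
  require_cmd clinfo && clinfo -l
}
```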

### Nvidia

@@ -68,30 +81,34 @@ apt install nvidia-opencl-icd
Example

```bash
./postcli -provider=2 -id=c230c51669d1fcd35860131e438e234726b2bd5f9adbbd91bd88a718e7e98ecb \
-commitmentAtxId=c230c51669d1fcd35860131e438e234726b2bd5f9adbbd91bd88a718e7e98ecb -genproof
```

### Remarks

* Both `-id` and `-commitmentAtxId` are needed to generate the PoST data.
* If `-id` isn't provided, a new identity will be auto-generated. Its private key will be stored in `key.bin` in
  `-datadir` with the PoST data. This file **must** then be copied/moved with the PoST data to run a node with this
  generated identity.
  **NOTE:** The generated PoST data is ONLY valid for this identity!
  If a public key is provided with the `-id` flag, the `key.bin` file will NOT be created. Make sure that the key file
  that belongs to the identity provided to `postcli` is available in the PoST directory **before** running a node
  with it.
* `-commitmentAtxId`: it is recommended to look up the highest ATX by querying it from a synced node with
`grpcurl -plaintext -d '' 0.0.0.0:9093 spacemesh.v1.ActivationService.Highest | jq -r '.atx.id.id' | base64 -d | xxd -p -c 64`.
The node can be operated in "non-smeshing" mode during synchronization and when querying the highest ATX.
* The `-reset` flag can be used to clean up a previous initialization. **Careful**: This will delete data that won't be
recoverable.
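
The last two stages of the `grpcurl` pipeline above only re-encode the ID: the output of `jq` is piped through `base64 -d`, and the result is printed as hex, which is the form used for `-commitmentAtxId` in the example (64 hex characters). A minimal sketch of that step plus a format check follows. The helper names are hypothetical, and `od` is used here as a stand-in for `xxd` on systems where `xxd` is not installed.

```bash
# Decode base64 from stdin and print it as one lowercase hex string.
# -v stops od from collapsing repeated lines into "*".
b64_to_hex() {
  base64 -d | od -An -v -tx1 | tr -d ' \n'
  echo
}

# Succeed only if the argument is exactly 64 hex characters (32 bytes).
is_hex_id() {
  [ "${#1}" -eq 64 ] || return 1
  case $1 in
    *[!0-9a-fA-F]*) return 1 ;;
  esac
}
```

Running `is_hex_id` on the value before passing it to `-id` or `-commitmentAtxId` catches truncated or still base64-encoded IDs early.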

## Initializing a subset of PoST data

It is possible to initialize only a subset of the files. This feature is intended for splitting initialization
between many machines.
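
Dividing the files evenly is plain integer arithmetic. The sketch below prints one index range per machine. It assumes zero-based file indices with both ends inclusive, which should be confirmed against the `postcli` help output; the function name `split_file_ranges` is hypothetical.

```bash
# Print one "from to" pair of file indices per machine.
# Assumption: indices are zero-based and both ends are inclusive.
split_file_ranges() {
  local total=$1 parts=$2 base rem start=0 size i=0
  if [ "$parts" -lt 1 ] || [ "$parts" -gt "$total" ]; then
    echo "error: need 1 <= parts <= total files" >&2
    return 1
  fi
  base=$(( total / parts ))
  rem=$(( total % parts ))
  while [ "$i" -lt "$parts" ]; do
    size=$base
    # The first `rem` machines take one extra file each.
    if [ "$i" -lt "$rem" ]; then size=$(( size + 1 )); fi
    echo "$start $(( start + size - 1 ))"
    start=$(( start + size ))
    i=$(( i + 1 ))
  done
}
```

Each printed pair could then be given to one machine as its `-fromFile` and `-toFile` values, with the total taken from the output of `-printNumFiles`.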

### Example - split initialization between 2 machines

For this example we initialize 100 units and split the process of initialization into two chunks. This command shows
the number of files that would be created during initialization:

```bash
./postcli -numUnits 100 -printNumFiles
@@ -156,36 +173,48 @@ in each `post_metadata.json` of every subset. Given two files:
"NonceValue": "0000488e171389cce69344d68b66f6b4"
```

The nonce in the second file is the global minimum because its value is smaller than the first one (compare the
`NonceValue` field, not the `Nonce` field). The operator is **required** to find the smallest VRF nonce by hand and
ensure that its index and value are in the `postdata_metadata.json` of the merged directory on the target machine.

Not every chunk will contain a VRF nonce in its `postdata_metadata.json`, but at least one should. In the very
unlikely case that no VRF nonce was found in any chunk, the operator can run `postcli` again **after merging the data**
without the `-fromFile` and `-toFile` flags to find a VRF nonce.
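
Picking the smallest `NonceValue` by eye is error-prone when there are many chunks. As a rough aid, the values collected from each chunk's metadata file (for example with `jq`) can be sorted. This sketch assumes that all values are hex strings of the same length, so that byte-wise string order matches numeric order; the function name `min_nonce_value` is hypothetical.

```bash
# Read one NonceValue (hex string) per line on stdin and print the smallest.
# Assumption: all values have the same length, so byte-wise string order
# matches numeric order. Blank lines (chunks without a nonce) are skipped.
min_nonce_value() {
  tr 'A-F' 'a-f' | grep -E '^[0-9a-f]+$' | LC_ALL=C sort | head -n 1
}
```

The helper only identifies the winning value; the matching `Nonce` index from the same file still has to be copied into the merged `postdata_metadata.json` together with it.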

## Verifying initialized POS data

`postcli` can verify already initialized POS data. Verification samples a small fraction of labels from every file and
compares them to labels generated with the same algorithm executed on the CPU. Generating labels on the CPU is slow
compared to the GPU, so it is not practical to verify all the data (that would essentially mean re-initializing on the
CPU). If the GPU failed during initialization, the created PoST data will contain some or all invalid labels after that
point. Because this method only samples the PoST data, it might not detect a small amount of corrupted data.

Depending on PoST size and CPU speed a reasonable *fraction* (%) parameter needs to be picked that gives enough
confidence but still completes verification in a reasonable time. Suggested values are <1%, closer to 0.1%.
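
To get a feel for what a given fraction buys, a rough model helps: if each label were sampled independently with probability f and c labels were corrupted, the chance of hitting at least one of them would be 1 - (1 - f)^c. This model is an assumption made for illustration, not a description of how `postcli` selects labels, and the function name is hypothetical.

```bash
# Probability of sampling at least one of `corrupted` bad labels when each
# label is checked independently with probability `fraction_percent` / 100.
# Usage: detection_probability <fraction_percent> <corrupted>
detection_probability() {
  awk -v pct="$1" -v c="$2" 'BEGIN { f = pct / 100; printf "%.4f\n", 1 - (1 - f) ^ c }'
}
```

Under this model, `detection_probability 0.1 1000` prints `0.6323`: with a 0.1% fraction, even a thousand corrupted labels are missed more than a third of the time, whereas a failure that corrupts everything after some point is caught almost surely.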

To verify POS data:

1. locate the directory of the POS data. It should contain postdata_metadata.json and postdata_N.bin files.
2. run `postcli -verify -datadir <path to POS directory> -fraction <% of data to verify>`.

For example, `postcli -verify -datadir ~/post/data -fraction 0.1` will verify 0.1% of the data. No additional arguments
(e.g. `-id`) are required; `postcli` reads all required information from `postdata_metadata.json`.

If the POS data is found to be invalid, `postcli` will exit with status 1 and print the index of the file and the offset
of the label found to be invalid. If verification completes successfully, `postcli` exits with status 0.
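
These exit statuses make verification easy to script. A minimal wrapper sketch follows; the `POSTCLI` variable (used to point at the binary) and the function name `verify_post` are hypothetical conveniences, not part of `postcli`.

```bash
# Run verification and turn the exit status into a short message.
# POSTCLI is a hypothetical override for the path to the binary.
verify_post() {
  local datadir=$1 fraction=$2
  if "${POSTCLI:-postcli}" -verify -datadir "$datadir" -fraction "$fraction"; then
    echo "verification passed"
  else
    # $? still holds the exit status of the postcli call here.
    local status=$?
    echo "verification failed (exit status $status)" >&2
    return "$status"
  fi
}
```

For example, `verify_post ~/post/data 0.1 || exit 1` would stop a deployment script before a node is started on data that failed verification.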

## Troubleshooting

### Searching for a lost VRF nonce

If you lost the VRF nonce after merging initialized subsets, you can use `postcli` to recover it without
re-initializing the data. `postcli` will need to **read** the entire POS data to find the nonce.

To find a lost nonce:

1. locate the directory of the POS data. It should contain postdata_metadata.json and postdata_N.bin files.
2. run `postcli -searchForNonce -datadir <path to POS directory>`.

`postcli` will read the metadata from `postdata_metadata.json` and then look for the nonce in all `postdata_N.bin` files
one by one. If the nonce is found, it will update the metadata file.
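
Since the search reads the entire data set, it can be worth checking first that the directory has the expected layout. A small sketch, with a hypothetical function name:

```bash
# Succeed if the directory holds postdata_metadata.json and at least one
# postdata_N.bin file.
looks_like_post_dir() {
  local dir=$1 f
  [ -f "$dir/postdata_metadata.json" ] || return 1
  for f in "$dir"/postdata_*.bin; do
    # If the glob matched nothing, $f is the literal pattern and -f fails.
    [ -f "$f" ] && return 0
  done
  return 1
}
```

It could be combined with the search as `looks_like_post_dir ~/post/data && postcli -searchForNonce -datadir ~/post/data`.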