Add Message Passing Interface library #195

Closed
daphne-eu opened this issue Feb 27, 2022 · 2 comments

@daphne-eu

In GitLab by @gpoerwawinata on Feb 27, 2022, 21:52

We are planning to add MPI as a communication option alongside the gRPC-based communication. For now, the targets of this issue are:

  1. Add the MPI library to the DAPHNE build.
  2. Implement vectorised data communication with MPI.
  3. Add an MPI test case to check whether MPI works properly for communication in a distributed system.
@daphne-eu

In GitLab by @pdamme on Feb 28, 2022, 11:45

Hi @gpoerwawinata, I agree that support for MPI in the prototype would be great (as we have discussed before in the WP4/5 meetings). To keep everyone informed, it would be fantastic if you could provide a short description of this issue, especially an overview of what you're planning to do.

@daphne-eu

In GitLab by @gpoerwawinata on Feb 28, 2022, 13:33

@pdamme Sorry I missed this! I have added the issue's description based on my current plan. I will update the description again after I verify with Ahmed/Vasilis. Thank you for your suggestions.

@daphne-eu daphne-eu self-assigned this Mar 31, 2022
EricMier pushed a commit that referenced this issue Mar 28, 2023
* [MINOR] DenseMatrix<int64_t> output for vectorizedPipelineOp

* [BUGFIX] Wrong row offsets while slicing in vectorized task

Changing rowStart and rowEnd in accumulation of vectorized task results caused an out of bounds error when accessing a matrix. Passing altered row start/end values as temporaries fixes this.
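
The general pattern behind this fix can be illustrated with a minimal sketch. It is not the DAPHNE code: `accumulateRows`, `runTaskSketch`, and the use of `std::vector<int>` as a one-column matrix are assumptions made for illustration. The point is that the per-batch bounds live in temporaries (`batchStart`, `batchEnd`), so the task's own `rowStart` and `rowEnd` are never modified.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical accumulation step: copies `chunk` into rows [rowStart, rowEnd) of `out`.
void accumulateRows(std::vector<int>& out, const std::vector<int>& chunk,
                    std::size_t rowStart, std::size_t rowEnd) {
    assert(rowStart <= rowEnd && rowEnd <= out.size());
    assert(rowEnd - rowStart == chunk.size());
    for (std::size_t r = rowStart; r < rowEnd; ++r)
        out[r] = chunk[r - rowStart];
}

// Hypothetical task: doubles rows [rowStart, rowEnd) of `in`, batch by batch.
std::vector<int> runTaskSketch(const std::vector<int>& in, std::size_t rowStart,
                               std::size_t rowEnd, std::size_t batchSize) {
    assert(batchSize > 0 && rowStart <= rowEnd && rowEnd <= in.size());
    std::vector<int> out(in.size(), 0);
    for (std::size_t r = rowStart; r < rowEnd; r += batchSize) {
        // Per-batch bounds are temporaries; rowStart and rowEnd stay unchanged,
        // so the loop condition and later accesses keep using the original range.
        const std::size_t batchStart = r;
        const std::size_t batchEnd = std::min(r + batchSize, rowEnd);
        std::vector<int> chunk;
        for (std::size_t i = batchStart; i < batchEnd; ++i)
            chunk.push_back(in[i] * 2);
        accumulateRows(out, chunk, batchStart, batchEnd);
    }
    return out;
}
```

Calling `runTaskSketch({1, 2, 3, 4, 5}, 1, 5, 2)` processes the batches [1, 3) and [3, 5) and leaves row 0 untouched.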

* [BUGFIX] Workaround casting error occurring in vectorized cuda

* [MINOR] DAPHNE Binary Format Metadata

When writing in DAPHNE binary format, the *.dbdf.meta file was not written (as is the case with CSV).

* [DAPHNE-#198] Second order function map (#407)

- a first working version of the `map` built-in function in DaphneDSL
- applies a given unary UDF to each element of a DenseMatrix
- the UDF is passed as a function pointer to the map-kernel
- unit tests and script-level tests for map()
- Closes #198
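
As a rough illustration of the idea (a unary UDF passed as a function pointer and applied to each element), here is a minimal, self-contained sketch. `DenseSketch`, `UnaryUdf`, `mapSketch`, and `addOne` are hypothetical names and do not reflect the actual DAPHNE kernel signature.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense row-major matrix; not DAPHNE's DenseMatrix.
struct DenseSketch {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> values; // rows * cols elements, row-major
};

// A unary UDF passed as a plain function pointer.
using UnaryUdf = double (*)(double);

// Applies `udf` to every element and returns a new matrix of the same shape.
DenseSketch mapSketch(const DenseSketch& arg, UnaryUdf udf) {
    assert(arg.values.size() == arg.rows * arg.cols);
    DenseSketch res{arg.rows, arg.cols, std::vector<double>(arg.values.size())};
    for (std::size_t i = 0; i < arg.values.size(); ++i)
        res.values[i] = udf(arg.values[i]);
    return res;
}

// Example UDF used for illustration.
double addOne(double x) { return x + 1.0; }
```

For example, `mapSketch(m, addOne)` returns a matrix with the same shape as `m` in which every element is incremented by one.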

* [MINOR] Fix compilation on modern gcc

* [DAPHNE-451] FPGA GEMV Operation

This commit adds support for a single precision GEMV operation to the list of supported FPGA operations. The necessary bitstream can be downloaded from the supplemental-binaries git repository (https://github.com/daphne-eu/supplemental-binaries). This does not represent a separate DAPHNE kernel but branches out to call the appropriate function in the Matmult kernel.

* [DAPHNE-451] Initial FPGA SYRK integration

This is a follow up that is merged in the same pull request (hence the same issue number).
Adding preliminary syrk support for fp32 (operation will see more work in the future).

Closes #451

* [DAPHNE-#453] Build from source artifact

This change enables the build.sh script to compile the third party dependencies without relying on the git submodule check out of llvm.

Closes #453

* [DAPHNE-#459] Enable CI

- prepares an ubuntu base image with required dependencies
- builds the daphne system
- runs test suite
- stores created artifacts
- simpler install of cmake
- run build on 20.04
- run only on pushes and PRs to main
- and manually via workflow dispatch

Closes #459, Closes #460

* Upgraded MLIR. (#468)

This commit upgrades MLIR from llvm/llvm-project@4763c8c (May 14, 2021) to llvm/llvm-project@20d454c (Jan 31, 2023).

The upgrade required a lot of changes in the DAPHNE code base, which are summarized below:
- Several things were renamed in MLIR (in some cases with small API changes):
  - C++
    - builder.getIdentifier() -> builder.getStringAttr()
    - FunctionPass -> OperationPass<func::FuncOp>
    - Likewise, runOnFunction() -> runOnOperation()
    - OwningRewritePatternList -> RewritePatternSet
    - OwningModuleRef -> OwningOpRef<ModuleOp>
    - BlockAndValueMapping -> IRMapping
  - TableGen
    - OpTrait -> Trait
    - NoSideEffect -> Pure
- TableGen-generated names of operations' operands/attributes/results/regions all start with "get" now and are in camelCase.
- Several header files were moved in MLIR.
- A few things were refactored in MLIR:
  - Passing of options to execution engine.
  - Signature of matchAndRewrite() changed: instead of ArrayRef<Value> operands, it now gets an OpAdaptor adaptor.
  - Several places that used to require an IndexAttr now need an I64IntegerAttr.
  - Instead of mlir::ConstantIntOp and mlir::ConstantFloatOp, only mlir::arith::ConstantOp.
  - Type parsing and printing was changed. We even need a workaround now.
  - daphne::ConstantOp cannot have a custom builder with an attribute as the parameter anymore, since attributes do not have a type anymore.
  - Folding interfaces.
  - scf::IfOp builder does not accept result types anymore.
  - The StandardOps dialect was replaced by several other dialects, e.g. Arith, Func, ControlFlow.
- Needed to add some extra (existing MLIR) passes to the DAPHNE compilation chain:
  - A pass for creating C wrappers.
  - A pass for eliminating unrealized conversion casts (we introduce some explicitly when lowering to LLVM now).
- Some things in DAPHNE didn't work anymore, needed to be fixed:
  - Lowering to LLVM: kernel calls shouldn't have the return type void, but they should simply have an empty vector of return types.
  - Lowering to LLVM: special treatment of ReturnOp in VectorizedPipelineOp via UnrealizedConversionCast was necessary (maybe there are better solutions).

Some useful resources related to some of these changes:
- https://discourse.llvm.org/t/psa-update-your-opconversionpattern-matchandrewrite-methods/4354
- https://reviews.llvm.org/D110293
- https://reviews.llvm.org/D110293#change-lfVb579kxHdZ
- llvm/llvm-project@610139d
- https://discourse.llvm.org/t/psa-new-improved-fold-method-signature-has-landed-please-update-your-downstream-projects/67618

* [BUGFIX] Fix build dependency

Move daphneir before compiler/utils to fix error that seems to occur only when compiling with a low thread count (github action compiles with 2 threads and there the issue surfaced)

* [DAPHNE-#470] Cache build dependencies to speed up github action compilation

* trigger on push to branch
* fix work dir permissions after code checkout in github action
* thirdparty dir caching
* check for existence of submodule directory in .git for llvm-project
* fix build.sh error testing for .git
* improve build.sh -nf
* only checkout submodules if .git/modules exists
* main.yml debug output

Closes #470

* [MINOR] Switch off fancy in non-terminal environment

...and protect that tput call.

* [MINOR] build.sh parameter for installPrefix

Set installPrefix from command line with --installPrefix=/my/directory

* [MINOR] Install MLIR in installPrefix

This change adds the invocation of the installation target while compiling LLVM/MLIR. With that, we can omit specifying LLVM_DIR and MLIR_DIR explicitly. Furthermore, this cuts the dependency on buildPrefix, which might not contain a meaningful value in a container with prebuilt 3rd party dependencies. Finally, this enables us to switch the directory of installed dependencies with the installPrefix parameter of build.sh.

* [MINOR] Flag to make dependency compilation optional

Dependency building will be omitted if the --no-deps flag is provided to build.sh.
This is particularly useful in combination with --installPrefix for containers which have the deps already installed somewhere else (coming soon), e.g., ./build.sh --no-deps --installPrefix /usr/local

* [DAPHNE-#466] Change behavior of build.sh --clean

* --clean now only cleans the build output of DAPHNE and not the third party stuff. See the Github issue #466 for further details.
* reverting a workaround introduced by #470 (dir check before updating submodules)
* documentation updates

Closes #466, Closes #471

* [MINOR] Change Github Action cache key to exact hash

Minor fixes by Patrick Damme <[email protected]>

- build.sh
  - removed duplicate echo
  - reset the flag for accepting with yes after call
  - line break in message: line shall not start with whitespace
- README.md
  - wording
- BuildingDaphne.md
  - escaped some "<" by "\<"
  - added some missing ">"
  - corrected default install prefix

Co-authored-by: Patrick Damme <[email protected]>

* [CLEANUP] Reformat build.sh and remove warnings

This commit contains the result of a code formatter set to 4 spaces of indentation. Furthermore, all shell-check warnings have been resolved.
In addition, the --no-deps parameter got an abbreviation and is now also available as -nd.

* [DAPHNE-#473] Improved Apache Arrow support

Cleaned up the integration of Apache Arrow for Parquet reader support.
* Arrow is now always built from 11.0 release package
* CMake code is now free from hard coded paths
* It seems we can drop the requirement to install boost libraries as everything compiles just fine without it, and we don't use much Arrow anyway atm

Closes #473

* [BUGFIX] Resolving circular build dependency

Still trying to fix a build error when building on GitHub Actions where CompilerUtils depends on some tablegen generated headers and vice versa.

* [MINOR] Fix git submodule loaded from gh action cache

Another fix for this issue =)

* [MINOR] Github Action: avoid test.sh and dependency building

* running tests without test.sh, as this does not support passing arguments to build.sh yet.
* If test.sh is used again in the future, run without piping output to ''ts'' as this swallows the exit code
* using a docker image with prebuilt dependencies now
* no need to build third party dependencies anymore as this is coming from the docker image
* deactivates the previously introduced caching of the third party dependency build

* [DAPHNE-#472, DAPHNE-#192] New Daphne Docker Images

* This is an initial version of our official docker images for various purposes. Atm it contains an image for running our GitHub action more efficiently (skipping dependency building) and an interactive development environment (also containing precompiled deps).
* Shell scripts are provided to build and run these containers.
* Use the run scripts to bind-mount your local DAPHNE source tree and avoid permission issues (commands in the container will be run as your local user to not end up with files that belong to root)
* A longer standing issue of missing numpy is also resolved now (issue #192)
* Initial dockermentation in containers/Readme.md

Closes #472, Closes #192

* [MINOR] Fix a path in submodule checkout

This one fixes a fix (from f314fdd) where the full path to the .git file is needed (and was not supplied)

* [MINOR] Fixed DistributedWrapper, could not parse MLIR code.

DistributedWrapper::getPipelineInputTypes could not parse the MLIR fragment because FuncDialect was not registered.

Closes issue 475.

* [BUGFIX] Test case failed to trigger distributed execution.

- The `--vec` flag used to be sufficient to trigger distributed execution if the environment variable `DISTRIBUTED_WORKERS` is set.
- However, this behavior has since changed: now `--distributed` must be given, but the test case didn't reflect this change.
- Thus, the test case was executed purely locally, thereby hiding any bugs in the distributed execution (see #475).
- This commit fixes this bug.

* [MINOR] Added a link in the DaphneDSL lang ref.

* [DAPHNE-#132] Frame column labels after (#478)

- Modified the labeling for the second column of the group-join operation to reflect the summation.
- Adapted the frame label inference for consistency.
- Modified the test case to reflect the changes.
- Closes #132.

* [MINOR] Updated broadcast kernel assertion.

The broadcast kernel has an assertion that checks whether the matrix/value is nullptr, but this assertion also failed when the value 0 was broadcast. The assertion is now checked only when a matrix is broadcast.
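
A minimal sketch of the idea follows, assuming C++17. It is not the actual kernel: `MatrixSketch`, `isValidBroadcastArg`, and `broadcastSketch` are hypothetical names. The null check is applied only when the argument is a pointer (standing in for a matrix), so a scalar `0` is accepted.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a matrix type; not DAPHNE's DenseMatrix.
struct MatrixSketch {
    int numRows;
};

// Returns true if `arg` may be broadcast: pointers (matrices) must be
// non-null, while scalars are always fine, including the value 0.
template <typename T>
bool isValidBroadcastArg(T arg) {
    if constexpr (std::is_pointer_v<T>) {
        return arg != nullptr;
    } else {
        (void)arg; // any scalar value is acceptable
        return true;
    }
}

// Hypothetical broadcast entry point; only the argument check is shown.
template <typename T>
void broadcastSketch(T arg) {
    assert(isValidBroadcastArg(arg) && "cannot broadcast a null matrix");
    // Sending the object to the workers is omitted in this sketch.
}
```

With this check, `broadcastSketch(0)` passes, while broadcasting a null `MatrixSketch*` would trip the assertion.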

* [MINOR] Change Github Action Docker Image

Changing the docker image of our CI back to daphneeu/github-action to include pre-main third party dependencies.

* [DAPHNE-195] Distributed Ops using OpenMPI

This commit adds OpenMPI as an alternative way of running distributed operations (besides gRPC).

Closes #195, Closes #436

* [DAPHNE-#481] Right indexing rows by bit vector for DenseMatrix.

- Harmonized parsing of right indexing for extracting/slicing and filtering.
- Added partial specialization of filterRow-kernel for DenseMatrix.
- Added unit tests and script-level tests.
- Adapted the user documentation.
- Contributes to #481, but indexing columns by bit vector is still an open todo.
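
To make the semantics concrete, here is a minimal sketch of row filtering by bit vector, under the assumption that a row is kept exactly when its selector entry is non-zero. `Rows` and `filterRowSketch` are hypothetical and unrelated to the real filterRow-kernel signature.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vector-of-rows stand-in for a dense matrix.
using Rows = std::vector<std::vector<double>>;

// Keeps row r of `arg` if and only if sel[r] is non-zero.
Rows filterRowSketch(const Rows& arg, const std::vector<int>& sel) {
    assert(sel.size() == arg.size()); // one selector bit per row
    Rows res;
    for (std::size_t r = 0; r < arg.size(); ++r)
        if (sel[r] != 0)
            res.push_back(arg[r]);
    return res;
}
```

Filtering a 3x2 matrix with the bit vector `{1, 0, 1}` yields a 2x2 matrix made of the first and third rows.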

* [DAPHNE-#481] Right indexing columns by bit vector for DenseMatrix.

- New operation FilterColOp with some necessary inferences.
- New kernel filterCol with a partial specialization for DenseMatrix.
- Parsing of right indexing supports bit vectors for the columns.
- Unit test cases and script-level test cases.
- Updated the DaphneDSL language reference in the documentation.
- Closes #481.
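
The column case can be sketched analogously, assuming a column is kept exactly when its selector entry is non-zero. `Rows` and `filterColSketch` are hypothetical names; the real filterCol kernel and FilterColOp are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vector-of-rows stand-in for a dense matrix.
using Rows = std::vector<std::vector<double>>;

// Keeps column c of every row if and only if sel[c] is non-zero.
Rows filterColSketch(const Rows& arg, const std::vector<int>& sel) {
    Rows res;
    for (const auto& row : arg) {
        assert(sel.size() == row.size()); // one selector bit per column
        std::vector<double> kept;
        for (std::size_t c = 0; c < row.size(); ++c)
            if (sel[c] != 0)
                kept.push_back(row[c]);
        res.push_back(kept);
    }
    return res;
}
```

Unlike the row case, the number of rows is preserved and only the width of each row shrinks.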

* [MINOR] Fixed two typos in error messages.

* [MINOR] Removed a todo related to a closed issue.

* [MINOR] Updated DaphneDSL reference on right indexing.

- Simplified it by showing how matrix literals can be used to specify the positions.
- Added a remark on a required whitespace.

* [BUGFIX] genGivenVals() with zero rows.

- genGivenVals() used to crash if zero rows are specified.
- Now, it throws an exception.
- Note that, while matrices with 0 rows are allowed, using genGivenVals() to create them doesn't make sense (see comment in the code).
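
A minimal sketch of such a guard is shown below. `DenseSketch` and `genGivenValsSketch` are hypothetical, and the reason given in the comment (the column count is derived by dividing by the row count) is an assumption of this sketch, not necessarily the reasoning in the DAPHNE code.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a dense row-major matrix; not DAPHNE's DenseMatrix.
struct DenseSketch {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> values;
};

// Builds a matrix with `numRows` rows from a flat list of values.
DenseSketch genGivenValsSketch(std::size_t numRows, const std::vector<double>& vals) {
    // In this sketch, the column count is vals.size() / numRows, so zero rows
    // would mean a division by zero; throw a clear exception instead.
    if (numRows == 0)
        throw std::runtime_error("genGivenValsSketch: numRows must not be zero");
    if (vals.size() % numRows != 0)
        throw std::runtime_error("genGivenValsSketch: number of values must be a multiple of numRows");
    DenseSketch res{numRows, vals.size() / numRows, vals};
    assert(res.values.size() == res.rows * res.cols);
    return res;
}
```

A caller can then catch the exception and report a test setup error rather than crash.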

---------

Co-authored-by: Mark Dokter <[email protected]>
Co-authored-by: Simeon <[email protected]>
Co-authored-by: pratuszniak <[email protected]>
Co-authored-by: Benjamin Steinwender <[email protected]>
Co-authored-by: Patrick Damme <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: DamianDinoiu <[email protected]>
Co-authored-by: Ahmed Eleliemy <[email protected]>