Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run all framework sanity check tests and organize jobs. #18420

Merged
merged 11 commits into from
Sep 4, 2024

Conversation

ScottTodd
Copy link
Member

@ScottTodd ScottTodd commented Sep 3, 2024

Fixes #16624 by running the existing ONNX and PyTorch importer tests with the packages they need installed.

Sample logs when a test fails: https://github.com/iree-org/iree/actions/runs/10691656074/job/29638920091?pr=18420#step:9:19

Traceback (most recent call last):
  File "/home/runner/work/iree/iree/compiler/bindings/python/test/extras/fx_importer_test.py", line 8, in <module>
    from iree.compiler.extras import fx_importer
  File "/home/runner/work/iree/iree/.venv/lib/python3.11/site-packages/iree/compiler/extras/fx_importer.py", line 138, in <module>
    from .._mlir_libs._torchMlir import get_int64_max, get_int64_min
ModuleNotFoundError: No module named 'iree.compiler._mlir_libs._torchMlir'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/runner/work/iree/iree/compiler/bindings/python/test/extras/fx_importer_test.py", line [19](https://github.com/iree-org/iree/actions/runs/10691656074/job/29638920091?pr=18420#step:9:20), in <module>
    raise ModuleNotFoundError(
ModuleNotFoundError: Failed to import the fx_importer (for a reason other than torch not being found)
Error: Process completed with exit code 1.

I'm not really satisfied with how these tests are distributed across jobs either before or after these changes, but I think this is a step in a good direction at least.

  • These tests depend on optional packages (torch, onnx, tensorflow) and disable themselves if those optional packages are not present.
  • The core project build (CMake/CTest, Python, packaging builds) strives to be modular and not require the entire kitchen sink to function.
  • Test workflows should make sense for both local development and CI usage. The local development flows here are relatively convoluted and could use some work.

@ScottTodd ScottTodd added infrastructure Relating to build systems, CI, or testing integrations/tensorflow TensorFlow model import and conversion bindings/python Python wrapping IREE's C API integrations/pytorch PyTorch integration work integrations/onnx ONNX integration work labels Sep 3, 2024
@ScottTodd ScottTodd marked this pull request as ready for review September 3, 2024 23:00
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in this file and pkgci_test_onnx.yml could be separated, but it made more sense to me to group them together. fx_importer_test and onnx_importer_test weren't running before because they didn't have a natural place to run. This gives them that place by moving around the other workflow jobs.

Copy link
Member

@marbre marbre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not super happy from a naming perspective with having test like pkgci_test_amd_mi300.yml where the platform is what stands in the foreground and now having tests were a framework stands in the foreground and that runs on multiple hw platforms. Maybe we can rework this someday.

Anyway, definitely a step into the right direction 👍

Comment on lines 33 to +34
jobs:
cross_compile:
android_arm64:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed that the Android and RISCV cross compile jobs are sharing a cache (named "cross_compile") when they shouldn't. Pushed another commit that gives each a unique cache key.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not super happy from a naming perspective with having test like pkgci_test_amd_mi300.yml where the platform is what stands in the foreground and now having tests were a framework stands in the foreground and that runs on multiple hw platforms. Maybe we can rework this someday.

Anyway, definitely a step into the right direction 👍

Agreed. Currently, some large chunks of code are copy/pasted, and grouping by platform would make that worse. I had ideas for making that easier on nod-ai/SHARK-TestSuite#288.

@ScottTodd ScottTodd merged commit 854a13e into iree-org:main Sep 4, 2024
36 of 37 checks passed
@ScottTodd ScottTodd deleted the infra-python-tests branch September 4, 2024 16:08
IanWood1 pushed a commit to IanWood1/iree that referenced this pull request Sep 8, 2024
Fixes iree-org#16624 by running the
existing ONNX and PyTorch importer tests _with the packages they need
installed_.

Sample logs when a test fails:
https://github.com/iree-org/iree/actions/runs/10691656074/job/29638920091?pr=18420#step:9:19
```
Traceback (most recent call last):
  File "/home/runner/work/iree/iree/compiler/bindings/python/test/extras/fx_importer_test.py", line 8, in <module>
    from iree.compiler.extras import fx_importer
  File "/home/runner/work/iree/iree/.venv/lib/python3.11/site-packages/iree/compiler/extras/fx_importer.py", line 138, in <module>
    from .._mlir_libs._torchMlir import get_int64_max, get_int64_min
ModuleNotFoundError: No module named 'iree.compiler._mlir_libs._torchMlir'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/runner/work/iree/iree/compiler/bindings/python/test/extras/fx_importer_test.py", line [19](https://github.com/iree-org/iree/actions/runs/10691656074/job/29638920091?pr=18420#step:9:20), in <module>
    raise ModuleNotFoundError(
ModuleNotFoundError: Failed to import the fx_importer (for a reason other than torch not being found)
Error: Process completed with exit code 1.
```

---

I'm not really satisfied with how these tests are distributed across
jobs either before or after these changes, but I think this is a step in
a good direction at least.
* These tests depend on optional packages (torch, onnx, tensorflow) and
disable themselves if those optional packages are not present.
* The core project build (CMake/CTest, Python, packaging builds) strives
to be modular and not require the entire kitchen sink to function.
* Test workflows should make sense for both local development _and_ CI
usage. The local development flows here are relatively convoluted and
could use some work.
josemonsalve2 pushed a commit to josemonsalve2/iree that referenced this pull request Sep 14, 2024
Fixes iree-org#16624 by running the
existing ONNX and PyTorch importer tests _with the packages they need
installed_.

Sample logs when a test fails:
https://github.com/iree-org/iree/actions/runs/10691656074/job/29638920091?pr=18420#step:9:19
```
Traceback (most recent call last):
  File "/home/runner/work/iree/iree/compiler/bindings/python/test/extras/fx_importer_test.py", line 8, in <module>
    from iree.compiler.extras import fx_importer
  File "/home/runner/work/iree/iree/.venv/lib/python3.11/site-packages/iree/compiler/extras/fx_importer.py", line 138, in <module>
    from .._mlir_libs._torchMlir import get_int64_max, get_int64_min
ModuleNotFoundError: No module named 'iree.compiler._mlir_libs._torchMlir'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/runner/work/iree/iree/compiler/bindings/python/test/extras/fx_importer_test.py", line [19](https://github.com/iree-org/iree/actions/runs/10691656074/job/29638920091?pr=18420#step:9:20), in <module>
    raise ModuleNotFoundError(
ModuleNotFoundError: Failed to import the fx_importer (for a reason other than torch not being found)
Error: Process completed with exit code 1.
```

---

I'm not really satisfied with how these tests are distributed across
jobs either before or after these changes, but I think this is a step in
a good direction at least.
* These tests depend on optional packages (torch, onnx, tensorflow) and
disable themselves if those optional packages are not present.
* The core project build (CMake/CTest, Python, packaging builds) strives
to be modular and not require the entire kitchen sink to function.
* Test workflows should make sense for both local development _and_ CI
usage. The local development flows here are relatively convoluted and
could use some work.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bindings/python Python wrapping IREE's C API infrastructure Relating to build systems, CI, or testing integrations/onnx ONNX integration work integrations/pytorch PyTorch integration work integrations/tensorflow TensorFlow model import and conversion
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Run iree-import-onnx test(s) on CI
3 participants