Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix CI snitch cluster #353

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Fix CI snitch cluster #353

wants to merge 1 commit into from

Conversation

rgantonio
Copy link

This PR prunes tests that are not applicable to the build for snitch cluster

Currently, there are several changes from the original PULP snitch_cluster. One in particular actually breaks our CI:

pulp-platform@379ae99

There are actually 2 options to move forward, since our development is different now from the original Snitch cluster:

  1. We use our own docker but only prune the tests that work. At least we check that it builds but we cannot test FP (note that we are not using the latest FP unit anymore. Due to not rebasing on top of their main, we cannot follow up with it).

  2. We take out the snitch_cluster from our CI. It doesn't make sense to have it tested on our side anyway because (1) we're not using Snitch cluster and (2) there are several changes in the original pulp snitch that are not added here. We have totally branched out and will never be rebasing from the main again.

With the tapeout at hand, I believe no RTL person has time to check and fix the Snitch cluster. So this is just a proposed fixed to make sure CI is clean.

@rgantonio rgantonio added bug Something isn't working clean up To keep repos clean labels Sep 28, 2024
@rgantonio rgantonio self-assigned this Sep 28, 2024
@IveanEx
Copy link

IveanEx commented Sep 28, 2024

I think you can still select the correct compiler and runtime to finish CI. See PR278?

@rgantonio
Copy link
Author

I think you can still select the correct compiler and runtime to finish CI. See PR278?

I actually tried this. Still wasn't working sadly haha

@IveanEx
Copy link

IveanEx commented Sep 29, 2024

I think you can still select the correct compiler and runtime to finish CI. See PR278?

I actually tried this. Still wasn't working sadly haha

It is very weird that the system works quite well and suddenly it fails😭😭

@rgantonio
Copy link
Author

I think you can still select the correct compiler and runtime to finish CI. See PR278?

I actually tried this. Still wasn't working sadly haha

It is very weird that the system works quite well and suddenly it fails😭😭

For one, it is indeed caused by the recent updates of the PULP people. The CI of the snitch cluster was using their pre-built docker container and they made changes.

I don't have time to investigate :( that's as far as I could get sadly.

I'll wait for @jorendumoulin and @JosseVanDelm 's thoughts hehe

Also, @jorendumoulin and @JosseVanDelm take note that... I don't think many of us have the bandwidth to look at these. We're a bit in a tight spot :')

@jorendumoulin
Copy link

Hmm that's annoying... A third option would be to fix the commit of the pulp docker build we're using to the last working one sha256:7e6603fbc524eab9b5136f233545302e30983762310bc71b1e1633bfc81ae924.

Although if we can run it on our own container without disabling any useful tests, this is even better in my opinion.
The only test that seems to be relevant that gets removed is sw/snitch-cluster-runtime.yaml. What does this test and is this not important to us?

@rgantonio
Copy link
Author

A third option would be to fix the commit of the pulp docker build we're using to the last working one

I think this is better if we still want to maintain their checks with the Snitch cluster. BUT! What doesn't make sense anymore is that we just froze it, continue with our own developments but not much related to what snitch cluster is anymore. Yet we also still have that Snitch cluster CI even though it's as if it's not being developed or used anymore. That's why I thought taking it out might be better too.

What does this test and is this not important to us?

I believe these ones have something to do with the top-level. Where clusters communicate with other clusters but this just checks signals coming from our clusters. @IveanEx modified other things (not sure which) to make it work on our Hemaia top.

@IveanEx do you think the tests in this yaml file are still relevant to us?

@JosseVanDelm
Copy link

We have totally branched out and will never be rebasing from the main again.

😢

Proposed change looks okay for me, although I do agree with @jorendumoulin, sw/snitch-cluster-runtime.yaml seems important to us?

@rgantonio
Copy link
Author

Do we have the go signal to push this? haha

@JosseVanDelm
Copy link

@rgantonio I do not understand your comment:

I believe these ones have something to do with the top-level. Where clusters communicate with other clusters but this just checks signals coming from our clusters. @IveanEx modified other things (not sure which) to make it work on our Hemaia top.

I don't understand why we would want to break these tests for the default case?

@IveanEx can you please clarify?

@IveanEx
Copy link

IveanEx commented Sep 30, 2024

@rgantonio I do not understand your comment:

I believe these ones have something to do with the top-level. Where clusters communicate with other clusters but this just checks signals coming from our clusters. @IveanEx modified other things (not sure which) to make it work on our Hemaia top.

I don't understand why we would want to break these tests for the default case?

@IveanEx can you please clarify?

I think the problems come from the recently updated docker image from ETH that broke the test. The failing tests were using the original snitch docker image instead of snax docker image. However, I also have no idea on why these works fails on snax docker image...

Copy link
Collaborator

@xiaoling-yi xiaoling-yi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Please solve the conflict!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working clean up To keep repos clean
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants