Parallel execution is not functioning correctly with the 64-core arm runner. #136631
-
Select Topic Area
Question
Body
detail: We have verified that this issue is not related to our server or other hardware specifications. Could you please advise on how to address this problem?
Replies: 6 comments 2 replies
-
Try reducing the number of parallel processes from 80 to a smaller figure, for instance 40 or 50. This may prevent overloading either the runner's compute resources or its network, which are probably responsible for the prolonged periods with no response (timeouts). Keep adjusting the process count until you find a value that runs without these errors.
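The tuning loop suggested here could be sketched roughly as follows. This is only an illustration: `run_suite` is a hypothetical callback (not part of any runner API) that launches the tests with a given process count and returns how many of them timed out, and the starting value, floor, and step size are assumptions to adjust for your own setup.

```python
from typing import Callable, Optional


def find_stable_process_count(
    run_suite: Callable[[int], int],
    start: int = 80,
    minimum: int = 10,
    step: int = 10,
) -> Optional[int]:
    """Step the process count down from `start` until a run has no timeouts.

    `run_suite(n)` is a hypothetical hook: it should run the tests with
    `n` parallel processes and return the number of timeouts observed.
    Returns the first count with zero timeouts, or None if every count
    down to `minimum` still timed out.
    """
    count = start
    while count >= minimum:
        if run_suite(count) == 0:
            return count
        count -= step
    return None
```

Each call to `run_suite` is a full test run, so a coarse step keeps the search cheap. Once a stable count is found, a finer step can narrow it down.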
-
Thank you very much for your insights. You are absolutely correct that reducing the number of processes will lead to less latency. However, our expectation was that the runner’s performance would be quadrupled (along with the fee), which would reduce the execution time to one-quarter.
-
Are there any other approaches we might consider? For example, would it be feasible to set up two runners, each with 64 cores and each handling 40 processes, for a total of 80 processes? This approach would only help if network performance is limited per runner rather than shared across runners.
-
Yes, it is reasonable to use two runners with 64 cores each, with each runner handling 40 processes. This could relieve some of the congestion and keep a single runner from saturating its network by running all 80 processes. If network performance is indeed limited per runner, you should see your tests run without timeouts.
-
I will test this idea next week. If it works as expected, I will mark the above response accordingly.
-
This approach proved to be effective as anticipated. By dividing the jobs (runners), we were able to allocate the maximum network resources to each job. This allowed us to successfully run 40 processes in parallel across 4 jobs on a 64-core system.
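For anyone wanting to reproduce the split, here is a minimal sketch of one way to divide a test list across jobs. It assumes each job knows its own index and the total job count (for example, passed in by the CI configuration). The function name and the round-robin strategy are illustrative choices, not what was actually used in this thread.

```python
from typing import List


def shard(tests: List[str], shard_index: int, shard_count: int) -> List[str]:
    """Return the slice of `tests` that one job should run.

    Tests are dealt out round-robin, so every test lands in exactly one
    shard and shard sizes differ by at most one.
    """
    if shard_count < 1 or not 0 <= shard_index < shard_count:
        raise ValueError("shard_index must satisfy 0 <= shard_index < shard_count")
    return tests[shard_index::shard_count]


# Hypothetical usage: job 0 of 2 picks its half of the tests, then runs
# that half with its own pool of 40 processes.
all_tests = [f"test_{i}" for i in range(10)]
my_tests = shard(all_tests, shard_index=0, shard_count=2)
```

Each job then runs only its own shard with its own process pool, so no single runner carries all 80 processes.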