
[Core][Streaming Generator] Fix the perf regression from a serve handle bug fix #38171 #38280

Merged (14 commits) on Aug 13, 2023

Conversation

@rkooo567 (Contributor) commented Aug 10, 2023

Why are these changes needed?

Before #37972, we ran the reporting & serialization of generator output (in C++) on the main thread, while all the async actor tasks ran on the async thread. After that PR, both run on the async thread.

This caused a regression for generator workloads with decently large outputs (200~2KB, e.g. Aviary), because the serialization code runs with nogil: previously we could utilize real multi-threading, since the serialization code ran on the main thread while the async actor code ran on the async thread.

This PR fixes the issue by dispatching the C++ code (reporting & serialization) to a separate thread again. I also found that when I used ThreadPoolExecutor, there were circular-reference issues that leaked objects when exceptions happened. This was because a Python exception captures the local references of its frames (creating circular references). I refactored part of the code to avoid this and added a unit test for it.

Related issue number

Closes #38163

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@rkooo567 rkooo567 changed the title Different approach [Core][Streaming Generator] Fix the perf regression from a serve handle bug fix #38171 Aug 10, 2023
@@ -1012,7 +1013,7 @@ cdef class StreamingGeneratorExecutionContext:
         return self


-cpdef report_streaming_generator_output(
+cdef report_streaming_generator_output(
rkooo567 (author) commented on the diff:

It doesn't have to be cpdef, and cdef is faster (cpdef additionally generates a Python-callable wrapper).

    except StopIteration:
        return True
    except Exception as e:
        if isinstance(output_or_exception, Exception):
rkooo567 (author):
If I raise the exception again like before, it causes the circular-reference issue.
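A minimal plain-Python sketch (independent of Ray; the names are illustrative) of the leak described here: binding a raised exception to a variable creates a reference cycle, because the exception's traceback references the frame whose locals reference the exception. Everything captured in that frame stays alive until the cyclic garbage collector runs.

```python
import gc
import weakref

class Payload:
    """Stand-in for a large object held by the failing frame."""
    pass

def capture():
    obj = Payload()
    ref = weakref.ref(obj)
    try:
        raise ValueError("boom")
    except ValueError as e:
        # Keeping the exception alive creates a reference cycle:
        # e -> e.__traceback__ -> frame -> locals['err'] -> e.
        # Everything in the frame (including obj) is kept alive with it.
        err = e
    return ref, err

ref, err = capture()
alive_before = ref() is not None  # obj is pinned by the traceback's frame
del err
gc.collect()                      # only the cycle collector can free it
alive_after = ref() is not None
```

This is also why Python 3 implicitly deletes the `as e` name at the end of an `except` block; stashing the exception (or re-raising it through more frames) reintroduces the cycle, which is what the refactor here avoids.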

            # the output (which has nogil).
            done = await loop.run_in_executor(
                worker.core_worker.get_thread_pool_for_async_event_loop(),
                report_streaming_generator_output,
rkooo567 (author):
This is the core fix: dispatching the C++ code to another thread.
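A minimal asyncio sketch of the dispatch pattern used above (`report_output` is a hypothetical stand-in for the blocking reporting/serialization step, not a Ray API): the blocking call is handed to a dedicated ThreadPoolExecutor via `loop.run_in_executor`, so the event-loop thread stays free to run other async actor tasks while the work proceeds on the pool thread.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the blocking step (in Ray, C++ code that
# releases the GIL) that must not run on the event-loop thread.
def report_output(item):
    return (item, threading.current_thread().name)

async def consume(pool):
    loop = asyncio.get_running_loop()
    results = []
    for item in range(3):
        # Dispatch the blocking step to the pool and await its completion
        # without blocking the event loop.
        results.append(await loop.run_in_executor(pool, report_output, item))
    return results

pool = ThreadPoolExecutor(max_workers=1, thread_name_prefix="report")
results = asyncio.run(consume(pool))
pool.shutdown()
```

With `max_workers=1`, reports are serialized through a single worker thread, which keeps ordering and sidesteps questions about the thread-safety of the reporting code.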

    def get_thread_pool_for_async_event_loop(self):
        if self.thread_pool_for_async_event_loop is None:
            self.thread_pool_for_async_event_loop = ThreadPoolExecutor(
                max_workers=int(os.getenv("RAY_ASYNC_THREAD_POOL_SIZE", 1)))
rkooo567 (author):

Q: maybe I should always use 1 thread? I used multiple threads for some exploration; when there are lots of async tasks, it may have some perf benefit.

Collaborator:

So is it thread-safe to use multiple threads? Before, this was done on the main thread, so effectively with 1 thread.

rkooo567 (author):

Hmm, actually a good point. I think it should be thread-safe, but I'm not 100% sure. I will just use 1 thread for now.
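A plain-Python sketch of the lazy pool accessor discussed above (the class name is illustrative; in Ray this lives on the Cython core worker): the pool is created once on first use, sized by the `RAY_ASYNC_THREAD_POOL_SIZE` environment variable and defaulting to a single thread, the conservative choice settled on here.

```python
import os
from concurrent.futures import ThreadPoolExecutor

class CoreWorkerSketch:
    """Illustrative holder for the lazily created executor."""

    def __init__(self):
        self.thread_pool_for_async_event_loop = None

    def get_thread_pool_for_async_event_loop(self):
        # Lazily create the pool; default to one worker thread unless the
        # RAY_ASYNC_THREAD_POOL_SIZE environment variable overrides it.
        if self.thread_pool_for_async_event_loop is None:
            size = int(os.getenv("RAY_ASYNC_THREAD_POOL_SIZE", 1))
            self.thread_pool_for_async_event_loop = ThreadPoolExecutor(
                max_workers=size)
        return self.thread_pool_for_async_event_loop

worker = CoreWorkerSketch()
pool_a = worker.get_thread_pool_for_async_event_loop()
pool_b = worker.get_thread_pool_for_async_event_loop()
```

Note that the lazy initialization itself is only race-free if the accessor is always called from a single thread (here, the event-loop thread); a lock would be needed otherwise.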

rkooo567 (author):

Failed tests seem unrelated.

@rkooo567 rkooo567 merged commit 60995d7 into ray-project:master Aug 13, 2023
1 of 2 checks passed
NripeshN pushed a commit to NripeshN/ray that referenced this pull request Aug 15, 2023
…le bug fix ray-project#38171 (ray-project#38280)

Signed-off-by: NripeshN <[email protected]>
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
…le bug fix ray-project#38171 (ray-project#38280)

Signed-off-by: e428265 <[email protected]>
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023
…le bug fix ray-project#38171 (ray-project#38280)

Signed-off-by: Victor <[email protected]>